Production Setup Guide

We maintain two servers, one for production and one for staging:

  • Production server: https://helios.aet.cit.tum.de/

  • Staging server: https://helios-staging.aet.cit.tum.de/

Both environments use the same Compose file:

  • compose.prod.yml

Deployment Strategy

  1. Staging Builds: Whenever a push is made to the staging branch, a GitHub Actions workflow triggers a build and deployment to the staging environment.

  2. Production Builds: When a release is created in GitHub’s “Releases” section, a build starts for production. After building the new version, the workflow pauses and waits for approval from a Helios team member before proceeding with deployment.

Before deploying to production, you need to merge the staging branch into the main (production) branch:

git checkout main
git merge --ff-only staging
git push origin main
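
Because the merge uses --ff-only, it fails when main contains commits that staging lacks. A minimal sketch of a pre-check follows, assuming both local branches are up to date with origin. The helper name can_fast_forward is hypothetical and not part of the Helios repository.

```shell
# Hypothetical helper: succeeds only if <target> can be fast-forwarded to
# <source>, i.e. <target> is an ancestor of <source>.
can_fast_forward() {
  git merge-base --is-ancestor "$1" "$2"
}

# Usage (inside the repository, after `git fetch origin`):
#   if can_fast_forward main staging; then
#     git checkout main && git merge --ff-only staging && git push origin main
#   else
#     echo "main has diverged from staging; resolve this before releasing" >&2
#   fi
```

If the check fails, main has commits of its own and the branches need to be reconciled before a release is created.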

Deployment Directory

All deployments (both staging and production) happen under:

/opt/helios/

On each server, you will find these files in /opt/helios:

  • compose.prod.yml

  • .env

  • heliosapp.converted_key_pkcs8.pem

  • helios-realm.json
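
A quick way to confirm that a server has all four files is sketched below. The helper name missing_deploy_files is hypothetical; the file names and the default directory are taken from the list above.

```shell
# Hypothetical helper: print each expected deployment file that is missing
# from the given directory (defaults to /opt/helios).
missing_deploy_files() {
  dir="${1:-/opt/helios}"
  for f in compose.prod.yml .env heliosapp.converted_key_pkcs8.pem helios-realm.json; do
    [ -f "$dir/$f" ] || printf '%s\n' "$f"
  done
}

# Usage: prints nothing when all four files are present.
#   missing_deploy_files /opt/helios
```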

File Descriptions

  • compose.prod.yml: Docker Compose file used to build and run all Helios services in production or staging mode.

  • .env: Environment variable file.

    • Some values are secret (e.g., service credentials).

    • Some values are configuration settings.

    • This file is overwritten on each deployment by the GitHub Actions workflow.

    • If you add a new environment variable, update the workflow to include it.

    • Secrets and variables are stored in GitHub Environments. In the Helios repository settings, there are two environments, “staging” and “production”, each with 20+ variables already configured.

  • heliosapp.converted_key_pkcs8.pem: The PEM file for the GitHub App.

    • Used as credentials when making API requests to GitHub.

    • This file is generated by following the “Generate the Private Key” step in “Creating a GitHub App”.

  • helios-realm.json: An exported Keycloak realm configuration.

    • Instead of wiping the database, we export/import Keycloak settings via this file.

    • It contains client IDs, client secrets, login page settings, token exchange rules, etc.
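
Since .env is rewritten on every deployment, a variable that was not added to the workflow will be absent after the next deploy. The sketch below shows one way to spot that on the server. The helper name missing_env_keys and the variable names in the usage example are placeholders, not actual Helios settings.

```shell
# Hypothetical helper: print every required key that has no KEY=... line
# in the given env file. Keys are passed as arguments after the file path.
missing_env_keys() {
  env_file="$1"
  shift
  for key in "$@"; do
    grep -q "^${key}=" "$env_file" || printf '%s\n' "$key"
  done
}

# Usage (EXAMPLE_DB_PASSWORD and EXAMPLE_FEATURE_FLAG are placeholder names):
#   missing_env_keys /opt/helios/.env EXAMPLE_DB_PASSWORD EXAMPLE_FEATURE_FLAG
```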

Runtime Containers

A typical production (or staging) environment runs multiple Docker containers under the Helios Compose network. For example, on the staging server:

ge89paj@helios-staging:/opt/helios$ docker ps
CONTAINER ID   IMAGE                                                COMMAND                  CREATED          STATUS                    PORTS                                                                                            NAMES
26207bf832fe   ghcr.io/ls1intum/helios/application-server:staging   "java -javaagent:/ap…"   15 minutes ago   Up 15 minutes             0.0.0.0:8080->8080/tcp, :::8080->8080/tcp                                                        helios-application-server-1
a20e1d75dbc0   ghcr.io/ls1intum/helios/keycloak:staging             "/opt/keycloak/bin/k…"   15 minutes ago   Up 15 minutes             8080/tcp, 8443/tcp, 9000/tcp, 0.0.0.0:8081->8081/tcp, :::8081->8081/tcp                          keycloak
b080f449acb6   ghcr.io/ls1intum/helios/notification:staging         "java -javaagent:/ap…"   15 minutes ago   Up 15 minutes             8080/tcp                                                                                         helios-notification-1
d339928ea5c6   ghcr.io/ls1intum/helios/webhook-listener:staging     "uvicorn app.main:ap…"   15 minutes ago   Up 15 minutes             0.0.0.0:4200->4200/tcp, :::4200->4200/tcp                                                        helios-webhook-listener-1
43bba36b647e   ghcr.io/ls1intum/helios/client:staging               "/docker-entrypoint.…"   15 minutes ago   Up 15 minutes             0.0.0.0:90->80/tcp, :::90->80/tcp                                                                helios-client-1
af2a9ccee144   postgres:16                                          "docker-entrypoint.s…"   15 minutes ago   Up 15 minutes             0.0.0.0:5432->5432/tcp, :::5432->5432/tcp                                                        helios-postgres-1
cf206e171655   nats:2.10.26-alpine                                  "docker-entrypoint.s…"   15 minutes ago   Up 15 minutes (healthy)   0.0.0.0:4222->4222/tcp, :::4222->4222/tcp, 0.0.0.0:8222->8222/tcp, :::8222->8222/tcp, 6222/tcp   helios-nats-server-1
1eda53002e85   nginx:latest                                         "/docker-entrypoint.…"   4 weeks ago      Up 15 minutes             0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp                         nginx
a15495daeb2e   gcr.io/cadvisor/cadvisor                             "/usr/bin/cadvisor -…"   2 months ago     Up 2 months (healthy)     0.0.0.0:9111->8080/tcp                                                                           cadvisor

All containers except nginx and cAdvisor are launched by Compose. The Compose file handles:

  • Application Server

  • Keycloak

  • Notification Service

  • Webhook Listener

  • Client (frontend)

  • PostgreSQL

  • NATS Server
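
After a deployment, it can be useful to confirm that every Compose-managed container is up. The sketch below compares the running container names against an expected list. The helper name missing_containers is hypothetical, and the container names in the usage example are copied from the staging docker ps listing above, so they may differ on other hosts.

```shell
# Hypothetical helper: read running container names from stdin (one per line)
# and print each expected name that is not among them.
missing_containers() {
  running="$(cat)"
  for name in "$@"; do
    printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# Usage on the server:
#   docker ps --format '{{.Names}}' | missing_containers \
#     helios-application-server-1 keycloak helios-notification-1 \
#     helios-webhook-listener-1 helios-client-1 helios-postgres-1 \
#     helios-nats-server-1
```

Reading the names from stdin keeps the comparison separate from Docker, so it can be tried with any list of names.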

Additional Containers

  • cAdvisor

    • Installed by the ITG admins to feed metrics into Grafana dashboards.

    • Runs independently; not managed by the Helios Compose file.

  • nginx

    • Added manually to the same Docker network as the Compose stack.

    • Created with:

    docker run -d \
      --name nginx \
      --restart unless-stopped \
      -p 80:80 -p 443:443 \
      -v /etc/nginx/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
      -v /var/lib/rbg-cert:/var/lib/rbg-cert:ro \
      --net helios-network \
      nginx:latest
    

    The nginx configuration files for each environment are in the repository root: nginx.prod.conf and nginx.staging.conf. No automation copies these files to the server; you must manually copy the appropriate file to /etc/nginx/conf/nginx.conf on the server.

    • SSL/TLS Certificates:

      We are using SSL certificates provided by TUM, which are officially issued and valid for 1 year.

      The certificate files are symlinked to auto-generated paths within /var/lib/rbg-cert, and nginx is configured to use them directly. Because these are symlinks, nginx only needs to be restarted once a year, when the certificates are renewed, to pick up the updated files.

      Production

      ssl_certificate     /var/lib/rbg-cert/live/host:f:asevm84.cit.tum.de.fullchain.pem;
      ssl_certificate_key /var/lib/rbg-cert/live/host:f:asevm84.cit.tum.de.privkey.pem;
      

      Staging

      ssl_certificate     /var/lib/rbg-cert/live/host:f:asevm90.cit.tum.de.fullchain.pem;
      ssl_certificate_key /var/lib/rbg-cert/live/host:f:asevm90.cit.tum.de.privkey.pem;
      

      After each deployment from GitHub, the deployment script runs:

      docker restart nginx
      

      This ensures that nginx’s internal routing rules and certificate references are reloaded and point to the newly created container IPs.

    Warning

    Certificate renewal is handled by the TUM ITG team. Every year, we need to restart the nginx container to apply the new certificates:

    docker restart nginx
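
The manual nginx configuration step described above (copying nginx.prod.conf or nginx.staging.conf to /etc/nginx/conf/nginx.conf) could be scripted roughly as follows. This is a sketch: the helper name install_nginx_conf is hypothetical, and it assumes the configuration file has already been transferred to the current directory on the server.

```shell
# Hypothetical helper: copy the nginx config for the given environment
# ("production" or "staging") from the current directory to the destination
# path (defaults to the path mounted into the nginx container).
install_nginx_conf() {
  case "$1" in
    production) src="nginx.prod.conf" ;;
    staging)    src="nginx.staging.conf" ;;
    *) echo "usage: install_nginx_conf production|staging [dest]" >&2; return 2 ;;
  esac
  dest="${2:-/etc/nginx/conf/nginx.conf}"
  cp -- "$src" "$dest"
}

# Usage on the staging server, from a directory containing nginx.staging.conf:
#   install_nginx_conf staging && docker restart nginx
```

Mapping the environment name to a file name in one place reduces the chance of installing the production configuration on staging or the reverse.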
    

Helios Network

ge89paj@helios-staging:/opt/helios$ docker network ls
NETWORK ID     NAME             DRIVER    SCOPE
bc2e43954dc6   bridge           bridge    local
c67bf6ea6aa7   helios-network   bridge    local
5180e745d32e   host             host      local
40c45d8673a4   none             null      local

The Compose file defines a custom network named helios-network (see the end of compose.prod.yml). All Helios containers (application server, Keycloak, notification service, webhook listener, client, PostgreSQL, NATS) connect to this network. The manually run nginx container must also join helios-network so that it can route traffic to and from these services.
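
If nginx cannot reach the services, one thing to check is whether the nginx container is still attached to helios-network. The sketch below is one way to do that. The helper name is_attached is hypothetical; the docker network inspect and docker network connect commands in the usage example are standard Docker CLI commands.

```shell
# Hypothetical helper: read the names of containers attached to a network from
# stdin (one per line) and succeed only if the given container is listed.
is_attached() {
  grep -qxF -- "$1"
}

# Usage on the server (attaches nginx only if it is not already listed):
#   docker network inspect helios-network \
#     --format '{{range .Containers}}{{println .Name}}{{end}}' | is_attached nginx \
#     || docker network connect helios-network nginx
```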

Docker Volumes

ge89paj@helios-staging:/opt/helios$ docker volume ls
DRIVER    VOLUME NAME
local     helios_db-data
local     helios_nats-data

  • helios_db-data: Stores the PostgreSQL database data. Warning: Do not remove this volume, as there is currently no backup of the database.

  • helios_nats-data: Stores NATS JetStream data for event persistence. If you need to reclaim disk space, you can safely remove helios_nats-data; doing so will clear all persisted NATS state, but won’t impact the PostgreSQL data.
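
Because the two volume names are similar and only one of them is safe to delete, a small guard can help avoid removing the wrong one. The following is a sketch under that assumption. The helper name remove_helios_volume and the DOCKER override variable are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: remove a Helios volume, but only the one this guide
# describes as safe to delete. Set DOCKER to substitute another command
# (for example, `echo` for a dry run).
remove_helios_volume() {
  case "$1" in
    helios_nats-data)
      "${DOCKER:-docker}" volume rm "$1"
      ;;
    *)
      echo "refusing to remove volume '$1'" >&2
      return 1
      ;;
  esac
}

# Usage:
#   DOCKER=echo remove_helios_volume helios_nats-data   # dry run, prints the command
#   remove_helios_volume helios_nats-data               # actual removal
```

Note that docker volume rm refuses to remove a volume that a container still references, so the NATS container has to be removed before the volume can be deleted.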