Production VM Guide
We maintain two servers, one for production and one for staging:
Production server:
https://helios.aet.cit.tum.de/
Staging server:
https://helios-staging.aet.cit.tum.de/
The only required package on both servers is Docker. The production server runs the latest stable version of Helios (main branch), while the staging server runs the latest development version (staging branch).
Both environments use the same Compose file, compose.prod.yml.
Deployment Strategy
Staging Builds: Whenever a push is made to the staging branch, a GitHub Actions workflow triggers a build and deployment to the staging environment.
Production Builds: When a release is created in GitHub’s “Releases” section, a build starts for production. After building the new version, the workflow pauses and waits for approval from a Helios team member before proceeding with deployment.
Before deploying to production, you need to merge the staging branch into the main (production) branch:
git checkout main
git merge --ff-only staging
git push origin main
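Because --ff-only refuses to merge when main contains commits that staging lacks, it can help to check this before starting. The following is a minimal sketch, assuming both branches have already been fetched; the function name can_fast_forward is hypothetical and not part of the Helios repository.

```shell
# Hypothetical helper: succeeds if <target> can be fast-forwarded to <source>,
# i.e. <target> is an ancestor of <source>.
can_fast_forward() {
  local target="$1" source="$2"
  git merge-base --is-ancestor "$target" "$source"
}

# Example (run inside a clone of the Helios repository):
#   git fetch origin
#   can_fast_forward origin/main origin/staging && echo "fast-forward possible"
```

If the check fails, main has diverged from staging and the divergence needs to be resolved before the merge above will succeed.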
Deployment Directory
All deployments (both staging and production) happen under:
/opt/helios/
On each server, you will find these files in /opt/helios:
compose.prod.yml
.env
heliosapp.converted_key_pkcs8.pem
helios-realm.json
File Descriptions
compose.prod.yml
Docker Compose file used to build and run all Helios services in production or staging mode.
.env
Environment variable file.
Some values are secret (e.g., service credentials).
Some values are configuration settings.
This file is overwritten on each deployment by the GitHub Actions workflow.
If you add a new environment variable, update the workflow to include it.
Secrets and variables are stored in GitHub Environments. In the Helios repository settings, there are two environments, “staging” and “production”, each with 20+ variables already configured.
heliosapp.converted_key_pkcs8.pem
The PEM file for the GitHub App.
Used as credentials when making API requests to GitHub.
This file is generated by following the “Generate the Private Key” step in the “Creating a GitHub App” guide.
helios-realm.json
An exported Keycloak realm configuration.
Instead of wiping the database, we export/import Keycloak settings via this file.
It contains client IDs, client secrets, login page settings, token exchange rules, etc.
Environment Variables
The .env file in /opt/helios contains all environment variables for production/staging deployments. GitHub Actions fills this file during deployment.
Below is the complete list of variables, their purpose, and where they are used.
Core Infrastructure
ENVIRONMENT
Environment identifier (prod, staging, dev). Status: Unused in compose (informational only).
POSTGRES_DB
Name of the PostgreSQL database for the application. Used by: postgres (also Keycloak indirectly via DB creation).
POSTGRES_PASSWORD
PostgreSQL password. Used by: postgres, keycloak.
POSTGRES_USER
PostgreSQL username. Used by: postgres, keycloak.
SPRING_PROFILES_ACTIVE
Spring Boot active profile (prod for production). Used by: application-server, notification.
DATASOURCE_URL
JDBC connection string to PostgreSQL. Used by: application-server.
DATASOURCE_USERNAME
DB username for application server JDBC. Should match POSTGRES_USER. Used by: application-server.
DATASOURCE_PASSWORD
DB password for application server JDBC. Should match POSTGRES_PASSWORD. Used by: application-server.
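Since the DATASOURCE_* credentials should match the POSTGRES_* ones, a quick consistency check on the generated file can catch a mismatch early. This is a minimal sketch, assuming the .env file uses plain, unquoted KEY=VALUE lines; the helper names env_value and check_db_credentials are hypothetical.

```shell
# Hypothetical helper: print the value of KEY from a KEY=VALUE env file.
env_value() {
  local file="$1" key="$2"
  # Use the last assignment of KEY and keep everything after the first "=".
  grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
}

# Succeeds only if the JDBC credentials match the PostgreSQL credentials.
check_db_credentials() {
  local file="$1"
  [ "$(env_value "$file" DATASOURCE_USERNAME)" = "$(env_value "$file" POSTGRES_USER)" ] &&
    [ "$(env_value "$file" DATASOURCE_PASSWORD)" = "$(env_value "$file" POSTGRES_PASSWORD)" ]
}

# Example:
#   check_db_credentials /opt/helios/.env || echo "DB credentials differ"
```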
NATS Messaging
NATS_SERVER
Host:port of NATS server. Used by: application-server, notification.
NATS_AUTH_TOKEN
Token for authenticating with NATS. Used by: nats-server, webhook-listener, application-server, notification.
NATS_DURABLE_CONSUMER_NAME
Durable consumer name for message replay. Used by: application-server (the value for notification is hardcoded in compose.prod.yml as notification-consumer).
NATS_CONSUMER_INACTIVE_THRESHOLD_MINUTES
Consumer inactivity threshold. Used by: application-server, notification.
NATS_CONSUMER_ACK_WAIT_SECONDS
Ack wait time for durable consumers. Used by: application-server, notification.
GitHub App / Repository Sync
WEBHOOK_SECRET
HMAC secret for GitHub webhook validation. Used by: webhook-listener.
REPOSITORY_NAME
Comma-separated list of repositories to sync. This value can be empty, since all repositories that install the GitHub App are synced automatically. Used by: application-server.
ORGANIZATION_NAME
GitHub organization name for auto-detection of installation ID. Used by: application-server. Note: Set this value and leave GITHUB_INSTALLATION_ID empty for auto-detection of the GitHub App installation ID.
GITHUB_AUTH_TOKEN
GitHub Personal Access Token, needed only when the GitHub App is not used. We currently use the GitHub App, so leave this empty. Used by: application-server.
RUN_ON_STARTUP_COOLDOWN
Minimum number of minutes since the last sync before a sync runs on startup. Used by: application-server.
SENTRY_DSN
Sentry DSN for error reporting. Used by: application-server.
DATA_SYNC_RUN_ON_STARTUP
Whether to run repository sync on startup. Deploying a new version takes a couple of minutes, whereas syncing takes quite some time, so setting this value to false is safe and avoids running a sync on every deployment. Used by: application-server.
GITHUB_APP_NAME
GitHub App URL-safe name. Used by: application-server.
GITHUB_APP_ID
Numeric ID of GitHub App. Used by: application-server.
GITHUB_CLIENT_ID
OAuth Client ID for GitHub App. Used by: application-server.
GITHUB_INSTALLATION_ID
GitHub App installation ID. Empty if auto-detecting. Used by: application-server.
GITHUB_PRIVATE_KEY_PATH
Path to PKCS#8-formatted GitHub App private key. Used by: application-server.
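Because GITHUB_PRIVATE_KEY_PATH must point to a PKCS#8 key, it can be useful to confirm the format of the file before deploying. The sketch below only inspects the PEM header; the helper name is_pkcs8_pem and the input file name in the commented conversion command are assumptions, and the conversion itself assumes the key was downloaded in PKCS#1 format with openssl available locally.

```shell
# Hypothetical helper: succeeds if the PEM file starts with the
# unencrypted PKCS#8 header.
is_pkcs8_pem() {
  [ "$(head -n 1 "$1")" = "-----BEGIN PRIVATE KEY-----" ]
}

# Converting a PKCS#1 key ("BEGIN RSA PRIVATE KEY") to unencrypted PKCS#8;
# downloaded-key.pem is a placeholder for the key file from GitHub:
#   openssl pkcs8 -topk8 -inform PEM -outform PEM -nocrypt \
#     -in downloaded-key.pem -out heliosapp.converted_key_pkcs8.pem
#
# Example check:
#   is_pkcs8_pem /opt/helios/heliosapp.converted_key_pkcs8.pem || echo "not PKCS#8"
```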
Authentication / Keycloak
KC_BOOTSTRAP_ADMIN_USERNAME
Initial Keycloak admin username. Used by: keycloak.
KC_BOOTSTRAP_ADMIN_PASSWORD
Initial Keycloak admin password. Used by: keycloak.
KC_HOSTNAME
Public hostname for Keycloak. Used by: keycloak.
KC_HTTP_ENABLED
Whether to enable HTTP in Keycloak. Used by: keycloak.
OAUTH_ISSUER_URL
Keycloak realm issuer URL. Used by: application-server.
HELIOS_TOKEN_EXCHANGE_CLIENT
Keycloak client ID for token exchange. Used by: application-server.
HELIOS_TOKEN_EXCHANGE_SECRET
Keycloak client secret for token exchange. Used by: application-server.
Notification Service
MAIL_HOST
SMTP host for sending emails. Used by: notification.
MAIL_PORT
SMTP port. Used by: notification.
EMAIL_ENABLED
Enable/disable email sending. Used by: notification.
EMAIL_FROM
Sender email address. Used by: notification.
Other Application Server Settings
CLEANUP_WORKFLOW_RUN_DRY_RUN
If true, cleanup workflow runs in dry-run mode. Used by: application-server.
HELIOS_ENVIRONMENT_NAME
Used for push-based status updates to Helios. Used by: application-server.
HELIOS_PROD_SECRET_KEY
Used for push-based status updates to Helios. Used by: application-server.
HELIOS_STAGING_SECRET_KEY
Used for push-based status updates to Helios. Used by: application-server.
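Since the .env file is regenerated on every deployment, a variable that was added to the Compose file but not to the workflow will silently be absent. A minimal sketch of a presence check follows, assuming plain KEY=VALUE lines; the helper name missing_env_vars is hypothetical, and the list of variables to pass is up to the operator.

```shell
# Hypothetical helper: print every listed variable that has no KEY= line
# in the given env file. An empty value (KEY=) still counts as present.
missing_env_vars() {
  local file="$1" key
  shift
  for key in "$@"; do
    grep -q "^${key}=" "$file" || echo "$key"
  done
}

# Example:
#   missing_env_vars /opt/helios/.env POSTGRES_DB NATS_SERVER WEBHOOK_SECRET
```

Treating empty values as present is deliberate, because some variables above (for example GITHUB_AUTH_TOKEN) are expected to be left empty.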
Runtime Containers
A typical production (or staging) environment runs multiple Docker containers under the Helios Compose network. For example, on the staging server:
ge89paj@helios-staging:/opt/helios$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
26207bf832fe ghcr.io/ls1intum/helios/application-server:staging "java -javaagent:/ap…" 15 minutes ago Up 15 minutes 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp helios-application-server-1
a20e1d75dbc0 ghcr.io/ls1intum/helios/keycloak:staging "/opt/keycloak/bin/k…" 15 minutes ago Up 15 minutes 8080/tcp, 8443/tcp, 9000/tcp, 0.0.0.0:8081->8081/tcp, :::8081->8081/tcp keycloak
b080f449acb6 ghcr.io/ls1intum/helios/notification:staging "java -javaagent:/ap…" 15 minutes ago Up 15 minutes 8080/tcp helios-notification-1
d339928ea5c6 ghcr.io/ls1intum/helios/webhook-listener:staging "uvicorn app.main:ap…" 15 minutes ago Up 15 minutes 0.0.0.0:4200->4200/tcp, :::4200->4200/tcp helios-webhook-listener-1
43bba36b647e ghcr.io/ls1intum/helios/client:staging "/docker-entrypoint.…" 15 minutes ago Up 15 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp helios-client-1
af2a9ccee144 postgres:16 "docker-entrypoint.s…" 15 minutes ago Up 15 minutes 0.0.0.0:5432->5432/tcp, :::5432->5432/tcp helios-postgres-1
cf206e171655 nats:2.10.26-alpine "docker-entrypoint.s…" 15 minutes ago Up 15 minutes (healthy) 0.0.0.0:4222->4222/tcp, :::4222->4222/tcp, 0.0.0.0:8222->8222/tcp, :::8222->8222/tcp, 6222/tcp helios-nats-server-1
1eda53002e85 nginx:latest "/docker-entrypoint.…" 4 weeks ago Up 15 minutes 0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp nginx
a15495daeb2e gcr.io/cadvisor/cadvisor "/usr/bin/cadvisor -…" 2 months ago Up 2 months (healthy) 0.0.0.0:9111->8080/tcp cadvisor
All containers except nginx and cAdvisor are launched by Compose. The Compose file handles:
Application Server
Keycloak
Notification Service
Webhook Listener
Client (frontend)
PostgreSQL
NATS Server
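After a deployment, the services listed above can be checked against the running containers. This is a minimal sketch that assumes the container names from the staging docker ps listing shown earlier; names on another host may differ, and the helper name missing_containers is hypothetical.

```shell
# Container names as they appear in the staging `docker ps` listing.
HELIOS_CONTAINERS="helios-application-server-1 keycloak helios-notification-1 helios-webhook-listener-1 helios-client-1 helios-postgres-1 helios-nats-server-1"

# Hypothetical helper: print each expected container that is not running.
missing_containers() {
  local running name
  running="$(docker ps --format '{{.Names}}')"
  for name in $HELIOS_CONTAINERS; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$running" | grep -qxF "$name" || echo "$name"
  done
}

# Example:
#   missing_containers   # prints nothing when all seven containers are up
```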
Additional Containers
cAdvisor
- Installed by the ITG admins to feed metrics into Grafana dashboards.
- Runs independently; not managed by the Helios Compose file.
nginx
- Added manually to the same Docker network as the Compose stack.
- Created with:
docker run -d \
  --name nginx \
  --restart unless-stopped \
  -p 80:80 -p 443:443 \
  -v /etc/nginx/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
  -v /var/lib/rbg-cert:/var/lib/rbg-cert:ro \
  --net helios-network \
  nginx:latest
The nginx configuration files for each environment are in the repository root as nginx.prod.conf and nginx.staging.conf. There is no automation to copy these files to the server; you must manually copy the appropriate file to /etc/nginx/conf/nginx.conf on the server.
SSL/TLS Certificates:
We are using SSL certificates provided by TUM, which are officially issued and valid for 1 year.
The certificate files are symlinked to auto-generated paths within /var/lib/rbg-cert, and nginx is configured to use them directly. Because these are symlinks, nginx only needs to be restarted once a year, when the certificates are renewed, to pick up the updated files.
Production
ssl_certificate /var/lib/rbg-cert/live/host:f:asevm84.cit.tum.de.fullchain.pem;
ssl_certificate_key /var/lib/rbg-cert/live/host:f:asevm84.cit.tum.de.privkey.pem;
Staging
ssl_certificate /var/lib/rbg-cert/live/host:f:asevm90.cit.tum.de.fullchain.pem;
ssl_certificate_key /var/lib/rbg-cert/live/host:f:asevm90.cit.tum.de.privkey.pem;
After each deployment from GitHub, the deployment script runs:
docker restart nginx
This ensures that nginx’s internal routing rules and certificate references are reloaded and point to the newly created container IPs.
Warning
The renewal process of certificates is handled by the TUM ITG team. Every year, we need to restart the nginx container to apply the new certificates:
docker restart nginx
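To see when that yearly restart is coming up, the certificate's expiry can be checked on the host. This is a minimal sketch, assuming openssl is installed on the server (the guide itself only requires Docker); the helper name cert_expires_within is hypothetical.

```shell
# Hypothetical helper: succeeds if the certificate expires within <days> days.
# It also succeeds if openssl cannot read the file, so check the path first.
cert_expires_within() {
  local cert="$1" days="$2"
  # -checkend exits 0 when the certificate is still valid after the given
  # number of seconds, so the result is inverted here.
  ! openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null
}

# Example (staging certificate path from the nginx configuration above):
#   cert_expires_within "/var/lib/rbg-cert/live/host:f:asevm90.cit.tum.de.fullchain.pem" 30 \
#     && echo "certificate expires within 30 days"
```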
Helios Network
ge89paj@helios-staging:/opt/helios$ docker network ls
NETWORK ID NAME DRIVER SCOPE
bc2e43954dc6 bridge bridge local
c67bf6ea6aa7 helios-network bridge local
5180e745d32e host host local
40c45d8673a4 none null local
The Compose file defines a custom network named helios-network (see the end of compose.prod.yml). All Helios containers (application server, Keycloak, notification service, webhook listener, client, PostgreSQL, NATS) connect to this network. The manually run nginx container must also join helios-network so that it can route traffic to and from these services.
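Whether nginx is actually attached to helios-network can be verified with docker network inspect. The following is a minimal sketch; the helper name container_on_network is hypothetical.

```shell
# Hypothetical helper: succeeds if <container> is attached to <network>.
container_on_network() {
  local container="$1" network="$2"
  # Print one attached container name per line, then look for an exact match.
  docker network inspect --format '{{range .Containers}}{{println .Name}}{{end}}' "$network" |
    grep -qxF "$container"
}

# Example: attach nginx only if it is not on the network yet.
#   container_on_network nginx helios-network ||
#     docker network connect helios-network nginx
```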
Docker Volumes
ge89paj@helios-staging:/opt/helios$ docker volume ls
DRIVER VOLUME NAME
local helios_db-data
local helios_nats-data
helios_db-data: Stores the PostgreSQL database data. Warning: Do not remove this volume, as there is currently no backup of the database.
helios_nats-data: Stores NATS JetStream data for event persistence. If you need to reclaim disk space, you can safely remove helios_nats-data; doing so will clear all persisted NATS state, but won’t impact the PostgreSQL data.
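Given that no database backup exists, taking a logical dump before any volume maintenance may be worthwhile. This is a minimal sketch, not an established Helios procedure: it assumes the container name helios-postgres-1 from the staging docker ps listing, that POSTGRES_USER and POSTGRES_DB are set in the shell to the values from .env, and that local connections inside the container need no password. The helper name backup_helios_db is hypothetical.

```shell
# Hypothetical helper: write a plain-SQL dump of the Helios database to <outfile>.
backup_helios_db() {
  local outfile="$1"
  # pg_dump runs inside the PostgreSQL container; its output is streamed
  # to a file on the host.
  docker exec helios-postgres-1 pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB" > "$outfile"
}

# Example:
#   backup_helios_db "/tmp/helios-$(date +%F).sql"
```

The dump lives outside the helios_db-data volume, so it survives even if the volume is removed by mistake.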