Production Setup Guide

We maintain two servers, one for production and one for staging:

  • Production server: https://helios.aet.cit.tum.de/

  • Staging server: https://helios-staging.aet.cit.tum.de/

Both environments are deployed with the same Compose file, compose.prod.yml.

Deployment Strategy

  1. Staging Builds: Whenever a push is made to the staging branch, a GitHub Actions workflow triggers a build and deployment to the staging environment.

  2. Production Builds: When a release is created in GitHub’s “Releases” section, a build starts for production.

    • After building the new version, the workflow pauses and waits for approval from a Helios team member before proceeding with deployment.

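Before deploying to production, merge the staging branch into the main (production) branch. The --ff-only flag makes Git refuse the merge, rather than create a merge commit, if main contains commits that staging does not: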

git checkout main
git merge --ff-only staging
git push origin main
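
Because the merge above is fast-forward only, it can be useful to check beforehand whether it will be accepted. The following is a minimal sketch, assuming both branches exist locally and are up to date; the function name can_fast_forward is made up for this example and is not part of the Helios repository.

```shell
#!/bin/sh
# can_fast_forward TARGET SOURCE  (hypothetical helper, not part of Helios)
# Succeeds (exit 0) when TARGET is an ancestor of SOURCE, which is the
# condition under which `git merge --ff-only SOURCE`, run on TARGET,
# is accepted.
can_fast_forward() {
    git merge-base --is-ancestor "$1" "$2"
}

# Example (run inside the repository):
#   git fetch origin
#   can_fast_forward main staging && echo "main can be fast-forwarded to staging"
```

If the check fails, main has commits that staging lacks, and those need to be brought into staging first.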

Deployment Directory

All deployments (both staging and production) happen under:

/opt/helios/

On each server, you will find these files in /opt/helios:

  • compose.prod.yml

  • .env

  • heliosapp.converted_key_pkcs8.pem

  • helios-realm.json

File Descriptions

  • compose.prod.yml: Docker Compose file used to build and run all Helios services in production or staging mode.

  • .env: Environment variable file.

    • Some values are secret (e.g., service credentials).

    • Some values are configuration settings.

    • This file is overwritten on each deployment by the GitHub Actions workflow.

    • If you add a new environment variable, update the workflow to include it.

    • Secrets and variables are stored in GitHub Environments. In the Helios repository settings, there are two environments—“staging” and “production”—each with 20+ variables already configured.

  • heliosapp.converted_key_pkcs8.pem: The PEM private key file for the GitHub App.

    • Used as credentials when making API requests to GitHub.

    • Generated by following the “Generate the Private Key” step in “Creating a GitHub App”.

  • helios-realm.json: An exported Keycloak realm configuration.

    • Instead of wiping the database, we export/import Keycloak settings via this file.

    • It contains client IDs, client secrets, login page settings, token exchange rules, etc.
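
Since .env is regenerated on every deployment, a variable that was added to the code but not to the workflow will simply be absent on the server. The sketch below shows one way to spot that after a deployment. The function name missing_env_vars and the variable names in the usage comment are placeholders, and the check only recognises plain NAME=value lines (not export NAME=value).

```shell
#!/bin/sh
# missing_env_vars FILE NAME...  (hypothetical helper, not part of Helios)
# Prints each NAME that has no `NAME=` assignment line in FILE and
# returns non-zero if at least one is missing.
missing_env_vars() {
    file=$1
    shift
    status=0
    for name in "$@"; do
        # Anchor at line start so commented-out entries do not count.
        if ! grep -q "^${name}=" "$file"; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    return "$status"
}

# Example (variable names are placeholders):
#   missing_env_vars /opt/helios/.env EXAMPLE_VAR_ONE EXAMPLE_VAR_TWO
```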

Runtime Containers

A typical production (or staging) environment runs multiple Docker containers under the Helios Compose network. For example, on the staging server:

ge89paj@helios-staging:/opt/helios$ docker ps
CONTAINER ID   IMAGE                                                COMMAND                  CREATED          STATUS                    PORTS                                                                                            NAMES
26207bf832fe   ghcr.io/ls1intum/helios/application-server:staging   "java -javaagent:/ap…"   15 minutes ago   Up 15 minutes             0.0.0.0:8080->8080/tcp, :::8080->8080/tcp                                                        helios-application-server-1
a20e1d75dbc0   ghcr.io/ls1intum/helios/keycloak:staging             "/opt/keycloak/bin/k…"   15 minutes ago   Up 15 minutes             8080/tcp, 8443/tcp, 9000/tcp, 0.0.0.0:8081->8081/tcp, :::8081->8081/tcp                          keycloak
b080f449acb6   ghcr.io/ls1intum/helios/notification:staging         "java -javaagent:/ap…"   15 minutes ago   Up 15 minutes             8080/tcp                                                                                         helios-notification-1
d339928ea5c6   ghcr.io/ls1intum/helios/webhook-listener:staging     "uvicorn app.main:ap…"   15 minutes ago   Up 15 minutes             0.0.0.0:4200->4200/tcp, :::4200->4200/tcp                                                        helios-webhook-listener-1
43bba36b647e   ghcr.io/ls1intum/helios/client:staging               "/docker-entrypoint.…"   15 minutes ago   Up 15 minutes             0.0.0.0:90->80/tcp, :::90->80/tcp                                                                helios-client-1
af2a9ccee144   postgres:16                                          "docker-entrypoint.s…"   15 minutes ago   Up 15 minutes             0.0.0.0:5432->5432/tcp, :::5432->5432/tcp                                                        helios-postgres-1
cf206e171655   nats:2.10.26-alpine                                  "docker-entrypoint.s…"   15 minutes ago   Up 15 minutes (healthy)   0.0.0.0:4222->4222/tcp, :::4222->4222/tcp, 0.0.0.0:8222->8222/tcp, :::8222->8222/tcp, 6222/tcp   helios-nats-server-1
1eda53002e85   nginx:latest                                         "/docker-entrypoint.…"   4 weeks ago      Up 15 minutes             0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp                         nginx
a15495daeb2e   gcr.io/cadvisor/cadvisor                             "/usr/bin/cadvisor -…"   2 months ago     Up 2 months (healthy)     0.0.0.0:9111->8080/tcp                                                                           cadvisor

All containers except nginx and cAdvisor are launched by Compose. The Compose file handles:

  • Application Server

  • Keycloak

  • Notification Service

  • Webhook Listener

  • Client (frontend)

  • PostgreSQL

  • NATS Server
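
To confirm after a deployment that every Compose-managed service came up, the running container names can be compared against the expected ones. This is a minimal sketch: the helper name missing_containers is hypothetical, and the expected names are taken from the staging listing above, so they may differ on another host.

```shell
#!/bin/sh
# missing_containers NAME...  (hypothetical helper, not part of Helios)
# Reads running container names from stdin (one per line) and prints each
# expected NAME that is not among them. Returns non-zero if any is missing.
missing_containers() {
    running=$(cat)
    status=0
    for name in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string.
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    return "$status"
}

# Example, using the container names from the staging listing:
#   docker ps --format '{{.Names}}' | missing_containers \
#       helios-application-server-1 keycloak helios-notification-1 \
#       helios-webhook-listener-1 helios-client-1 helios-postgres-1 \
#       helios-nats-server-1
```

Reading the names from stdin keeps the helper independent of Docker itself, so it can also be fed a saved listing.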

Additional Containers

  • cAdvisor

    • Installed by the ITG admins to feed metrics into Grafana dashboards.

    • Runs independently; not managed by the Helios Compose file.

  • nginx

    • Added manually to the same Docker network as the Compose stack.

    • Created with:

    docker run -d \
      --name nginx \
      --restart unless-stopped \
      -p 80:80 -p 443:443 \
      -v /etc/nginx/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
      -v /etc/nginx/certs:/etc/nginx/certs:ro \
      --net helios-network \
      nginx:latest
    
    • SSL/TLS Certificates:

    Warning: Renew the certificates for both production and staging at least every 90 days, before they expire!

    Certificates are generated manually using Certbot. For example:

    sudo certbot certonly --standalone -d helios.aet.cit.tum.de
    

    This creates certificate files under /etc/letsencrypt. After generating the certificates, update the nginx configuration file at /etc/nginx/conf/nginx.conf to reference the new certificate and key files. A typical SSL snippet in nginx.conf:

    server {
        listen 443 ssl;
        server_name helios.aet.cit.tum.de;
    
        ssl_certificate     /etc/letsencrypt/live/helios.aet.cit.tum.de/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/helios.aet.cit.tum.de/privkey.pem;
    
        # ... other configuration ...
    }
    

    A reference version of the nginx configuration lives in the repository root as nginx.conf. To see the live, up-to-date configuration, refer to the file on the server at /etc/nginx/conf/nginx.conf.

    • Important: After each deployment (docker-compose up), the deployment script runs:

      docker restart nginx
      

      This makes nginx reload its configuration and certificates and resolve the service hostnames again, so that its routing rules point to the IP addresses of the newly created containers.
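
Two small helpers can make the manual renewal and restart steps less error-prone. This is only a sketch: the function names are made up for this example, the container name nginx is taken from the docker run command above, and reading files under /etc/letsencrypt usually requires root.

```shell
#!/bin/sh
# Both helpers are hypothetical and not part of the Helios repository.

# cert_expires_within DAYS CERT_FILE
# Succeeds when the certificate in CERT_FILE expires within DAYS days, or
# when the file cannot be read. `openssl x509 -checkend N` exits 0 only if
# the certificate is still valid N seconds from now, so the result is
# inverted here.
cert_expires_within() {
    ! openssl x509 -checkend "$(( $1 * 86400 ))" -noout -in "$2" >/dev/null 2>&1
}

# restart_nginx_checked
# Validates the configuration inside the running container with `nginx -t`
# and restarts the container only if that check passes.
restart_nginx_checked() {
    docker exec nginx nginx -t && docker restart nginx
}

# Example:
#   cert=/etc/letsencrypt/live/helios.aet.cit.tum.de/fullchain.pem
#   cert_expires_within 30 "$cert" && echo "certificate needs renewal soon"
#   restart_nginx_checked || echo "nginx.conf is invalid; nginx not restarted"
```

Testing the configuration before the restart avoids taking the proxy down because of a typo in nginx.conf.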

Helios Network

ge89paj@helios-staging:/opt/helios$ docker network ls
NETWORK ID     NAME             DRIVER    SCOPE
bc2e43954dc6   bridge           bridge    local
c67bf6ea6aa7   helios-network   bridge    local
5180e745d32e   host             host      local
40c45d8673a4   none             null      local

The Compose file defines a custom network named helios-network (see the end of compose.prod.yml). All Helios containers (application server, Keycloak, notification service, webhook listener, client, PostgreSQL, NATS) connect to this network. The manually run nginx container must also join helios-network so that it can route traffic to and from these services.
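
If the nginx container is ever recreated without the --net option, it can be attached to the network afterwards. The sketch below checks membership first and connects only when needed; the helper name ensure_on_network is hypothetical.

```shell
#!/bin/sh
# ensure_on_network NETWORK CONTAINER  (hypothetical helper, not part of Helios)
# Connects CONTAINER to NETWORK unless it is already attached.
ensure_on_network() {
    network=$1
    container=$2
    # List the names of all containers attached to the network, one per line.
    attached=$(docker network inspect \
        --format '{{range .Containers}}{{println .Name}}{{end}}' \
        "$network") || return 1
    if printf '%s\n' "$attached" | grep -qxF -- "$container"; then
        return 0
    fi
    docker network connect "$network" "$container"
}

# Example:
#   ensure_on_network helios-network nginx
```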

Docker Volumes

ge89paj@helios-staging:/opt/helios$ docker volume ls
DRIVER    VOLUME NAME
local     helios_db-data
local     helios_nats-data

  • helios_db-data: Stores the PostgreSQL database data. Warning: Do not remove this volume; there is currently no backup of the database.

  • helios_nats-data: Stores NATS JetStream data for event persistence. If you need to reclaim disk space, you can safely remove helios_nats-data; doing so will clear all persisted NATS state, but won’t impact the PostgreSQL data.
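
Given that the database has no backup, a logical dump taken before risky maintenance is cheap insurance. The following is a minimal sketch, not an established Helios procedure: the helper name backup_postgres is made up, the container name comes from the docker ps listing above, and the database user postgres is an assumption that should be replaced with the user configured in .env.

```shell
#!/bin/sh
# backup_postgres OUTFILE [CONTAINER] [DB_USER]  (hypothetical helper)
# Writes a plain-SQL dump of all databases in the PostgreSQL container to
# OUTFILE. CONTAINER defaults to the name shown in the `docker ps` listing.
# DB_USER defaults to "postgres", which is an assumption.
backup_postgres() {
    outfile=$1
    container=${2:-helios-postgres-1}
    db_user=${3:-postgres}
    partial="$outfile.partial"
    # Dump to a temporary file first so a failed dump never leaves a
    # truncated file under the final name.
    if docker exec "$container" pg_dumpall -U "$db_user" > "$partial"; then
        mv "$partial" "$outfile"
    else
        rm -f "$partial"
        return 1
    fi
}

# Example:
#   backup_postgres "/opt/helios/backup-$(date +%Y%m%d).sql"
```

The dump contains all application data, so it should be stored with the same care as the .env file.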