
Production Setup

This guide covers the minimum configuration required to deploy Hephaestus in production. It assumes you are familiar with Docker Compose or Kubernetes and have access to the protected secrets managed by TUM.

Platform overview

The production stack consists of:

  • Application server (Spring Boot, runs the Pi mentor agent in-process)
  • Webhook server (same Spring Boot artifact, webhook profile → NATS JetStream)
  • React webapp (served via nginx)
  • PostgreSQL 16
  • Hephaestus-native auth (Spring Security oauth2Login federating to GitHub and, optionally, GitLab on gitlab.com or a self-hosted instance; issues cookie-session JWTs; see ADR 0017)

Environment variables

Set the following secrets before starting the stack:

Variable | Purpose
HEPHAESTUS_SECURITY_ENCRYPTION_KEY | Required. AES-256 key (exactly 32 characters) for credentials encrypted at rest in the connection table. Must be set before the first deploy of the unified integration framework: the Liquibase backfill re-encrypts existing credentials on boot and fails fast (or HALTs the migration) without it. Keep it stable; rotating requires re-encrypting all rows. Generate: openssl rand -base64 24 | cut -c1-32.
WEBHOOK_SECRET | HMAC secret for GitHub and GitLab webhooks (prefer SHA-256; e.g., openssl rand -hex 32)
GITHUB_OAUTH_CLIENT_ID | GitHub OAuth app client ID (callback https://<host>/api/login/oauth2/code/github)
GITHUB_OAUTH_CLIENT_SECRET | GitHub OAuth app client secret
HEPHAESTUS_AUTH_ISSUER | Public issuer origin, e.g. https://<host> (sets the JWT iss + /.well-known issuer)
HEPHAESTUS_AUTH_STATE_COOKIE_KEY | Base64 32-byte AES key sealing the OAuth state cookies (openssl rand -base64 32)
HEPHAESTUS_INTEGRATION_SLACK_ENABLED | Optional: enables the Slack OAuth admin surface (per-workspace connect)
HEPHAESTUS_INTEGRATION_SLACK_CLIENT_ID | Optional: Slack app client ID used by the OAuth flow
HEPHAESTUS_INTEGRATION_SLACK_CLIENT_SECRET | Optional: Slack app client secret used by the OAuth flow
HEPHAESTUS_INTEGRATION_SLACK_REDIRECT_URI | Optional: Slack OAuth callback URL (typically https://<host>/oauth/callback/slack)
HEPHAESTUS_INTEGRATION_OAUTH_SUCCESS_REDIRECT | Optional: post-callback landing URL on success. Defaults to /integrations?status=success. Set the absolute URL when the SPA host differs from the API host.
HEPHAESTUS_INTEGRATION_OAUTH_FAILURE_REDIRECT | Optional: post-callback landing URL on failure. Defaults to the success base + ?status=error.
LEGAL_PROFILE | Selects a bundled imprint/privacy profile. Use tumaet only for the canonical AET deployment; self-hosters leave empty and mount /legal-overrides/. See Legal Pages.

Per-workspace bot tokens are issued via OAuth and encrypted at rest in the connection table; there is no longer a global bot token.
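The key-shape requirements in the table above are easy to get wrong, so a small pre-deploy check may help. The following is a minimal sketch, not the server's actual validation logic (which this guide does not describe); the function names are hypothetical, and it only checks the two shapes stated in the table: exactly 32 characters for HEPHAESTUS_SECURITY_ENCRYPTION_KEY, and Base64 decoding to 32 bytes for HEPHAESTUS_AUTH_STATE_COOKIE_KEY.

```python
import base64
import binascii


def check_encryption_key(value: str) -> bool:
    """True when the key is exactly 32 characters, as the table requires."""
    return len(value) == 32


def check_state_cookie_key(value: str) -> bool:
    """True when the value is strict Base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        # Not valid Base64 (bad characters or bad padding).
        return False
    return len(raw) == 32
```

Running both checks against the values in your secret manager before the first boot surfaces a malformed key earlier than waiting for the Liquibase backfill to fail.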

GitHub integration

Hephaestus can connect to GitHub using either a Personal Access Token (PAT) or a GitHub App:

  • Personal Access Token: Simpler to set up. Configure tokens through the workspace UI after deployment. No additional environment variables required.
  • GitHub App (optional): Provides more granular permissions and better rate limits. If using a GitHub App, set:
    • GH_APP_ID: Your GitHub App ID
    • GH_APP_PRIVATE_KEY: Your GitHub App private key (PEM format)
    • GH_APP_PRIVATE_KEY_LOCATION: Alternative path to private key file (optional)
    • GH_APP_INSTALLATION_URL: Install URL shown in the workspace creation wizard

The application works with either approach. If no GitHub App credentials are provided (GH_APP_ID defaults to 0), workspaces use Personal Access Tokens instead.
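The fallback rule can be illustrated with a short sketch. This is an assumption-laden illustration rather than the application's code: the function name is hypothetical, and treating a missing private key as "no GitHub App" is an assumption this guide does not confirm. Only the GH_APP_ID=0 default comes from the text above.

```python
from typing import Mapping


def github_auth_mode(env: Mapping[str, str]) -> str:
    """Return "github-app" when a GitHub App looks configured, else "pat"."""
    app_id = env.get("GH_APP_ID", "0").strip()
    # Assumption: the app also needs a key, inline or via a file location.
    has_key = bool(
        env.get("GH_APP_PRIVATE_KEY") or env.get("GH_APP_PRIVATE_KEY_LOCATION")
    )
    if app_id not in ("", "0") and has_key:
        return "github-app"
    return "pat"
```

For example, `github_auth_mode({})` yields `"pat"`, matching the documented default.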

GitLab rollout bundle

GitLab login, GitLab workspaces, webhook auto-registration, and practice review are not independent toggles. Treat the following as a rollout bundle and enable them together when rolling out GitLab practice review:

Variable | Purpose
GITLAB_OAUTH_CLIENT_ID | GitLab OAuth application client ID (callback https://<host>/api/login/oauth2/code/gitlab)
GITLAB_OAUTH_CLIENT_SECRET | GitLab OAuth application client secret
GITLAB_OAUTH_BASE_URL | GitLab instance the login button federates to. Defaults to https://gitlab.com; set to your self-hosted instance (e.g. https://gitlab.lrz.de)
GITLAB_OAUTH_DISPLAY_NAME | Login-button label (defaults to GitLab; e.g. gitlab.lrz.de)
GITLAB_DEFAULT_SERVER_URL | Default GitLab instance for workspace creation / SCM sync (not auth). Falls back to GITLAB_OAUTH_BASE_URL
GITLAB_ENABLED | Enables server-side GitLab beans in the application server
GITLAB_WORKSPACE_CREATION | Enables GitLab workspace creation in the UI and API
WEBHOOK_SECRET | Shared secret used for GitLab webhook verification and auto-registration
WEBHOOK_EXTERNAL_URL | Public webhook base URL registered on GitLab

The login provider is instance-agnostic: gitlab.com works out of the box, and any self-hosted GitLab works by pointing GITLAB_OAUTH_BASE_URL at it. Create the GitLab OAuth application on that instance with these settings:

  • Redirect URI: https://<host>/api/login/oauth2/code/gitlab
  • Scope: read_user
  • Confidential client: enabled

These variables seed one default GitLab login provider (registration id gitlab) from the environment. An instance admin can add further GitLab instances at runtime under Instance admin → Login providers, with no redeploy.
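The defaults and fallbacks described in this section can be summarized in a small sketch. It is illustrative only: the function name and the returned dictionary keys are hypothetical, and the real resolution happens inside the application server. The default values and the callback path are the ones stated above.

```python
from typing import Dict, Mapping


def resolve_gitlab_settings(env: Mapping[str, str], host: str) -> Dict[str, str]:
    """Apply the documented GitLab defaults and build the redirect URI."""
    # Login instance: gitlab.com unless GITLAB_OAUTH_BASE_URL is set.
    oauth_base = (env.get("GITLAB_OAUTH_BASE_URL") or "https://gitlab.com").rstrip("/")
    # Workspace/SCM instance falls back to the login instance.
    default_server = (env.get("GITLAB_DEFAULT_SERVER_URL") or oauth_base).rstrip("/")
    display_name = env.get("GITLAB_OAUTH_DISPLAY_NAME") or "GitLab"
    return {
        "oauth_base_url": oauth_base,
        "default_server_url": default_server,
        "display_name": display_name,
        # Redirect URI to register on the GitLab OAuth application.
        "redirect_uri": f"https://{host}/api/login/oauth2/code/gitlab",
    }
```

With only `GITLAB_OAUTH_BASE_URL=https://gitlab.lrz.de` set, both the login and the workspace instance resolve to gitlab.lrz.de, while the button label stays "GitLab".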

Practice review rollout bundle

Practice review is not enabled by a single flag. The following settings must be aligned:

Variable | Purpose
PRACTICE_REVIEW_FOR_ALL | Enables the feature flag and the detection gate for all users
PRACTICE_REVIEW_SKIP_DRAFTS | Skips draft PRs and draft merge requests
PRACTICE_REVIEW_DELIVER_TO_MERGED | Allows delivery after merge
PRACTICE_REVIEW_COOLDOWN_MINUTES | Minimum delay between repeated reviews of the same PR/MR
PRACTICE_REVIEW_APP_BASE_URL | Footer link in review comments
SANDBOX_ENABLED | Enables the Docker sandbox used by the coding agent
AGENT_NATS_ENABLED | Enables the agent job queue and executor
GIT_CHECKOUT_ENABLED | Enables local repo checkout and bind-mount into agent containers
NATS_ENABLED | Enables webhook-driven sync consumption
NATS_DURABLE_CONSUMER_NAME | Durable consumer name for sync processing

SANDBOX_ENABLED, AGENT_NATS_ENABLED, GIT_CHECKOUT_ENABLED, and NATS_ENABLED must be true together for practice review. Anything else is a half-configured deployment.
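The all-or-nothing rule for the four execution flags lends itself to an automated pre-deploy check. The sketch below is one possible way to classify an environment; the function name and the three state labels are hypothetical, and it assumes the flags are expressed as the literal strings true/false (case-insensitive).

```python
from typing import Mapping

REQUIRED_FLAGS = (
    "SANDBOX_ENABLED",
    "AGENT_NATS_ENABLED",
    "GIT_CHECKOUT_ENABLED",
    "NATS_ENABLED",
)


def practice_review_bundle_state(env: Mapping[str, str]) -> str:
    """Classify the four execution flags as complete, disabled, or half-configured."""
    enabled = [
        name for name in REQUIRED_FLAGS
        if env.get(name, "").strip().lower() == "true"
    ]
    if len(enabled) == len(REQUIRED_FLAGS):
        return "complete"
    if not enabled:
        return "disabled"
    # Some but not all flags are on: the state this guide warns about.
    return "half-configured"
```

A CI step could fail the rollout whenever the result is `"half-configured"`, so a partial bundle never reaches staging or production.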

Rollout tiers

The same variables take different values in each environment:

Environment | Intended scope
Preview | Limited by default. Keep GitLab login, GitLab workspaces, sandbox, git checkout, and agent job execution disabled unless preview is explicitly being used as rollout validation.
Staging | Uses the production compose files, but should still be a controlled rollout. Enable GitLab login and GitLab workspace creation first. Only enable SANDBOX_ENABLED, AGENT_NATS_ENABLED, GIT_CHECKOUT_ENABLED, and NATS_ENABLED when staging is intentionally validating practice review execution.
Production | Full rollout only after staging has validated the exact same bundle.

Staging and production both use the production compose files. The difference should come from the .env values, not from a different compose topology.

AI model configuration

The mentor (Pi agent run in-process) and the sandboxed practice-review proxy both flow through the application server's LLM proxy routes. There is no separate intelligence service to configure.

For Azure-backed practice review in the application server:

Variable | Purpose
LLM_PROXY_AZURE_OPENAI_URL | Azure OpenAI resource URL used by the dedicated Azure proxy route
LLM_PROXY_AZURE_OPENAI_AUTH_HEADER | Usually api-key
LLM_PROXY_AZURE_OPENAI_USE_BEARER | Usually false for Azure OpenAI

The generic LLM_PROXY_OPENAI_* settings only affect the OpenAI-compatible proxy route. The sandboxed Pi agent uses the dedicated Azure route when configured for Azure OpenAI.

Webhook secret: Set WEBHOOK_SECRET to the secret configured on your GitHub and GitLab webhooks. Prefer SHA-256 (X-Hub-Signature-256) and generate with openssl rand -hex 32 (base64 also works).
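To illustrate what the SHA-256 signature check involves, here is a minimal sketch of GitHub-style verification, where the X-Hub-Signature-256 header carries `sha256=` followed by the hex HMAC of the raw request body. The function names are hypothetical and this is not the webhook server's implementation; GitLab verification is not covered by this sketch.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Compute the X-Hub-Signature-256 value for a raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_body(secret, body).encode("utf-8")
    received = header_value.encode("utf-8")
    return hmac.compare_digest(expected, received)
```

The comparison uses `hmac.compare_digest` rather than `==` to avoid leaking timing information, and the signature must be computed over the exact bytes received, before any JSON parsing.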

Data sync and backfill

The sync scheduler fetches recent GitHub activity (issues, PRs, reviews) for monitored repositories. Backfill optionally syncs historical data in batches to avoid rate limit exhaustion.

Variable | Default | Purpose
MONITORING_RUN_ON_STARTUP | true | Run initial sync when the application starts
MONITORING_TIMEFRAME | 7 | Days of recent activity to sync each cycle
MONITORING_SYNC_CRON | 0 0 * * * * | Cron schedule for sync (Spring six-field format with seconds first; hourly by default)
MONITORING_SYNC_COOLDOWN_IN_MINUTES | 60 | Minimum gap between syncs per repository
MONITORING_BACKFILL_ENABLED | true | Enable historical data backfill
MONITORING_BACKFILL_BATCH_SIZE | 50 | Issues/PRs per backfill batch
MONITORING_BACKFILL_RATE_LIMIT_THRESHOLD | 500 | Skip backfill when the remaining rate limit is below this value
MONITORING_BACKFILL_INTERVAL_SECONDS | 60 | Interval between backfill batches

Backfill behavior: After recent sync completes, backfill works backwards from the highest issue number, syncing in small batches. It pauses when rate limits drop below the threshold and resumes on the next cycle. Progress is checkpointed per repository.
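The backfill behavior can be sketched as a single-cycle function. This is a simplified model under stated assumptions, not the scheduler's code: `BackfillState` and `run_backfill_cycle` are hypothetical names, the checkpoint is held in memory rather than persisted per repository, and actual fetching is replaced by returning the issue numbers a batch would cover. The defaults mirror the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackfillState:
    # Highest issue number not yet backfilled; 0 means backfill is finished.
    next_issue: int


def run_backfill_cycle(
    state: BackfillState,
    rate_limit_remaining: int,
    batch_size: int = 50,
    threshold: int = 500,
) -> List[int]:
    """Process at most one batch and return the issue numbers it covers."""
    if state.next_issue <= 0:
        return []  # nothing left to backfill
    if rate_limit_remaining < threshold:
        return []  # pause; the checkpoint is untouched, so the next cycle resumes here
    lowest = max(state.next_issue - batch_size + 1, 1)
    batch = list(range(state.next_issue, lowest - 1, -1))  # newest to oldest
    state.next_issue = lowest - 1  # checkpoint progress
    return batch
```

Starting from issue 120 with ample rate limit, the first cycle covers issues 120 down to 71 and leaves the checkpoint at 70; a cycle with only 100 requests remaining returns nothing and leaves the checkpoint unchanged.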

Deployment steps

  1. Provision infrastructure: Ensure PostgreSQL, Redis (if used), and storage volumes are ready.
  2. Configure authentication:
    • Copy server/.env.example to server/.env, then populate GITHUB_OAUTH_CLIENT_ID, GITHUB_OAUTH_CLIENT_SECRET, HEPHAESTUS_AUTH_ISSUER, and HEPHAESTUS_AUTH_STATE_COOKIE_KEY. The GitHub OAuth app's callback must be https://<host>/api/login/oauth2/code/github.
    • Confirm the GitHub identity provider requests the user:email scope, and disable username/password login flows to keep authentication SSO-only.
    • If rolling out GitLab login, set GITLAB_OAUTH_CLIENT_ID/SECRET (callback .../api/login/oauth2/code/gitlab), plus GITLAB_OAUTH_BASE_URL (defaults to https://gitlab.com; point it at a self-hosted instance such as https://gitlab.lrz.de) and optionally GITLAB_OAUTH_DISPLAY_NAME. This env-seeds the default GitLab login provider on first boot. Additional GitLab instances are added at runtime by an instance admin under Instance admin → Login providers (one OAuth app per instance) — no redeploy. The screen shows the exact redirect URI to register on the upstream GitLab OAuth app.
  3. Bootstrap secrets: Load environment variables into your secret manager or .env files consumed by Docker/Kubernetes.
  4. Deploy services: Use the provided Compose files (docker/compose.app.yaml, docker/compose.core.yaml, docker/compose.proxy.yaml) or your Kubernetes manifests.
  5. Run database migrations: The application server runs Liquibase migrations on startup; monitor logs to confirm success.
  6. Verify integrations:
    • Designate the first super-admin before first boot via the allowlist HEPHAESTUS_AUTH_BOOTSTRAP_ADMINS=<provider>:@<username> (e.g. github:@octocat), or use the one-time break-glass HEPHAESTUS_AUTH_BOOTSTRAP_TOKEN. Both are idempotent and lockout-safe (see the docs/runbooks/auth-cutover.md runbook → First instance admin), and both promote the account on login; subsequent admins are managed from /admin/users. Hand-editing account.app_role = 'APP_ADMIN' in SQL is a last-resort fallback only.
    • Verify that the super admin user can access workspace admin endpoints for workspaces where they have membership (they are automatically elevated to workspace ADMIN level for those workspaces).
    • Trigger a test webhook from GitHub or GitLab to validate the ingest pipeline.
    • Verify that GITLAB_ENABLED, GITLAB_WORKSPACE_CREATION, SANDBOX_ENABLED, AGENT_NATS_ENABLED, GIT_CHECKOUT_ENABLED, and NATS_ENABLED are all present in the deployed environment before testing GitLab practice review.
    • Open /imprint and /privacy. If the red "not configured" banner is shown, configure a legal profile or overrides before opening the deployment to end users — serving the built-in disclaimer is a § 5 DDG / Art. 13 GDPR violation.
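A typo in the bootstrap allowlist entry from step 6 (format `<provider>:@<username>`) would leave the instance without an admin, so it can be worth validating the entry before first boot. The sketch below checks a single entry only; the function name is hypothetical, the server's own parsing rules are not documented here, and how multiple entries are separated is deliberately left out because this guide does not state it.

```python
from typing import Tuple


def parse_bootstrap_admin(entry: str) -> Tuple[str, str]:
    """Split one '<provider>:@<username>' entry into (provider, username)."""
    provider, sep, handle = entry.strip().partition(":")
    if not sep or not provider:
        raise ValueError(f"missing '<provider>:' prefix in {entry!r}")
    if not handle.startswith("@") or len(handle) < 2:
        raise ValueError(f"username must be written as '@<username>' in {entry!r}")
    return provider, handle[1:]
```

For instance, `parse_bootstrap_admin("github:@octocat")` returns `("github", "octocat")`, while `"github:octocat"` (missing the `@`) raises a `ValueError`.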

Global admin privileges

Accounts with app_role = 'APP_ADMIN' carry the admin authority in their JWT. In every workspace where they already have membership, this grants them:

  • Automatic workspace ADMIN privileges (any membership role qualifies: OWNER, ADMIN, or MEMBER)
  • Ability to manage workspace settings, members, and repositories

Two limits apply:

  • Membership is still required: an APP_ADMIN cannot access a workspace they do not belong to.
  • OWNER-only operations (e.g., workspace ownership transfer) remain unavailable unless the account explicitly holds the OWNER role in that workspace.

Platform administrators can therefore troubleshoot and manage their member workspaces without an explicit ADMIN role assignment in the database.
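The elevation rules in this section reduce to a small decision function. The following is an illustrative sketch, not the application's authorization code; the function name is hypothetical and roles are modeled as plain strings.

```python
from typing import Optional

WORKSPACE_ROLES = ("OWNER", "ADMIN", "MEMBER")


def effective_workspace_role(
    is_app_admin: bool, membership_role: Optional[str]
) -> Optional[str]:
    """Return the role a user effectively holds in one workspace, or None."""
    if membership_role is None:
        return None  # no membership means no access, even for APP_ADMIN
    if membership_role not in WORKSPACE_ROLES:
        raise ValueError(f"unknown workspace role: {membership_role!r}")
    if membership_role == "OWNER":
        return "OWNER"  # OWNER is only ever held explicitly
    if is_app_admin:
        return "ADMIN"  # APP_ADMIN members are elevated to ADMIN
    return membership_role
```

An APP_ADMIN who is a plain MEMBER is treated as ADMIN, but never as OWNER, and an APP_ADMIN without membership gets no role at all.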

Operational tips

  • Monitor services with the central Prometheus/Loki stack; ensure trace IDs appear in logs.
  • Schedule regular backups for PostgreSQL (it now holds auth state — accounts, identity links, sessions — in addition to application data).
  • Review weekly leaderboard Slack posts to ensure the automation is active.
  • Preview deployments do not currently represent the full GitLab practice-review rollout unless they are explicitly wired with the same GitLab IdP, sandbox, git checkout, and agent NATS settings.

Support

Contact the Hephaestus core team if you need to rotate secrets or migrate infrastructure. Document any deviations from this checklist in the deployment runbook.