Testing Guide
Testing keeps regression risk low across services. The application server uses Maven with dedicated base classes for unit and integration tests.
Quick start
- Decide whether you need a fast unit test or a Spring-backed integration test.
- Run the appropriate command:
    # Unit tests only (fast)
    ./mvnw test
    # Full suite including integration tests
    ./mvnw verify
- Extend the correct base class:
    // Unit test
    class UserServiceTest extends BaseUnitTest { }
    // Integration test
    class UserServiceIntegrationTest extends BaseIntegrationTest { }
:::warning Naming matters
Use *Test.java for unit tests and *IntegrationTest.java for integration tests. Maven's surefire and failsafe plugins use these patterns.
:::
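One subtlety of the convention above is that a name ending in IntegrationTest.java also ends in Test.java, so anything that sorts files by suffix has to check the more specific pattern first. The sketch below illustrates that ordering in plain Java. TestKind and TestNaming are hypothetical names invented for this example; this is not the Surefire or Failsafe configuration, which this guide does not show.

```java
/** Hypothetical classification that mirrors the naming convention; not part of the code base. */
enum TestKind { UNIT, INTEGRATION, NOT_A_TEST }

final class TestNaming {
    private TestNaming() {}

    /** Classifies a source file name, checking the more specific suffix first. */
    static TestKind classify(String fileName) {
        if (fileName.endsWith("IntegrationTest.java")) {
            return TestKind.INTEGRATION;
        }
        if (fileName.endsWith("Test.java")) {
            return TestKind.UNIT;
        }
        return TestKind.NOT_A_TEST;
    }
}
```

If the two checks were swapped, UserServiceIntegrationTest.java would be classified as a unit test.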
Why testing?
Tests catch regressions before they hit production, document expected behaviour, and unlock safe refactors. Our setup keeps the feedback loop fast – unit tests finish in under a second, while the shared Testcontainers environment limits the cost of integration suites.
Unit tests
- Extend BaseUnitTest for isolated component tests.
- Use Mockito (@Mock, @InjectMocks) for dependencies.
- Expect sub-second runtime – no Spring context is loaded.
class UserServiceTest extends BaseUnitTest {
    @Mock private UserRepository repository;
    @InjectMocks private UserService service;

    @Test
    @DisplayName("Should validate email format")
    void shouldValidateEmailFormat() {
        // Fast isolated test
    }
}
Integration tests
- Extend BaseIntegrationTest for cross-component behaviour.
- The base class relies on a shared PostgreSQL Testcontainers instance (significantly faster than per-test containers).
- It provides annotations such as @GitHubPayload("label.created") to inject recorded webhook payloads.
Example:
class GitHubLabelMessageHandlerIntegrationTest extends BaseIntegrationTest {
    @Test
    @DisplayName("Should persist label from webhook")
    void shouldPersistLabel(GHEventPayload.Label payload) {
        // Test with real webhook data
    }
}
Controller integration tests
Spring's own docs and battle-tested community guides (see rieckpil.de and Baeldung) emphasise two things:
- Test controllers at the HTTP boundary to catch security misconfigurations.
- Keep fixtures reusable so every new endpoint does not need bespoke bootstrapping.
To make that concrete for Hephaestus:
- Extend AbstractWorkspaceIntegrationTest (or a service-specific base) so every controller IT automatically cleans the database, provides helper factories such as persistUser, and exposes workspaceService for setup.
- Use WebTestClient instead of MockMvc. We already run the full Spring Boot context, which lets us verify access control rules (401 unauthenticated vs 403 non-admin) exactly as delivered to the frontend.
- Reach for the custom security annotations (@WithAdminUser, @WithUser) together with TestAuthUtils.withCurrentUser() so the generated JWT matches our resource-server configuration.
- Assert on status code, payload shape, and repository side effects. A minimal test covers "happy path", "validation failure", and at least one access-control branch.
- Keep factories close to the domain: prefer createWorkspace("slug", "Display", ...) over manually seeding repositories, which keeps each test focused on behaviour instead of boilerplate.
Example skeleton:
@AutoConfigureWebTestClient
class FooControllerIntegrationTest extends AbstractWorkspaceIntegrationTest {
    @Autowired
    private WebTestClient webTestClient;

    @Test
    @WithAdminUser
    void endpointHappyPath() {
        User owner = persistUser("owner");
        Workspace workspace = createWorkspace("slug", "Display", "login", AccountType.ORG, owner);

        webTestClient
            .get()
            .uri("/workspaces/{workspaceSlug}", workspace.getWorkspaceSlug())
            .headers(TestAuthUtils.withCurrentUser())
            .exchange()
            .expectStatus()
            .isOk();
    }
}
Documenting each new controller integration test in this format keeps future work predictable and minimises copy/paste.
Webhook fixtures and tooling
Reusable webhook JSON lives in src/test/resources/github. Use @GitHubPayload("event.name") to inject them, or extract new samples with:
pnpm run nats:extract-examples
Available examples include label.created, repository.created, create, push, and more – consult the folder before recording new payloads.
Maven recipes
./mvnw test # Unit tests
./mvnw verify # Full suite + packaging
./mvnw test -Dtest=UserServiceTest # Single test class
./mvnw test -Dtest=UserServiceTest#methodName # Single test method
Live external-service tests
Some regressions can only be validated against real systems — the GitHub API, a real LLM endpoint, a Docker daemon. Live tests are tagged @Tag("live") and excluded from mvn test / mvn verify; they only run under mvn test -Plive-tests. The profile flips Surefire's group filter and adds failIfNoSpecifiedTests=false so single-test invocations don't fail when the gating env var is unset.
Meta-annotation cheat sheet
Use one of these composite annotations on the test class (or method) rather than wiring @Tag("live") + @EnabledIfEnvironmentVariable by hand. They keep gates uniform and put the env-var contract in one place.
| Annotation | Gate | Use for |
|---|---|---|
| @LiveLlmTest | HEPHAESTUS_LIVE_LLM_API_KEY | Tests that drive a real LLM. Defaults to the TUM AET ASE gateway running openai/gpt-oss-120b; override with HEPHAESTUS_LIVE_LLM_BASE_URL / HEPHAESTUS_LIVE_LLM_MODEL. Read via LiveLlmCredentials.fromEnv(). |
| @LiveGitHubTest | GH_APP_INSTALLATION_ID | Tests that hit github.com via the installed GitHub App. BaseGitHubLiveIntegrationTest runs an extra runtime check on private-key material. |
| @LiveGitLabTest | HEPHAESTUS_LIVE_GITLAB_TOKEN | Reserved for GitLab-side replay/live-sync tests. |
| @LiveDockerTest | none (runtime probe in @BeforeAll) | Tests that need a Docker daemon. Combine with @LiveLlmTest when the sandbox payload also needs LLM credentials. |
Each annotation pins JUnit to @Execution(SAME_THREAD) so two live tests in the same module never race on rate limits, log interleaving, or shared temp dirs.
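The gates in the table can be read as a two-step decision: the live profile must be active, and the gating environment variable (where one exists) must be set. The following is a minimal sketch of that decision in plain Java. LiveGate and isEnabled are hypothetical names, and treating a blank value as "unset" is an assumption; the real annotations rely on JUnit's tag filtering and environment-variable conditions, not on this code.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of the gates in the cheat sheet; env var names are taken from the table. */
enum LiveGate {
    LLM("HEPHAESTUS_LIVE_LLM_API_KEY"),
    GITHUB("GH_APP_INSTALLATION_ID"),
    GITLAB("HEPHAESTUS_LIVE_GITLAB_TOKEN"),
    DOCKER(null); // no env var: gated by a runtime probe instead

    private final String envVar;

    LiveGate(String envVar) {
        this.envVar = envVar;
    }

    /** The gating variable, or empty when the gate is a runtime probe. */
    Optional<String> envVar() {
        return Optional.ofNullable(envVar);
    }

    /** True when the live profile is active and the gating variable, if any, is non-blank. */
    boolean isEnabled(boolean liveProfileActive, Map<String, String> env) {
        if (!liveProfileActive) {
            return false; // without -Plive-tests nothing tagged "live" runs
        }
        if (envVar == null) {
            return true; // Docker: the final decision is left to the runtime probe
        }
        String value = env.get(envVar);
        return value != null && !value.isBlank(); // assumption: blank counts as unset
    }
}
```

Calling LiveGate.LLM.isEnabled(true, System.getenv()) would report whether an LLM live test could run in the current shell under these assumptions.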
Authoring a new live test
- Pick the right meta-annotation. If the env var doesn't exist yet, add a new annotation alongside the existing four; don't reuse one with a misleading name.
- Put the test under src/test/java/.../live/. The directory marker keeps live and non-live tests in different code-review buckets.
- Read credentials from LiveLlmCredentials.fromEnv() (or the GitHub helpers); never inline values. Secrets stay in env vars only.
- Set per-test wall-clock timeouts: 90 s is reasonable for a single mentor turn, 5 minutes for a full practice run.
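To illustrate the "read from env, never inline" rule, here is a rough sketch of what an environment-backed credentials holder might look like. LlmCredentialsSketch is a hypothetical stand-in and not the real LiveLlmCredentials class, whose signature is not documented here. The variable names and the default model come from the cheat sheet; the gateway's default base URL is not stated in this guide, so the sketch only models an optional override.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for an env-backed credentials holder; not the real LiveLlmCredentials. */
record LlmCredentialsSketch(String apiKey, Optional<String> baseUrlOverride, String model) {

    /** Default model named in the cheat sheet. */
    static final String DEFAULT_MODEL = "openai/gpt-oss-120b";

    /** Returns empty when the API key is missing, so callers can skip instead of failing. */
    static Optional<LlmCredentialsSketch> fromEnv(Map<String, String> env) {
        String key = env.get("HEPHAESTUS_LIVE_LLM_API_KEY");
        if (key == null || key.isBlank()) {
            return Optional.empty();
        }
        Optional<String> baseUrl = Optional.ofNullable(env.get("HEPHAESTUS_LIVE_LLM_BASE_URL"));
        String model = env.getOrDefault("HEPHAESTUS_LIVE_LLM_MODEL", DEFAULT_MODEL);
        return Optional.of(new LlmCredentialsSketch(key, baseUrl, model));
    }
}
```

Passing the environment in as a map (for example System.getenv()) keeps the lookup testable without touching real secrets.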
GitHub live API tests
We ship a focused suite that exercises the live GitHub App installation and verifies end-to-end sync behaviour (repository metadata, labels, milestones, and teams).
Prerequisites
- Sandbox installation – the Hephaestus IntegrationTests GitHub App must be installed in a sandbox organisation you control. The tests create and delete repositories, milestones, labels, and teams on each run.
- Credentials – provide both a GitHub App private key and a Personal Access Token with the following scopes:
    - repo (full)
    - admin:org
    - read:packages
- Local config file – copy the template that lives alongside the tests:
    cd server/application-server/src/test/resources
    cp application-live-local.example.yml application-live-local.yml
  Fill in the placeholders with the sandbox organisation slug, the installation id, and either an inline PEM key (github.app.privateKey) or a readable privateKeyLocation. Keep this file out of version control – it is already listed in .gitignore.
  Alternatively, export the matching environment variables:
    export GH_APP_ID=2250297
    export GH_APP_INSTALLATION_ID=93512943
    export GH_APP_PRIVATE_KEY="$(cat /path/to/private-key.pem)"
    export GH_APP_PAT=ghp_xxx... # PAT with the scopes above
    export GH_APP_ORGANIZATION=HephaestusTest
  Use either the config file or environment variables; the test suite checks both and aborts if key material is missing.
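Before starting a run it can help to see which of the exported variables are still missing. The sketch below shows one way such a pre-flight check could look in plain Java. GitHubLiveEnv is a hypothetical helper, it covers only the environment-variable route (not the config file), and it is not the check the base test class actually performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check for the environment-variable route described above. */
final class GitHubLiveEnv {
    /** Variable names as listed in the export example. */
    static final List<String> REQUIRED = List.of(
        "GH_APP_ID",
        "GH_APP_INSTALLATION_ID",
        "GH_APP_PRIVATE_KEY",
        "GH_APP_PAT",
        "GH_APP_ORGANIZATION");

    private GitHubLiveEnv() {}

    /** Returns the names of required variables that are absent or blank, in declaration order. */
    static List<String> missing(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

For example, GitHubLiveEnv.missing(System.getenv()) returns an empty list once all five variables are exported.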
Running the suite
From server/application-server/ run:
./mvnw test -Plive-tests
The -Plive-tests profile tells Maven to run only tests tagged with @Tag("live"). This is the single guard for live tests – if you don't pass the profile, the tests simply won't run.
Live tests never run during normal CI. They are explicitly excluded from mvn test and mvn verify via tag filtering in the Surefire and Failsafe plugins.
The run takes roughly two minutes and prints the GitHub artefacts it provisions. Clean-up is handled automatically, but if a failure interrupts execution you can safely delete any hephaestus-it-* repositories, milestones, or teams that remain in the sandbox.
Authoring new GitHub sync tests
- Extend AbstractGitHubLiveSyncIntegrationTest (or fall back to BaseGitHubLiveIntegrationTest when repositories are not needed). These bases set up credential checks, provide the workspaceRepository, and expose helpers such as createEphemeralRepository, registerRepositoryToMonitor, createEphemeralTeam, and seedOrganizationMembers.
- Use the supplied helpers to create and track temporary GitHub artefacts. They automatically register clean-up handlers via @AfterEach, so add new resources to the provided lists instead of implementing manual deletion logic.
- Keep tests deterministic: rely on databaseTestUtils.cleanDatabase() in @BeforeEach (already invoked by the base), and generate unique slugs via nextEphemeralSlug("suffix") when naming repositories, branches, or teams.
- If a scenario needs extra Spring configuration, extend application-live-local.yml in server/application-server/src/test/resources/. The checked-in .example file documents every property; copy it on demand and keep secrets out of version control.
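Unique slugs matter because leftover artefacts from an interrupted run must not collide with the next run. The sketch below shows one possible way to generate such names. EphemeralSlugs is hypothetical and is not the implementation behind nextEphemeralSlug; the hephaestus-it- prefix is taken from the clean-up note in this guide, while the run id and counter layout are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical slug generator; the real nextEphemeralSlug helper may work differently. */
final class EphemeralSlugs {
    /** Prefix of artefacts that the guide says are safe to delete after a failed run. */
    private static final String PREFIX = "hephaestus-it-";

    private final AtomicLong counter = new AtomicLong();
    private final String runId;

    /** runId distinguishes separate runs, e.g. a timestamp or short random token (assumption). */
    EphemeralSlugs(String runId) {
        this.runId = runId;
    }

    /** Returns a slug that is unique within this generator instance. */
    String next(String suffix) {
        return PREFIX + runId + "-" + counter.incrementAndGet() + "-" + suffix;
    }
}
```

With a run id of "run1", the first call to next("repo") yields hephaestus-it-run1-1-repo and the second yields hephaestus-it-run1-2-repo, so both stay easy to find and delete by prefix.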
Troubleshooting
- Skipping because of missing credentials – check the console output; the base test class verifies that the App id, private key, PAT, and installation id are all present before executing.
- GitHub API rate-limit failures – the suite creates several entities per run. Prefer a dedicated sandbox organisation so you do not clash with production automation limits.
- Longer runtimes – each suite bootstraps a Testcontainers PostgreSQL instance and provisions GitHub resources. Expect higher runtimes than the pure Testcontainers integration tests; avoid running them on every PR and instead use them before releases or when touching the GitHub sync layer.