Spring AI
This guide explains how to set up a local LLM (Large Language Model) service for development purposes using LM Studio. This allows you to test Hyperion's AI-assisted features locally without requiring cloud API keys.
For production setups with cloud providers (OpenAI, Azure OpenAI), refer to the Hyperion Admin Setup.
Prerequisites
- A running Artemis instance with the `local` profile
- Sufficient disk space for downloading LLM models (typically 10-40 GB per model)
Setting up LM Studio
Install LM Studio
Install LM Studio using Homebrew:
brew install --cask lm-studio
Download and Configure a Model
- Open the LM Studio GUI application and choose Developer
- Download the `gpt-oss-20b` model (this will take a few moments depending on your internet connection)
- Make sure "Enable local LLM service on login" is active
The GUI application enables you to see logs and interact with the model in a chat similar to ChatGPT.
Configure LM Studio via CLI
LM Studio provides a CLI tool (`lms`) for managing models and the server. For more information, see the LM Studio CLI documentation.
1. Download a model:

   ```shell
   lms get openai/gpt-oss-20b
   ```

2. Load the model to activate it:

   ```shell
   lms load openai/gpt-oss-20b --context-length 32000
   ```

3. Start the LMS server:

   ```shell
   lms server start
   ```
The LLM service should now be running on http://localhost:1234.
Verify the Service Works
Open http://localhost:1234/api/v0/models and verify that `openai/gpt-oss-20b` appears in the list of models.
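The same check can be scripted from the terminal; a minimal sketch using curl against LM Studio's OpenAI-compatible REST API (assuming the server from the steps above is running on port 1234):

```shell
# List the models the local server exposes; the loaded model id
# should appear somewhere in the JSON response
curl -s http://localhost:1234/api/v0/models | grep -o 'openai/gpt-oss-20b' | head -n 1
```

If the service is up and the model is available, the command prints `openai/gpt-oss-20b`; an empty result usually means the server is not running or the model has not been downloaded.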
Configuring Artemis
Add the following configuration to your `application-local.yml` file (or adapt your existing configuration accordingly):

```yaml
artemis:
  hyperion:
    enabled: true

spring:
  ai:
    model:
      chat: openai
    openai:
      base-url: http://localhost:1234
      api-key: dummy-key # not required for local LLM service
      chat:
        completions-path: /v1/chat/completions
        options:
          model: openai/gpt-oss-20b # or whatever name your local server expects
          temperature: 0.7
```
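Before starting Artemis, you can smoke-test the endpoint the configuration above points at (`base-url` plus `completions-path`); a sketch, assuming `openai/gpt-oss-20b` is loaded and the server is running on port 1234:

```shell
# Send a minimal chat completion request to the local server,
# mirroring what the Spring AI OpenAI client will do
curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openai/gpt-oss-20b",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "temperature": 0.7
      }'
```

A JSON response containing a `choices` array indicates the endpoint is reachable and the model can generate output.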
Testing the Integration
- Start the Artemis server (make sure the Artemis Server run configuration includes the `local` profile)
- Start the Artemis client
- Create a programming exercise and open it via the Edit in Editor button
- Click the "Check Consistency" button
- The Artemis Server logs should include something along the lines of:

  ```
  SpringAIConfiguration : Found Chat Model: openai/gpt-oss-20b
  ```

- View the LMS Server logs
  - via the CLI:

    ```shell
    lms log stream
    ```

  - or by opening the LM Studio GUI and activating the Developer view
- Verify that the logs
  - include the prompt sent from Artemis
  - show progress status
  - show the generated prediction
Performance Considerations
- Model Size: Larger models provide better results but require more RAM and processing power. The `gpt-oss-20b` model is a good balance for development.
- Context Length: The `--context-length 32000` parameter controls how much text the model can process at once. Adjust it based on your hardware capabilities.
- GPU Acceleration: LM Studio automatically uses GPU acceleration when available (Metal on macOS, CUDA on Linux/Windows).
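To check which models are actually loaded and how they are configured on your machine, the `lms` CLI can list them; a quick sketch (assuming LM Studio is installed as described above):

```shell
# Show models currently loaded into memory, with details such as size
lms ps
# Show all models downloaded to disk
lms ls
```

Comparing the output of `lms ps` against your hardware limits is a simple way to decide whether a larger model or a longer context length is feasible.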
🛠 Troubleshooting
Exit Code 6 When Loading a Model
If you encounter the following error:
```
🥲 Failed to load the model
Error loading model. (Exit code: 6)
```
This usually means the model cannot load because the required runtime engine is missing or broken.
✅ Fix: Repair the MLX Runtime (macOS Apple Silicon)
1. Open LM Studio
2. In the left menu, click LM Runtimes
3. Under Runtime Extension Packs, find LM Studio MLX (the engine used for MLX / Apple Silicon models)
4. If LM Studio detects an issue, a Fix button will appear
5. Click Fix to reinstall, update, or repair the engine
After the repair completes, try loading your model again.