Spring AI
This guide explains how to set up a local LLM (Large Language Model) service for development purposes using LM Studio. This allows you to test Hyperion's AI-assisted features locally without requiring cloud API keys.
For production setups with cloud providers (OpenAI, Azure OpenAI), refer to the Hyperion Admin Setup.
Prerequisites
- A running Artemis instance with the `local` profile
- Sufficient disk space for downloading LLM models (typically 10-40 GB per model)
Setting up LM Studio
Install LM Studio
Install LM Studio using Homebrew:
brew install --cask lm-studio
Download and Configure a Model
- Open the LM Studio GUI application and choose Developer
- Download the `gpt-oss-20b` model (this will take a few moments depending on your internet connection)
- Make sure "Enable local LLM service on login" is active
The GUI application enables you to see logs and interact with the model in a chat similar to ChatGPT.
Configure LM Studio via CLI
LM Studio provides a CLI tool (`lms`) for managing models and the server. For more information, see the LM Studio CLI documentation.
1. Download a model:

   ```shell
   lms get openai/gpt-oss-20b
   ```

2. Load the model to activate it:

   ```shell
   lms load openai/gpt-oss-20b --context-length 32000
   ```

3. Start the LMS server:

   ```shell
   lms server start
   ```
The LLM service should now be running on http://localhost:1234.
Verify the Service Works
Open http://localhost:1234/api/v0/models and verify that `openai/gpt-oss-20b` appears in the list of models.
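The same check can be scripted from the terminal; a minimal sketch using curl against LM Studio's OpenAI-compatible REST API (assuming the server from the steps above is running on port 1234):

```shell
# List the models the local server exposes; the loaded model id
# should appear somewhere in the JSON response
curl -s http://localhost:1234/api/v0/models | grep -o 'openai/gpt-oss-20b' | head -n 1
```

If the service is up and the model is available, the command prints `openai/gpt-oss-20b`; an empty result usually means the server is not running or the model has not been downloaded.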
Configuring Artemis
Add the following configuration to your `application-local.yml` file (or adapt your existing configuration accordingly):

```yaml
artemis:
  hyperion:
    enabled: true

spring:
  ai:
    model:
      chat: openai
    openai:
      base-url: http://localhost:1234
      api-key: dummy-key # not required for local LLM service
      chat:
        completions-path: /v1/chat/completions
        options:
          model: openai/gpt-oss-20b # or whatever name your local server expects
          temperature: 0.7
```
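Before starting Artemis, you can smoke-test the endpoint the configuration above points at (`base-url` plus `completions-path`); a sketch, assuming `openai/gpt-oss-20b` is loaded and the server is running on port 1234:

```shell
# Send a minimal chat completion request to the local server,
# mirroring what the Spring AI OpenAI client will do
curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openai/gpt-oss-20b",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "temperature": 0.7
      }'
```

A JSON response containing a `choices` array indicates the endpoint is reachable and the model can generate output.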
Testing the Integration
- Start the Artemis server (make sure the Artemis Server run configuration includes the `local` profile)
- Start the Artemis client
- Create a programming exercise and open it via the Edit in Editor button
- Click the "Check Consistency" button
- The Artemis Server logs should include something along the lines of:

  ```
  SpringAIConfiguration : Found Chat Model: openai/gpt-oss-20b
  ```

- View the LMS Server logs
  - via the CLI:

    ```shell
    lms log stream
    ```

  - or by opening the LM Studio GUI and activating the Developer view
- Verify that the logs
  - include the prompt sent from Artemis
  - show progress status
  - show the generated prediction
Performance Considerations
- Model Size: Larger models provide better results but require more RAM and processing power. The `gpt-oss-20b` model is a good balance for development.
- Context Length: The `--context-length 32000` parameter controls how much text the model can process at once. Adjust it based on your hardware capabilities.
- GPU Acceleration: LM Studio automatically uses GPU acceleration when available (Metal on macOS, CUDA on Linux/Windows).
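To check which models are actually loaded and how they are configured on your machine, the `lms` CLI can list them; a quick sketch (assuming LM Studio is installed as described above):

```shell
# Show models currently loaded into memory, with details such as size
lms ps
# Show all models downloaded to disk
lms ls
```

Comparing the output of `lms ps` against your hardware limits is a simple way to decide whether a larger model or a longer context length is feasible.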
🛠 Troubleshooting
Exit Code 6 When Loading a Model
If you encounter the following error:
```
🥲 Failed to load the model
Error loading model. (Exit code: 6)
```
This usually means the model cannot load because the required runtime engine is missing or broken.
✅ Fix: Repair the MLX Runtime (macOS Apple Silicon)
1. Open LM Studio
2. In the left menu, click LM Runtimes
3. Under Runtime Extension Packs, find LM Studio MLX (the engine used for MLX / Apple Silicon models)
4. If LM Studio detects an issue, a Fix button will appear
5. Click Fix to reinstall, update, or repair the engine
After the repair completes, try loading your model again.