
Spring AI

This guide explains how to set up a local LLM (Large Language Model) service for development purposes using LM Studio. This allows you to test Hyperion's AI-assisted features locally without requiring cloud API keys.

ℹ️

For production setups with cloud providers (OpenAI, Azure OpenAI), refer to the Hyperion Admin Setup.

Prerequisites

  • A running Artemis instance with the local profile
  • Sufficient disk space for downloading LLM models (typically 10-40GB per model)

Setting up LM Studio

Install LM Studio

Install LM Studio using Homebrew:

brew install --cask lm-studio
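To confirm the installation before moving on, you can list the cask and launch the app from the terminal. This is a minimal sketch; the application name "LM Studio" is an assumption based on a default install.

# Confirm the cask is installed
brew list --cask lm-studio
# Launch the GUI (application name assumed)
open -a "LM Studio"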

Download and Configure a Model


  1. Open the LM Studio GUI application and choose Developer
  2. Download the gpt-oss-20b model (the download is large and may take a while depending on your internet connection)
  3. Make sure "Enable local LLM service on login" is active

The GUI application lets you view logs and chat with the model in an interface similar to ChatGPT.

Configure LM Studio via CLI

LM Studio provides a CLI tool (lms) for managing models and the server. For more information, see the LM Studio CLI documentation.

  1. Download a model

    lms get openai/gpt-oss-20b
  2. Load the model to activate it

    lms load openai/gpt-oss-20b --context-length 32000
  3. Start the LMS server

    lms server start
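Before pointing Artemis at the server, you can confirm the model is actually loaded. This is a quick sketch using the lms CLI; check the LM Studio CLI documentation if your version names the command differently.

# List the models currently loaded into memory
lms ps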

The LLM service should now be running on http://localhost:1234.

Verify the Service Works

Verify that http://localhost:1234/api/v0/models lists the available models.

The model openai/gpt-oss-20b should appear in the list.
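For a scriptable check, you can query the same endpoint with curl. The jq filter below assumes the response follows the OpenAI-style list shape with a data array of model objects.

# List the models exposed by the local server
curl http://localhost:1234/api/v0/models
# With jq installed, extract just the model identifiers
curl -s http://localhost:1234/api/v0/models | jq '.data[].id'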

Configuring Artemis

Add the following configuration to your application-local.yml file (or adapt your existing configuration accordingly):

artemis:
  hyperion:
    enabled: true

spring:
  ai:
    model:
      chat: openai
    openai:
      base-url: http://localhost:1234
      api-key: dummy-key # not required for local LLM service
      chat:
        completions-path: /v1/chat/completions
        options:
          model: openai/gpt-oss-20b # or whatever name your local server expects
          temperature: 0.7
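Before starting Artemis, you can exercise the endpoint this configuration targets (base-url plus completions-path) with a one-off request. This is a minimal sketch assuming the local server accepts OpenAI-style chat completion payloads.

# Send a single test message to the configured completions path
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "messages": [{"role": "user", "content": "Reply with one short sentence."}]
  }'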

Testing the Integration

  1. Start the Artemis server (Make sure the Artemis Server Run Configuration includes the local profile)
  2. Start the Artemis client
  3. Create a programming exercise and open it via the Edit in Editor button
  4. Click the "Check Consistency" button
  5. The Artemis Server logs should include something along the lines of:
    SpringAIConfiguration      : Found Chat Model: openai/gpt-oss-20b
  6. View the LMS server logs
    1. Via the CLI
      lms log stream
    2. By opening the LM Studio GUI and activating the Developer view
  7. Verify that the logs
    1. include the prompt sent from Artemis
    2. show progress status
    3. show the generated prediction
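If you prefer checking from a terminal, you can also search the server output for the confirmation line from step 5. The log file path below is a placeholder assumption; substitute wherever your Artemis server writes its logs.

# Confirm the chat model was picked up (log path is a placeholder)
grep "Found Chat Model" path/to/artemis-server.log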

Performance Considerations

  • Model Size: Larger models produce better results but require more RAM and processing power. The gpt-oss-20b model offers a good balance for development.
  • Context Length: The --context-length 32000 parameter controls how much text the model can process at once. Adjust it based on your hardware capabilities (see the example after this list).
  • GPU Acceleration: LM Studio automatically uses GPU acceleration when available (Metal on macOS, CUDA on Linux/Windows).
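For example, on a machine with limited RAM you might reload the model with a smaller context window; the value below is illustrative, not a recommendation.

# Reload the model with a reduced context window (value is illustrative)
lms load openai/gpt-oss-20b --context-length 8192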

🛠 Troubleshooting

Exit Code 6 When Loading a Model

If you encounter the following error:

🥲 Failed to load the model
Error loading model. (Exit code: 6)

This usually means the model cannot load because the required runtime engine is missing or broken.

✅ Fix: Repair the MLX Runtime (macOS Apple Silicon)

  1. Open LM Studio

  2. In the left menu, click LM Runtimes

  3. Under Runtime Extension Packs, find:

    LM Studio MLX

    (Engine used for MLX / Apple Silicon models)

  4. If LM Studio detects an issue, a Fix button will appear

  5. Click Fix to reinstall, update, or repair the engine

After the repair completes, try loading your model again.