Mock vs Real Testing

Athena’s testing framework employs a dual approach: Mock Tests for fast, isolated unit testing, and Real Tests for comprehensive integration testing. This section explains the differences, use cases, and implementation details of each approach.

Mock Testing

Purpose and Benefits

Mock tests provide fast, reliable, and isolated testing by replacing external dependencies with controlled mock objects. They are ideal for:

  • Unit Testing: Testing individual functions and methods in isolation

  • Fast Feedback: Quick test execution during development

  • Deterministic Results: Consistent outcomes regardless of external factors

  • CI/CD Integration: Reliable automated testing without external dependencies

Mock Test Structure

Mock tests are located in mock/ directories and typically include:

mock/
├── conftest.py              # Mock fixtures and configuration
├── test_*.py               # Mock test files
└── utils/                  # Mock utilities
    ├── mock_config.py      # Mock configuration objects
    ├── mock_llm.py         # Mock LLM implementations
    ├── mock_openai.py      # Mock OpenAI API responses
    └── mock_env.py         # Mock environment variables

Key Mock Components

Mock LLM Responses

class MockLanguageModel:
    def __init__(self):
        self.responses = {
            "feedback_suggestion": "This is a mock feedback response.",
            "grading_analysis": "Mock grading analysis result."
        }

    async def ainvoke(self, prompt):
        # Return a predetermined response based on the prompt content
        for key, response in self.responses.items():
            if key in prompt:
                return response
        return "Default mock response"
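
A quick usage sketch (the prompt string is illustrative): because ainvoke matches response keys against the prompt, a prompt mentioning "feedback_suggestion" yields the canned feedback response.

import asyncio

async def demo():
    llm = MockLanguageModel()
    # The prompt contains the key "feedback_suggestion", so that canned response is returned
    return await llm.ainvoke("Generate a feedback_suggestion for this submission.")

assert asyncio.run(demo()) == "This is a mock feedback response."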

Mock Configuration Objects

class MockApproachConfig:
    def __init__(self):
        self.max_input_tokens = 5000
        self.model = MockModelConfig()
        self.type = "default"
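
MockApproachConfig references a MockModelConfig, whose exact shape depends on the module under test. A minimal sketch, mirroring the model_name and get_model fields of the real AzureModelConfig shown later in this section:

class MockModelConfig:
    def __init__(self):
        self.model_name = "mock_model"
        # Mirrors the get_model hook of the real model config; returns the mock LLM
        self.get_model = lambda: MockLanguageModel()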

Mock Environment Variables

import pytest

@pytest.fixture(autouse=True)
def mock_env_vars(monkeypatch):
    monkeypatch.setenv("MOCK_MODE", "true")
    monkeypatch.setenv("API_KEY", "mock_api_key")

Mock Test Example

import pytest
from module_text_llm.default_approach import generate_suggestions

@pytest.mark.asyncio
async def test_feedback_generation_mock(mock_config, mock_llm):
    """Test feedback generation with mocked LLM responses."""

    # Arrange
    exercise = create_mock_exercise()
    submission = create_mock_submission()

    # Act
    feedbacks = await generate_suggestions(
        exercise=exercise,
        submission=submission,
        config=mock_config,
        debug=False,
        is_graded=True,
        learner_profile=None
    )

    # Assert
    assert len(feedbacks) > 0
    assert all(f.title for f in feedbacks)
    assert all(f.description for f in feedbacks)
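
The mock_config and mock_llm fixtures above are injected by pytest. A sketch of how mock/conftest.py might wire them up, assuming the utility classes shown earlier:

import pytest
from tests.modules.text.module_text_llm.mock.utils.mock_config import MockApproachConfig
from tests.modules.text.module_text_llm.mock.utils.mock_llm import MockLanguageModel

@pytest.fixture
def mock_config():
    return MockApproachConfig()

@pytest.fixture
def mock_llm():
    return MockLanguageModel()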

Real Testing

Purpose and Benefits

Real tests provide comprehensive integration testing by using actual APIs and services. They are essential for:

  • Integration Testing: Testing complete workflows with real dependencies

  • API Validation: Ensuring compatibility with external services

  • Performance Testing: Measuring actual response times and resource usage

  • End-to-End Validation: Verifying complete system functionality

Real Test Structure

Real tests are located in real/ directories and include:

real/
├── conftest.py              # Real test fixtures and configuration
├── test_*.py               # Real test files
├── data/                   # Real test data
│   └── exercises/          # Exercise JSON files
│       ├── exercise-6715.json
│       ├── exercise-6787.json
│       └── ...
└── test_data/              # Additional test data (modeling module)
    ├── ecommerce_data.py
    └── hospital_data.py

Real Test Configuration

Azure OpenAI Configuration

@pytest.fixture
def real_config():
    """Create a real configuration for testing with Azure OpenAI."""
    return DefaultApproachConfig(
        max_input_tokens=5000,
        model=AzureModelConfig(
            model_name="azure_openai_gpt-4o",
            get_model=lambda: None,  # Set by the module
        ),
        type="default",
    )

Environment Setup

import nltk
import pytest

@pytest.fixture(scope="session", autouse=True)
def setup_environment():
    """Setup environment for real tests."""
    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)

Real Test Example

import pytest
from module_text_llm.default_approach import generate_suggestions

# The real_config and playground_loader fixtures are discovered automatically
# from real/conftest.py and do not need to be imported.

@pytest.mark.asyncio
async def test_feedback_generation_real(real_config, playground_loader):
    """Test feedback generation with real LLM API calls."""

    # Load real exercise data
    exercise_data = playground_loader.load_exercise(4)
    exercise = playground_loader.convert_to_athena_exercise(exercise_data)

    # Create test submission
    submission_data = {"id": 401, "text": "MVC test"}
    submission = playground_loader.convert_to_athena_submission(submission_data, exercise.id)

    # Generate feedback with real API
    feedbacks = await generate_suggestions(
        exercise=exercise,
        submission=submission,
        config=real_config,
        debug=False,
        is_graded=True,
        learner_profile=None
    )

    # Validate real API responses
    assert len(feedbacks) > 0
    assert all(f.title for f in feedbacks)
    assert all(f.description for f in feedbacks)
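
The playground_loader fixture abstracts access to the JSON files under real/data/exercises/. Its actual implementation lives in real/conftest.py; a simplified sketch of the loading half (the convert_to_athena_* methods, which build Athena schema objects, are omitted, and the index-based lookup is an assumption):

import json
from pathlib import Path

import pytest

class PlaygroundLoader:
    """Simplified sketch; the real loader also converts data to Athena schema objects."""
    data_dir = Path(__file__).parent / "data" / "exercises"

    def load_exercise(self, index: int) -> dict:
        # Assumption: exercises are addressed by position among the sorted JSON files
        exercise_files = sorted(self.data_dir.glob("exercise-*.json"))
        return json.loads(exercise_files[index].read_text())

@pytest.fixture
def playground_loader():
    return PlaygroundLoader()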

Test Data Management

Mock Test Data

Mock tests use programmatically generated data:

  • In-Memory Objects: Created within test functions (see the factory sketch after this list)

  • Mock Fixtures: Reusable mock objects defined in conftest.py

  • Deterministic Responses: Predictable mock LLM responses

  • No External Files: All data generated at runtime
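
For example, the create_mock_exercise helper used in the mock test above could be a simple in-memory factory. The dataclass below is a hypothetical stand-in; Athena's actual Exercise schema has more fields:

from dataclasses import dataclass

@dataclass
class MockExercise:
    # Hypothetical stand-in for Athena's Exercise schema
    id: int = 1
    title: str = "Mock Exercise"
    problem_statement: str = "Explain the Singleton design pattern."
    max_points: float = 10.0

def create_mock_exercise() -> MockExercise:
    return MockExercise()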

Real Test Data

Real tests use persistent JSON data files:

  • Exercise Files: Complete exercise definitions with submissions

  • Historical Data: Real student submissions and feedback

  • Multiple Scenarios: Various difficulty levels and submission types

  • Version Control: Data files tracked in git for consistency

Data File Structure

Real test data follows this JSON structure:

{
    "id": 6715,
    "course_id": 101,
    "title": "Software Design Patterns",
    "type": "text",
    "max_points": 10,
    "bonus_points": 0,
    "problem_statement": "Explain the following design patterns...",
    "grading_instructions": "Full points for correct identification...",
    "example_solution": "Singleton pattern ensures...",
    "meta": {},
    "submissions": [
        {
            "id": 201,
            "text": "Student submission text...",
            "meta": {},
            "feedbacks": [
                {
                    "id": 301,
                    "title": "Pattern Identification",
                    "description": "Good identification of Singleton pattern",
                    "credits": 2.0,
                    "meta": {}
                }
            ]
        }
    ]
}
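
Loading one of these files is straightforward; a minimal sketch, assuming the real/data/exercises/ layout shown earlier:

import json
from pathlib import Path

def load_exercise_file(exercise_id: int) -> dict:
    path = Path("real/data/exercises") / f"exercise-{exercise_id}.json"
    with path.open() as f:
        return json.load(f)

exercise = load_exercise_file(6715)
assert exercise["id"] == 6715
print(f"{exercise['title']}: {len(exercise['submissions'])} submission(s)")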

When to Use Each Approach

Use Mock Tests When:

  • Testing individual functions or methods in isolation

  • Verifying code behavior without external dependencies

Use Real Tests When:

  • Validating complete end-to-end workflows

  • Testing integrations with external APIs