5-Stage Verification Pipeline

•Exact match - Case-insensitive, normalized (paris = Paris)
•Numeric match - Float comparison (42 = 42.0)
•Fuzzy match - RapidFuzz 85% threshold (handles typos: Pari → Paris)
•Semantic match - sentence-transformers 0.8 threshold (synonyms: car → automobile)
•Lemmatized match - spaCy (tense: running → run)

The pipeline stops at the first stage that reports a match (early-exit optimization), so later stages are skipped once an answer is accepted.
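The early-exit behaviour can be sketched as below. This is a minimal illustration, not the project's actual code: the function names and the `STAGES` list are assumptions, only the first three stages are shown, and the fuzzy stage uses the standard library's `difflib` as a stand-in for RapidFuzz so the snippet runs without extra dependencies.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def _normalize(text: str) -> str:
    # Case-fold and collapse whitespace so "  Paris " and "paris" compare equal.
    return " ".join(text.casefold().split())


def exact_match(submitted: str, correct: str) -> bool:
    return _normalize(submitted) == _normalize(correct)


def numeric_match(submitted: str, correct: str) -> bool:
    try:
        return float(submitted) == float(correct)
    except ValueError:
        return False  # at least one side is not a number


def fuzzy_match(submitted: str, correct: str, threshold: float = 0.85) -> bool:
    # Stand-in for RapidFuzz: difflib's ratio is on a 0-1 scale, so 0.85 ~ 85%.
    ratio = SequenceMatcher(None, _normalize(submitted), _normalize(correct)).ratio()
    return ratio >= threshold


# Hypothetical stage list; the semantic and lemmatized stages would be appended here.
STAGES: List[Callable[[str, str], bool]] = [exact_match, numeric_match, fuzzy_match]


def is_correct(submitted: str, correct: str) -> bool:
    # any() short-circuits on a generator, so stages after the first match never run.
    return any(stage(submitted, correct) for stage in STAGES)
```

Keeping the stages in a plain list makes the order explicit and lets a stage be added or removed without touching `is_correct`.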

Model Loading

Load the models once at startup, in the lifespan handler in main.py:

python

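answer_service = AnswerService()
# load_models blocks for ~5-10s, so run it in a worker thread to keep the
# event loop responsive (requires `import asyncio`; see Models below for memory use).
await asyncio.to_thread(answer_service.load_models)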
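For context, here is a self-contained sketch of how such a startup hook might be structured. It assumes the lifespan handler is an async context manager, as is common in ASGI frameworks; the `state` dictionary and the stub `AnswerService` below are hypothetical and only simulate the slow load.

```python
import asyncio
import time
from contextlib import asynccontextmanager


class AnswerService:
    """Hypothetical stand-in; the real class loads the NLP models."""

    def __init__(self) -> None:
        self.models_loaded = False

    def load_models(self) -> None:
        time.sleep(0.05)  # simulates slow, blocking model loading
        self.models_loaded = True


@asynccontextmanager
async def lifespan(state: dict):
    service = AnswerService()
    # The blocking load runs in a worker thread, not on the event loop.
    await asyncio.to_thread(service.load_models)
    state["answer_service"] = service
    try:
        yield  # the application serves requests while suspended here
    finally:
        state.pop("answer_service", None)  # release the service on shutdown


async def main() -> bool:
    state: dict = {}
    async with lifespan(state):
        return state["answer_service"].models_loaded
```

Loading inside the lifespan means every request handler sees a fully initialised service, at the cost of a slower startup.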

Models:

•sentence-transformers: all-MiniLM-L6-v2 (~80MB on disk, ~200MB RAM)
•spaCy: en_core_web_md (43MB on disk, ~300MB RAM)

CPU Mode (Default): The backend runs without a GPU by setting CUDA_VISIBLE_DEVICES="". This is the standard configuration for both development and production (~850MB total RAM). jDuel's answer verification workload does not need GPU acceleration.
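If the variable cannot be set in the shell, it can also be set from Python, provided this happens before any CUDA-aware library is imported. The helper name `force_cpu_mode` is an assumption made for illustration.

```python
import os


def force_cpu_mode() -> None:
    # An empty value hides every CUDA device from libraries that honor the variable.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


force_cpu_mode()
# Import torch / sentence-transformers only after this point; setting the
# variable after CUDA has been initialised has no effect.
```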

Testing Pattern

Unit Tests: Use MockAnswerService to avoid loading models:

python

class MockAnswerService:
    def is_correct(self, submitted: str, correct: str) -> bool:
        return submitted.lower().strip() == correct.lower().strip()
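A sketch of how the mock might be injected in a unit test follows. `score_submission` is a hypothetical function under test, not part of the codebase; the point is that code depending on the service accepts it as an argument, so no model is loaded.

```python
class MockAnswerService:
    def is_correct(self, submitted: str, correct: str) -> bool:
        return submitted.lower().strip() == correct.lower().strip()


def score_submission(service, submitted: str, correct: str, points: int = 100) -> int:
    # Hypothetical function under test: awards points for a correct answer.
    return points if service.is_correct(submitted, correct) else 0


def test_correct_answer_scores_points() -> None:
    assert score_submission(MockAnswerService(), "  PARIS ", "Paris") == 100


def test_wrong_answer_scores_zero() -> None:
    assert score_submission(MockAnswerService(), "London", "Paris") == 0
```

Because the mock only does exact matching, tests built on it should not rely on fuzzy or semantic acceptance.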

Integration Tests: Use real AnswerService with CUDA_VISIBLE_DEVICES="":

bash

cd backend/src
CUDA_VISIBLE_DEVICES="" uv run pytest ../tests/integration/

With the GPU disabled, the backend still starts quickly (~5-10s), including model loading.

Multiple Choice Mode

For multiple-choice (MC) questions, bypass the NLP pipeline and use an exact match only (whitespace-trimmed, case-sensitive):

python

if question.is_multiple_choice:
    is_correct = submitted.strip() == question.answer.strip()
else:
    is_correct = answer_service.is_correct(submitted, question.answer)
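The branch above can be wrapped in a small helper for testing. This is a sketch under assumptions: the `Question` dataclass, `StubAnswerService`, and `check_answer` are hypothetical names, and the stub stands in for the real NLP-backed service.

```python
from dataclasses import dataclass


@dataclass
class Question:
    answer: str
    is_multiple_choice: bool = False


class StubAnswerService:
    """Hypothetical stand-in for the NLP-backed service."""

    def is_correct(self, submitted: str, correct: str) -> bool:
        return submitted.casefold().strip() == correct.casefold().strip()


def check_answer(question: Question, submitted: str, answer_service) -> bool:
    if question.is_multiple_choice:
        # Options are fixed strings, so only trim whitespace and compare exactly.
        return submitted.strip() == question.answer.strip()
    return answer_service.is_correct(submitted, question.answer)
```

Skipping the pipeline for multiple choice avoids a fuzzy or semantic match accepting a neighbouring option that merely resembles the correct one.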