LLM Testing Guidelines
Tests for LLM-related functionality should follow these guidelines to ensure consistency and reliability.
Test File Structure
- Place all LLM-related tests in apps/web/__tests__/:

  ```
  apps/web/__tests__/
  ├── your-feature.test.ts
  ├── another-feature.test.ts
  └── ...
  ```
- Basic test file template:

  ```typescript
  import { describe, expect, test, vi, beforeEach } from "vitest";
  import { yourFunction } from "@/utils/ai/your-feature";

  // Run with: pnpm test-ai TEST

  vi.mock("server-only", () => ({}));

  const TIMEOUT = 15_000;

  // Skip tests unless explicitly running AI tests
  const isAiTest = process.env.RUN_AI_TESTS === "true";

  describe.runIf(isAiTest)("yourFunction", () => {
    beforeEach(() => {
      vi.clearAllMocks();
    });

    test("test case description", async () => {
      // Test implementation
    });
  }, TIMEOUT);
  ```
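The template gates the whole suite on an environment variable. As a minimal sketch of that check, the snippet below isolates the comparison in a hypothetical `isAiTestEnabled` helper (the name and the `Env` type are illustrative and not part of the codebase) to show that only the exact string "true" enables the AI tests.

```typescript
// Minimal environment shape so the sketch does not depend on Node's type definitions.
type Env = Record<string, string | undefined>;

// Hypothetical helper mirroring the template's check:
// only the exact string "true" enables the suite.
function isAiTestEnabled(env: Env): boolean {
  return env["RUN_AI_TESTS"] === "true";
}

const enabled = isAiTestEnabled({ RUN_AI_TESTS: "true" }); // suite runs
const unset = isAiTestEnabled({}); // suite is skipped
const numericFlag = isAiTestEnabled({ RUN_AI_TESTS: "1" }); // suite is skipped
```

Because the comparison is strict, a value such as "1" leaves the suite skipped, so whatever runs the AI tests has to set the variable to exactly `true`.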
Helper Functions
- Always create helper functions for common test data:

  ```typescript
  function getUser() {
    return {
      email: "user@test.com",
      aiModel: null,
      aiProvider: null,
      aiApiKey: null,
      about: null,
    };
  }

  function getTestData(overrides = {}) {
    return {
      // Default test data
      ...overrides,
    };
  }
  ```
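One possible way to make the override pattern type-safe is to give the defaults an explicit type and accept `Partial` overrides. The `TestData` fields below are assumptions for illustration (only `requiresAi` appears elsewhere in this guide); substitute the real input shape of the function under test.

```typescript
// Hypothetical shape, for illustration only.
type TestData = {
  subject: string;
  content: string;
  requiresAi: boolean;
};

function getTestData(overrides: Partial<TestData> = {}): TestData {
  return {
    subject: "Weekly report",
    content: "Please find the report attached.",
    requiresAi: true,
    // Spread last so that any override replaces the matching default.
    ...overrides,
  };
}

const defaults = getTestData();
const emptyContent = getTestData({ content: "" });
```

Each test then states only the field it cares about, and a misspelled field name fails at compile time instead of silently being ignored.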
Test Cases
- Include these standard test cases:
  - Happy path with expected input
  - Error handling
  - Edge cases (empty input, null values)
  - Different user configurations
  - Various input formats
- Example test structure:

  ```typescript
  test("successfully processes valid input", async () => {
    const result = await yourFunction({
      input: getTestData(),
      user: getUser(),
    });

    // toMatchExpectedFormat is a placeholder, not a built-in Vitest matcher;
    // replace it with assertions on the actual result shape.
    expect(result).toMatchExpectedFormat();
  });

  test("handles errors gracefully", async () => {
    const result = await yourFunction({
      input: getTestData({ invalid: true }),
      user: getUser(),
    });

    expect(result.error).toBeDefined();
  });
  ```
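The remaining categories from the list (edge cases and different user configurations) can be written as a table of cases. This is a sketch under assumptions: the `TestCase` shape, the `content` field, and the typed `getUser` variant are hypothetical, and no expected results are encoded because they depend on the function under test.

```typescript
// Hypothetical user shape, based on the fields returned by getUser() in this guide.
type User = {
  email: string;
  aiModel: string | null;
  aiProvider: string | null;
  aiApiKey: string | null;
  about: string | null;
};

// Hypothetical case shape; `content` stands in for the real input.
type TestCase = {
  name: string;
  input: { content: string | null };
  user: User;
};

function getUser(overrides: Partial<User> = {}): User {
  return {
    email: "user@test.com",
    aiModel: null,
    aiProvider: null,
    aiApiKey: null,
    about: null,
    ...overrides,
  };
}

const cases: TestCase[] = [
  { name: "happy path", input: { content: "Can we meet on Friday?" }, user: getUser() },
  { name: "empty input", input: { content: "" }, user: getUser() },
  { name: "null value", input: { content: null }, user: getUser() },
  {
    name: "user with about text",
    input: { content: "Can we meet on Friday?" },
    user: getUser({ about: "I run a small design studio." }),
  },
];
```

In a Vitest file, a table like this could be passed to `test.each(cases)` so that each entry runs as its own test against the real LLM.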
Best Practices
- Set appropriate timeouts for LLM calls:

  ```typescript
  const TIMEOUT = 15_000;

  test("handles long-running LLM operations", async () => {
    // ...
  }, TIMEOUT);
  ```

- Use descriptive console.debug for generated content:

  ```typescript
  console.debug("Generated content:\n", result.content);
  ```

- Do not mock the LLM call. We want to call the actual LLM in these tests.
- Test both AI and non-AI paths:

  ```typescript
  test("returns unchanged when no AI processing needed", async () => {
    const input = getTestData({ requiresAi: false });
    const result = await yourFunction(input);
    expect(result).toEqual(input);
  });
  ```

- Use existing helpers from @/__tests__/helpers.ts:
  - getEmailAccount(overrides?) - Creates EmailAccountWithAI objects
  - getEmail(overrides?) - Creates EmailForLLM objects
  - getRule(instructions, actions?) - Creates rule objects
  - getMockMessage(options?) - Creates mock message objects
  - getMockExecutedRule(options?) - Creates executed rule objects
Always prefer using existing helpers over creating custom ones.
Running Tests
Run AI tests with:
```bash
pnpm test-ai your-feature
```