LLM Testing Guidelines

Tests for LLM-related functionality should follow these guidelines to ensure consistency and reliability.

Test File Structure

• Place all LLM-related tests in apps/web/__tests__/:

code

apps/web/__tests__/
├── your-feature.test.ts
├── another-feature.test.ts
└── ...

• Basic test file template:

typescript

import { describe, expect, test, vi, beforeEach } from "vitest";
import { yourFunction } from "@/utils/ai/your-feature";

// Run with: pnpm test-ai TEST

vi.mock("server-only", () => ({}));

const TIMEOUT = 15_000;

// Skip tests unless explicitly running AI tests
const isAiTest = process.env.RUN_AI_TESTS === "true";

describe.runIf(isAiTest)("yourFunction", () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });

  test("test case description", async () => {
    // Test implementation
  });
}, TIMEOUT);
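
The gate in the template is a strict string comparison, so only the exact value "true" enables the suite. A minimal sketch of that check as a standalone function; the name `isAiTestEnabled` and the `Env` type are hypothetical and not part of the codebase:

```typescript
// Hypothetical helper mirroring the template's `process.env.RUN_AI_TESTS === "true"` check.
type Env = Record<string, string | undefined>;

function isAiTestEnabled(env: Env): boolean {
  // Strict comparison: "1", "TRUE", or an unset variable all leave the AI tests skipped.
  return env.RUN_AI_TESTS === "true";
}

const enabled = isAiTestEnabled({ RUN_AI_TESTS: "true" });
const skippedWhenUnset = isAiTestEnabled({});
const skippedWhenUppercase = isAiTestEnabled({ RUN_AI_TESTS: "TRUE" });
```

In a test file the same boolean would be passed to `describe.runIf`, as in the template above.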

Helper Functions

• Create helper functions for common test data when no existing helper in @/__tests__/helpers.ts covers it:

typescript

function getUser() {
  return {
    email: "user@test.com",
    aiModel: null,
    aiProvider: null,
    aiApiKey: null,
    about: null,
  };
}

function getTestData(overrides = {}) {
  return {
    // Default test data
    ...overrides,
  };
}
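
The overrides pattern above can also be typed, so a misspelled field in a test fails at compile time. This is a sketch only: the `TestData` shape and its default values are hypothetical and should be replaced with your feature's real input type.

```typescript
// Hypothetical input shape, for illustration only; replace with your feature's real type.
interface TestData {
  subject: string;
  content: string;
  requiresAi: boolean;
}

// Defaults describe a valid happy-path input; each test overrides only the fields it cares about.
function getTestData(overrides: Partial<TestData> = {}): TestData {
  return {
    subject: "Meeting tomorrow",
    content: "Can we move our meeting to 3pm?",
    requiresAi: true,
    ...overrides,
  };
}

const defaults = getTestData();
const emptyContent = getTestData({ content: "" });
```

Because the spread comes last, any field passed in `overrides` replaces the default while the remaining fields keep their default values.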

Test Cases

• Include these standard test cases:
- Happy path with expected input
- Error handling
- Edge cases (empty input, null values)
- Different user configurations
- Various input formats

• Example test structure:

typescript

test("successfully processes valid input", async () => {
  const result = await yourFunction({
    input: getTestData(),
    user: getUser(),
  });
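  // toMatchExpectedFormat is a placeholder, not a built-in Vitest matcher;
  // replace it with assertions on the actual result shape.
  expect(result).toMatchExpectedFormat();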
});

test("handles errors gracefully", async () => {
  const result = await yourFunction({
    input: getTestData({ invalid: true }),
    user: getUser(),
  });
  expect(result.error).toBeDefined();
});
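
One way to keep the standard cases from this section in view is to list them as a table of named inputs. The sketch below is illustrative only: the `TestInput` shape, the case names, and the assumption that empty or null content should produce an error are all hypothetical.

```typescript
// Hypothetical input shape and case names, for illustration only.
interface TestInput {
  content: string | null;
}

interface TestCase {
  name: string;
  input: TestInput;
  // Assumption: whether the function under test is expected to report an error.
  expectError: boolean;
}

const standardCases: TestCase[] = [
  { name: "happy path", input: { content: "Please review the attached invoice." }, expectError: false },
  { name: "empty input", input: { content: "" }, expectError: true },
  { name: "null value", input: { content: null }, expectError: true },
];

// Names double as test titles, so they should be unique.
const uniqueNames = new Set(standardCases.map((c) => c.name));
```

A table like this could be passed to Vitest's `test.each` so that each case runs and reports as its own test.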

Best Practices

• Set appropriate timeouts for LLM calls:

typescript

const TIMEOUT = 15_000;
test("handles long-running LLM operations", async () => {
  // ...
}, TIMEOUT);

• Use descriptive console.debug for generated content:

```typescript
console.debug("Generated content:\n", result.content);
```

• Do not mock the LLM call. We want to call the actual LLM in these tests.

• Test both AI and non-AI paths:

typescript

test("returns unchanged when no AI processing needed", async () => {
  const input = getTestData({ requiresAi: false });
  // Same call shape as the other examples: input plus user
  const result = await yourFunction({ input, user: getUser() });
  expect(result).toEqual(input);
});
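
For context, the non-AI path that this test exercises is typically an early return before any model call. A minimal sketch of such a function follows. The names `processInput`, `Input`, and `LlmCall` are hypothetical, and the model call is passed in as a parameter only to keep the sketch self-contained, not as a suggestion to mock it in tests.

```typescript
// Hypothetical input and function names; the real function under test will differ.
interface Input {
  content: string;
  requiresAi: boolean;
}

// The model call is a parameter here only so the sketch has no external dependencies.
type LlmCall = (content: string) => Promise<string>;

async function processInput(input: Input, callLlm: LlmCall): Promise<Input> {
  // Non-AI path: return the same object untouched, without reaching the model.
  if (!input.requiresAi) return input;

  // AI path: replace the content with the model's output.
  const content = await callLlm(input.content);
  return { ...input, content };
}
```

With this shape, the non-AI test only needs to check that the result equals the input, while the AI-path tests still call the real model.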

• Use existing helpers from @/__tests__/helpers.ts:
- getEmailAccount(overrides?) - Creates EmailAccountWithAI objects
- getEmail(overrides?) - Creates EmailForLLM objects
- getRule(instructions, actions?) - Creates rule objects
- getMockMessage(options?) - Creates mock message objects
- getMockExecutedRule(options?) - Creates executed rule objects

Always prefer using existing helpers over creating custom ones.

Running Tests

Run AI tests with:

bash

pnpm test-ai your-feature