High-Fidelity Testing standards
High-fidelity tests in GUTTERS verify the actual system logic against real hardware (PostgreSQL, Redis) and external APIs. These are "Service Integration Tests" that prove the app works exactly as it would in production.
Core Mandates
- •NO Mocks for Core Logic: If you are testing a module, use the real class instance.
- •Real Infrastructure: Use the actual PostgreSQL (Supabase) and Redis instances from
.env. - •No Persistence Mocks: Verify data actually exists in the database/cache after an operation.
- •Full Loop Mastery: Tests must respect the async lifecycle and singleton resets.
- •UTC-Aware Datetimes: NEVER compare naive and aware datetimes. Use
datetime.now(timezone.utc). - •Session Isolation: Modules MUST accept a
dbsession and use it, rather than opening their own viaasync_get_db.
Infrastructure Patterns
1. Database Connectivity (pgbouncer)
When testing against Supabase (pgbouncer), standardPrepared Statements often fail.
- •Port: Use
5432(Session Mode) in test engines. - •Pool: Use
NullPooland setstatement_cache_size=0. - •Pre-ping: Set
pool_pre_ping=Falseto avoid async loop conflicts.
2. Singleton Cleanup (CRITICAL)
Singletons bind to the event loop they are created in. In pytest-asyncio, loops are often function-scoped. You MUST reset singletons between tests.
# conftest.py
@pytest_asyncio.fixture(autouse=True)
async def cleanup_test_state():
yield
# Dispose engine
from src.app.core.db.database import async_engine
await async_engine.dispose()
# Nullify singletons
from src.app.core.memory.active_memory import ActiveMemory
ActiveMemory._active_memory = None
from src.app.core.memory.synthesis_orchestrator import SynthesisOrchestrator
SynthesisOrchestrator._orchestrator = None
3. Redis Hygiene (CRITICAL)
Tests share the same Redis instance. You MUST flush the cache between tests to prevent data bleeding (state pollution).
# conftest.py
@pytest_asyncio.fixture
async def memory():
"""Get ActiveMemory instance with real Redis connection."""
mem = get_active_memory()
await mem.initialize()
# FLUSH REDIS to ensure clean slate
if mem.redis_client:
await mem.redis_client.flushdb()
return mem
4. Mapper Configuration (CRITICAL for Circular Dependencies)
When models have circular string references (e.g., User <-> ChatSession), SQLAlchemy mappers must be explicitly configured before use in tests to prevent InvalidRequestError or UnboundExecutionError.
# conftest.py
@pytest_asyncio.fixture(scope="session", autouse=True)
def configure_mappers_session():
"""Ensure all SQLAlchemy mappers are initialized for the session."""
from sqlalchemy.orm import configure_mappers
from src.app.models.user import User
from src.app.models.chat_session import ChatSession
configure_mappers()
5. Datetime Standardization
PostgreSQL stores DateTime(timezone=True) as UTC.
- •Seeding: Always use aware objects:
datetime.now(timezone.utc). - •Parsing: Ensure ISO strings are treated as UTC-aware:
datetime.fromisoformat(ts_str.replace('Z', '+00:00')).
Test Structure
1. Deterministic Seeding
Instead of mocking responses, seed the database with the data the real module expects.
# ✅ CORRECT
await db.execute(insert(UserProfile).values(user_id=user_id, data={"astrology": {...}}))
2. Patching Best Practices
When you must patch (only for avoiding external side effects like synthesis triggers or heavy computations you want to isolate), check imports carefully.
- •The Import Trap: If
module.pydoesfrom app.core import x, and you patchsrc.app.core.x, it might NOT work due to aliasing. Patch the source definition. - •Verification: Use
id()in debug prints to verify the object usage.
# ✅ CORRECT (Patch the source definition if aliasing is ambiguous)
with patch("app.core.memory.synthesis_orchestrator.get_orchestrator", ...)
Gotchas
- •Serialization Errors:
ActiveMemory.set_jsonexpects Pydantic models.ActiveMemory.sethandles dicts/strings. BEWAREnumpytypes - convert to standard Python types before storing. - •Unicode Errors: Avoid checkmarks (✓) or emojis in test print statements on Windows.
- •DetachedInstanceError: Always read required fields (like
id) before closing the DB session in a fixture. - •Initialization Failure:
RuntimeError: EventBus not initialized. Fix:await get_event_bus().initialize()inconftest.pyor the test. - •Circular Imports: Collection fails with
AttributeError. Fix: Use deferred imports inside methods.
6. Phase 7+ End-to-End (E2E) Verification
The E2E suite (tests/integration/test_end_to_end.py) is the ultimate proof of work. It verifies:
- •Registration -> Calculation flow.
- •Synthesis creation & Active Memory storage.
- •Master Chat with full context retrieval.
- •Generative UI component lifecycle.
- •Intelligence layer detection (Observer/Hypothesis).
Gold Standard: Every major backend PR must pass the E2E suite with zero warnings.
6. Semantic Retrieval Verification
Testing vector search requires verifying relevance without brittle exact-match logic.
- •Similarity Thresholds: Assert that results meet a minimum baseline (e.g.,
> 0.3). - •Data Presence: Verify that seeded data is actually indexed before searching (check DB count).
- •Hybrid Grouping: If using hybrid search, verify all expected categories have at least one result.
# ✅ CORRECT - Testing Vector Search results = await engine.search(user_id, query_vector, db) # 1. Verify index isn't empty count = await db.scalar(select(func.count()).select_from(Embedding)) assert count > 0 # 2. Verify top result relevance assert results[0]['similarity'] > 0.3 assert "expected keyword" in results[0]['content'].lower() # 3. Verify hybrid coverage grouped = await engine.hybrid_search(user_id, query_vector, db) assert len(grouped['journal_entries']) > 0
Verification Checklist
- • Does the test use the real database (no
mock_db)? - • Are singletons being reset in
conftest.py? - • Is Redis flushed (
flushdb) in the memory fixture? - • Are external API calls actually firing (or using the
VCR/aiohttppatch)?