Comprehensive Test

Runs all 14 verification agents in parallel phases to enforce the complete set of SIOPV quality gates.

Anthropic Best Practices Applied (Feb 2026)

Practice	Implementation
Model routing	Haiku for simple checks, Sonnet for analysis
Tool guidance	Each prompt specifies TOOLS TO USE / DO NOT USE
Effort budgets	5-80 tool calls per agent
Parallel execution	Foundation (5) and Phase (8) agents run in parallel
Extended thinking	Complex agents use REASONING steps
LLM-as-judge	Orchestrator verifies report format
Checkpointing	MANIFEST.md tracks status
Progressive disclosure	Agent prompts in separate files

```bash
TIMESTAMP=$(date +%Y-%m-%d-%H-%M)
mkdir -p ~/siopv/claude-verification-reports/$TIMESTAMP
```

Write the manifest from templates/manifest.md, replacing every {TIMESTAMP} placeholder with the actual value.

Save to: ~/siopv/claude-verification-reports/{TIMESTAMP}/MANIFEST.md
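The placeholder substitution described above can be sketched as a small shell helper. This is a minimal sketch, not the project's actual tooling: the function name `render_template` is hypothetical, and it assumes the template contains the literal text `{TIMESTAMP}` and that the timestamp holds only digits and hyphens (as produced by the `date` format above).

```bash
# Hypothetical helper: copy a template to an output path,
# replacing every literal {TIMESTAMP} with the given value.
# Usage: render_template TEMPLATE TIMESTAMP OUTPUT
render_template() {
  template=$1
  timestamp=$2
  output=$3
  # Create the destination directory if it does not exist yet.
  mkdir -p "$(dirname "$output")"
  # The timestamp contains only digits and hyphens, so it is safe
  # to splice into the sed replacement without escaping.
  sed "s/{TIMESTAMP}/$timestamp/g" "$template" > "$output"
}
```

For the manifest this would be called as `render_template templates/manifest.md "$TIMESTAMP" "$HOME/siopv/claude-verification-reports/$TIMESTAMP/MANIFEST.md"`. The same helper could serve for templates/summary.md, assuming it uses the same placeholder.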

Launch ALL 5 foundation agents in parallel. Read each prompt file and pass its content to the Task tool:

Agent	Prompt File	Model	Subagent
Best Practices	prompts/best-practices.md	haiku	Explore
Security	prompts/security.md	sonnet	Explore
Hallucination	prompts/hallucination.md	sonnet	Explore
Code Review	prompts/code-review.md	sonnet	Explore
Coverage	prompts/coverage.md	haiku	Bash

Invocation pattern:

```
Task(subagent_type="[Subagent]", model="[Model]", prompt="[Content from prompt file with {TIMESTAMP} replaced]")
```
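To make the mapping from the table to the invocation pattern concrete, the sketch below expands the five foundation rows into one invocation line each. It only prints the launch plan; the Task tool itself is invoked by the orchestrator, not from the shell. The function name `foundation_plan` and the `<path>` notation standing in for the prompt file's content are illustrative assumptions.

```bash
# Hypothetical helper: print one Task invocation per foundation agent.
# Columns: agent name | prompt file | model | subagent type,
# copied from the foundation agent table.
foundation_plan() {
  while IFS='|' read -r name prompt model subagent; do
    # <path> stands in for the prompt file content with {TIMESTAMP} replaced.
    printf 'Task(subagent_type="%s", model="%s", prompt=<%s>)  # %s\n' \
      "$subagent" "$model" "$prompt" "$name"
  done <<'EOF'
Best Practices|prompts/best-practices.md|haiku|Explore
Security|prompts/security.md|sonnet|Explore
Hallucination|prompts/hallucination.md|sonnet|Explore
Code Review|prompts/code-review.md|sonnet|Explore
Coverage|prompts/coverage.md|haiku|Bash
EOF
}
```

Running `foundation_plan` prints five lines, one per agent, which gives a quick check that every row has a prompt file, a model, and a subagent type before launching.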

Process ALL 8 phases in parallel (only the Active phases need an agent):

Phase	Name	Status	Prompt
1	Ingestion	Active	prompts/phase-1-ingestion.md
2	RAG/CRAG	Active	prompts/phase-2-rag.md
3	ML Classification	Active	prompts/phase-3-ml.md
4	Orchestration	Active	prompts/phase-4-orchestration.md
5	Authorization	Active	prompts/phase-5-authorization.md
6	DLP	Stub	prompts/phase-stub.md
7	HITL	Stub	prompts/phase-stub.md
8	Output	Stub	prompts/phase-stub.md

For stubs (phases 6-8): write the stub report directly; no agent is needed.
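Writing the stub reports without an agent might look like the sketch below. The file names and the report layout (a heading plus a `Status: STUB` line) are assumptions made for illustration; the real layout should follow whatever prompts/phase-stub.md specifies. The function name `write_stub_reports` is likewise hypothetical.

```bash
# Hypothetical helper: write a placeholder report for each stubbed phase.
# Usage: write_stub_reports REPORT_DIR
write_stub_reports() {
  report_dir=$1
  mkdir -p "$report_dir"
  while IFS='|' read -r phase name; do
    # Assumed file name and layout; adjust to match prompts/phase-stub.md.
    cat > "$report_dir/phase-$phase-stub.md" <<EOF
# Phase $phase: $name
Status: STUB
EOF
  done <<'PHASES'
6|DLP
7|HITL
8|Output
PHASES
}
```

A call such as `write_stub_reports "$HOME/siopv/claude-verification-reports/$TIMESTAMP"` would then produce three small files, so the summary step still finds a report for every phase.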

After agents complete, YOU verify each report:

If any check fails:

Read all 13 reports (5 foundation + 8 phase), extract the status from each, and generate the summary using templates/summary.md.

Save to: ~/siopv/claude-verification-reports/{TIMESTAMP}/00-COMPREHENSIVE-SUMMARY.md
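The status extraction step can be sketched as a tally over the report directory. This assumes, hypothetically, that each report carries a line of the form `Status: <VALUE>` with a single-word value; the actual report format is defined by the agent prompts and is not shown here. The function name `tally_statuses` is also an assumption.

```bash
# Hypothetical helper: count report statuses in a report directory.
# Assumes each report has a line "Status: <VALUE>" with a one-word value.
# Usage: tally_statuses REPORT_DIR
tally_statuses() {
  report_dir=$1
  for report in "$report_dir"/*.md; do
    [ -f "$report" ] || continue
    # Skip files that are not agent or stub reports.
    case $(basename "$report") in
      MANIFEST.md|00-COMPREHENSIVE-SUMMARY.md) continue ;;
    esac
    # Use the first Status line; reports without one count as MISSING.
    status=$(sed -n 's/^Status: *//p' "$report" | head -n 1)
    printf '%s\n' "${status:-MISSING}"
  done | sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

The output is one `VALUE: count` line per distinct status, sorted by status name. A `MISSING` entry is a useful signal that a report is malformed or that an agent did not finish.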

See reference/quality-gates.md for threshold definitions.