TDD Orchestrator
Coordinates autonomous app development through test-driven development with fresh-context agent handoffs.
Prerequisites
Before running the orchestrator, install one of the supported backends:
Option 1: Claude Backend (default)
pip install claude-agent-sdk anyio
Option 2: Gemini Backend
npm install -g @google/gemini-cli
Verify installation:
- Claude:
  python -c "from claude_agent_sdk import query; print('OK')"
- Gemini:
  gemini --version
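The two checks above can also be run from a single script. The following is a minimal sketch, not part of the orchestrator itself: `backend_available` is a hypothetical helper name, and it assumes the Claude backend is importable as `claude_agent_sdk` and the Gemini CLI is on `PATH` as `gemini`, as the commands above suggest.

```python
import importlib.util
import shutil


def backend_available(backend: str) -> bool:
    """Return True if the named backend appears to be installed.

    Hypothetical helper for illustration; it only checks presence,
    not credentials or version compatibility.
    """
    if backend == "claude":
        # The Claude backend is a Python package, so look for the module.
        return importlib.util.find_spec("claude_agent_sdk") is not None
    if backend == "gemini":
        # The Gemini backend is a CLI, so look for the executable on PATH.
        return shutil.which("gemini") is not None
    raise ValueError(f"unknown backend: {backend!r}")


if __name__ == "__main__":
    for name in ("claude", "gemini"):
        print(f"{name}: {'OK' if backend_available(name) else 'missing'}")
```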
Supported Backends
| Backend | SDK/Tool | Use Flag |
|---|---|---|
| Claude (default) | Claude Agent SDK | --backend claude (or omit) |
| Gemini | Gemini CLI | --backend gemini |
Both backends provide the same functionality; choose based on your preferred AI provider.
Architecture Overview
User Spec → [Initializer Agent] → features.json + initial tests
                             ↓
                  ┌─────────────────────┐
                  │  Orchestrator Loop  │
                  │ (reads progress.txt)│
                  └──────────┬──────────┘
                             ↓
        ┌────────────────────┴────────────────────┐
        ↓                                         ↓
[Feature Agent 1]                         [Feature Agent N]
 (fresh context)                           (fresh context)
        ↓                                         ↓
          Run tests → Pass? → Update progress.txt
                             ↓
             Loop until all features complete
Project Structure
Initialize workspace with this structure:
project_root/
├── app_spec.md # Original user specification
├── features.json # Feature list with test cases (from initializer)
├── progress.txt # Tracks completed features + context for next agent
├── src/ # Application source code
├── tests/ # Test files
└── .tdd/
    ├── agent_logs/ # Logs from each agent run
    └── test_results/ # Test output history
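As a rough illustration, the directory layout above could be created with a few lines of Python. This is a sketch under assumptions: `init_workspace` is a hypothetical name, not a function from the orchestrator scripts, and it only creates the directories plus an empty progress.txt, leaving app_spec.md and features.json to the user and the initializer agent.

```python
from pathlib import Path


def init_workspace(project_root: str) -> Path:
    """Create the directory skeleton shown above (hypothetical helper)."""
    root = Path(project_root)
    for directory in ("src", "tests", ".tdd/agent_logs", ".tdd/test_results"):
        # parents=True also creates project_root and .tdd when missing.
        (root / directory).mkdir(parents=True, exist_ok=True)
    # An empty handoff file gives the first feature agent something to read.
    (root / "progress.txt").touch(exist_ok=True)
    return root
```

Calling it twice on the same path is harmless, since existing directories and files are left in place.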
Workflow
Phase 1: Initialization
- Create app_spec.md from user input
- Invoke initializer agent with tdd-initializer skill
- Initializer produces features.json and scaffolds testable project
Phase 2: Feature Loop
For each incomplete feature in features.json:
- Read progress.txt for context
- Spawn fresh feature agent with tdd-feature-agent skill
- Agent implements feature, runs tests
- On test pass: invoke tdd-committer skill to commit changes
- Committer verifies tests, stages files, commits with conventional message
- On test fail: log failure, retry with error context (max 3 attempts)
- Continue until all features complete
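The control flow of this phase can be sketched as a loop with a bounded retry. The code below is illustrative only: the features.json schema is not specified here, so the `id` and `done` keys are assumptions, and `run_agent` and `commit` are hypothetical callables standing in for the feature agent and the tdd-committer skill.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # mirrors the "max 3 attempts" rule above


def run_feature_loop(
    features: list[dict],
    run_agent: Callable[[dict, Optional[str]], tuple[bool, str]],
    commit: Callable[[dict], None],
) -> list[str]:
    """Process each incomplete feature; return ids that never passed.

    run_agent(feature, error_context) is assumed to return
    (tests_passed, test_output). Both callables are placeholders.
    """
    failed = []
    for feature in features:
        if feature.get("done"):
            continue  # already completed in an earlier run
        error_context = None  # the first attempt has no prior error
        for _attempt in range(MAX_ATTEMPTS):
            passed, output = run_agent(feature, error_context)
            if passed:
                commit(feature)
                feature["done"] = True
                break
            # Feed the failure back into the next fresh agent.
            error_context = output
        else:
            # All attempts exhausted: leave for human intervention.
            failed.append(feature["id"])
    return failed
```

Returning the failed ids instead of raising keeps the loop going for the remaining features; a stricter variant could stop at the first exhausted feature.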
Phase 3: Completion
- Run full test suite
- Generate completion summary
- Clean up agent logs if successful
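Running the full suite amounts to invoking the configured test command with a time limit. A minimal sketch, assuming the command and timeout come from the test_command and test_timeout_seconds settings described under Configuration; `run_tests` is a hypothetical helper, not the orchestrator's own function.

```python
import shlex
import subprocess


def run_tests(command: str, timeout_seconds: int, cwd: str = ".") -> tuple[bool, str]:
    """Run the test command; return (passed, combined output).

    Hypothetical helper: a zero exit code is treated as "all tests passed".
    """
    try:
        result = subprocess.run(
            shlex.split(command),  # split "pytest -q" into ["pytest", "-q"]
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run before this is raised.
        return False, f"timed out after {timeout_seconds}s"
    return result.returncode == 0, result.stdout + result.stderr
```

For example, `run_tests("pytest", 300, cwd="./my-app")` would run the suite with the values from the sample configuration.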
Running the Orchestrator
Using auto_runner.py (Recommended)
The auto_runner.py script provides full autonomous orchestration with backend selection:
# Initialize new project (Claude backend - default)
python scripts/auto_runner.py init ./my-app spec.md

# Initialize with Gemini backend
python scripts/auto_runner.py init ./my-app spec.md --backend gemini

# Run orchestration loop (Claude)
python scripts/auto_runner.py run ./my-app

# Run with Gemini
python scripts/auto_runner.py run ./my-app --backend gemini

# Or using short flag
python scripts/auto_runner.py run ./my-app -b gemini

# Check project status
python scripts/auto_runner.py status ./my-app
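For readers curious how a command line of this shape can be parsed, here is a hedged sketch using argparse. It is not the actual source of auto_runner.py: the argument names `project_dir` and `spec_file` are invented for illustration, and it assumes `status` takes no backend flag since none is shown above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the commands shown above (illustrative)."""
    parser = argparse.ArgumentParser(prog="auto_runner.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # Shared -b/--backend option, defaulting to Claude when omitted.
    backend = argparse.ArgumentParser(add_help=False)
    backend.add_argument(
        "-b", "--backend", choices=("claude", "gemini"), default="claude"
    )

    init = sub.add_parser("init", parents=[backend])
    init.add_argument("project_dir")
    init.add_argument("spec_file")

    run = sub.add_parser("run", parents=[backend])
    run.add_argument("project_dir")

    status = sub.add_parser("status")
    status.add_argument("project_dir")
    return parser
```

With this parser, `build_parser().parse_args(["run", "./my-app", "-b", "gemini"])` yields `backend == "gemini"`, and omitting the flag falls back to `"claude"`.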
Using orchestrate.py (Manual Mode)
The orchestrate.py script generates prompts for manual agent invocation:
# Initialize new project
python scripts/orchestrate.py init --spec "path/to/spec.md"

# Resume existing project
python scripts/orchestrate.py resume --project "path/to/project"

# Run single feature (for debugging)
python scripts/orchestrate.py feature --project "path/to/project" --feature "feature_id"
Configuration
Create .tdd/config.json for customization:
{
  "max_retries_per_feature": 3,
  "test_command": "pytest",
  "test_timeout_seconds": 300,
  "parallel_features": false
}
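A config file like this is typically read with fallbacks for missing keys. The sketch below assumes the values in the example double as defaults, which this document does not state, and `load_config` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Assumed defaults, copied from the example above.
DEFAULTS = {
    "max_retries_per_feature": 3,
    "test_command": "pytest",
    "test_timeout_seconds": 300,
    "parallel_features": False,
}


def load_config(project_root: str) -> dict:
    """Merge .tdd/config.json over the assumed defaults (illustrative)."""
    path = Path(project_root) / ".tdd" / "config.json"
    config = dict(DEFAULTS)  # copy so DEFAULTS is never mutated
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

A project with no config file gets the defaults unchanged; a file that sets only one key overrides just that key.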
Backend-Specific Configuration
The backend is selected at runtime via CLI flags, not in the config file. Each backend uses its own model defaults:
- Claude: Uses claude-sonnet-4-20250514 by default
- Gemini: Uses gemini-2.5-pro by default
Progress File Format
progress.txt is the handoff document between agents:
=== TDD Progress Report ===
Project: MyApp
Last Updated: 2025-01-12T10:30:00Z

## Completed Features
- [x] F001: User authentication (3 tests passed) [a1b2c3d]
- [x] F002: Database schema (5 tests passed) [e4f5g6h]

## Current Feature
- [ ] F003: REST API endpoints

## Context for Next Agent
- Using SQLite for storage (see src/db.py)
- Auth tokens stored in JWT format
- API follows OpenAPI 3.0 spec

## Known Issues
- Rate limiting not yet implemented
- Need to add input validation

## File Summary
- src/auth.py: Authentication logic
- src/db.py: Database models and queries
- src/api.py: API route handlers (in progress)
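Because progress.txt is plain text, the completed-feature entries can be recovered with a small regular expression. This is a sketch that assumes each entry sits on its own line in the form `- [x] ID: title (N tests passed) [commit]`, as in the example; `completed_features` is a hypothetical helper, and the real file may vary.

```python
import re

# Matches lines such as:
# - [x] F001: User authentication (3 tests passed) [a1b2c3d]
COMPLETED = re.compile(
    r"^- \[x\] (?P<id>\S+): (?P<title>.+?) "
    r"\((?P<tests>\d+) tests? passed\) \[(?P<commit>[^\]]+)\]$"
)


def completed_features(progress_text: str) -> list[dict]:
    """Extract completed-feature entries from progress.txt content."""
    features = []
    for line in progress_text.splitlines():
        match = COMPLETED.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["tests"] = int(entry["tests"])  # count as a number
            features.append(entry)
    return features
```

Unchecked entries (`- [ ]`) and free-form context lines do not match the pattern and are skipped.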
Error Handling
When feature implementation fails:
- Log full error to .tdd/agent_logs/
- Append error context to progress.txt under "## Last Error"
- Retry with error context included in next agent prompt
- After max retries, pause and request human intervention
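The first two steps can be sketched as one helper that writes the full error to a log file and appends a short excerpt to progress.txt. The log file naming scheme and the 20-line excerpt limit are assumptions made for this example, and `record_failure` is a hypothetical name.

```python
from datetime import datetime, timezone
from pathlib import Path


def record_failure(project_root: str, feature_id: str, attempt: int, error: str) -> Path:
    """Log the full error and append a summary under "## Last Error"."""
    root = Path(project_root)
    log_dir = root / ".tdd" / "agent_logs"
    log_dir.mkdir(parents=True, exist_ok=True)

    # Assumed naming scheme: feature id, attempt number, UTC timestamp.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    log_path = log_dir / f"{feature_id}_attempt{attempt}_{stamp}.log"
    log_path.write_text(error, encoding="utf-8")

    # Only the tail goes into the handoff file to keep the next prompt short.
    excerpt = error.strip().splitlines()[-20:]
    with (root / "progress.txt").open("a", encoding="utf-8") as handle:
        handle.write("\n## Last Error\n")
        handle.write(
            f"- {feature_id} (attempt {attempt}), "
            f"full log: {log_path.relative_to(root)}\n"
        )
        handle.writelines(f"  {line}\n" for line in excerpt)
    return log_path
```

This version appends a new section on every failure; an implementation that wants a single "## Last Error" section would need to replace the previous one instead.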