Executing Spikes
Overview
Execute one approach from a spike definition autonomously, exploring feasibility without asking permission or polishing code.
Core principle: Fast exploration over clean code. Prove it works, don't perfect it.
Announce at start: "I'm using the Executing Spikes skill to explore approach N."
Context Override Authority
This skill explicitly overrides:
- •TDD rules (skip test-first discipline, prove it works however is fastest)
- •Code quality rules (messy code is GOOD during spikes)
- •"Ask for permission" rules (make decisions autonomously, document assumptions)
These overrides are NOT permission exceptions - they are the correct behavior for spike execution.
The Process
Copy this checklist to track progress:
Executing Spikes Progress:
- [ ] Phase 1: Setup Worktree (worktree created, data stores isolated)
- [ ] Phase 2: Load Spike Definition & Choose Approach (spike notes read, approach chosen)
- [ ] Phase 3: Autonomous Exploration (implementation complete, quick-and-dirty code)
- [ ] Phase 4: Proving It Works (test script runs, output captured)
- [ ] Phase 5: Push Until Natural Stop (reached natural stopping point)
- [ ] Phase 6: Discovery Report (findings documented, work committed)
Phase 1: Setup Worktree
- •Announce: "I'm using the Using Git Worktrees skill to set up spike workspace."
- •Use skills/collaboration/using-git-worktrees
- •Branch from spike-[canonical-name], creating spike-[canonical-name]-N
- •Partner tells you which number to use (1, 2, 3...)
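The branch naming above could be turned into a `git worktree add` invocation along these lines. This is only a sketch: the helper name `worktree_command` and the one-directory-per-branch layout are assumptions made for illustration, and the Using Git Worktrees skill remains the authority on where worktrees actually live.

```python
from pathlib import Path


def worktree_command(canonical_name: str, approach: int, parent_dir: str) -> list[str]:
    """Build the argv for creating a spike worktree (hypothetical helper)."""
    base_branch = f"spike-{canonical_name}"
    new_branch = f"{base_branch}-{approach}"
    # Assumption: each worktree lives in a directory named after its branch.
    path = Path(parent_dir) / new_branch
    # Shape of the real command: git worktree add -b <new-branch> <path> <commit-ish>
    return ["git", "worktree", "add", "-b", new_branch, str(path), base_branch]


# Print the command for approach 2 of a spike called "overlay-data-model".
print(" ".join(worktree_command("overlay-data-model", 2, "..")))
```

The function only builds the command; passing the list to `subprocess.run` would execute it.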
Data Store Isolation (Any Project with Databases/State)
CRITICAL: Each spike must use its own data stores to prevent parallel spikes from conflicting.
Applies to: PostgreSQL, MySQL, SQLite files, Redis databases, MongoDB collections, etc.
Before creating schema or running migrations, verify isolation:
For Rails projects, check both development AND test databases:
# Check what database you'll use
bin/rails db:migrate:status
# Expected: database name should be spike-specific
# ✅ Good: spike_overlay_data_model_2_development
# ✅ Good: spike_overlay_data_model_2_test
# ❌ Bad: myapp_development (shared across all spikes)
For other frameworks, verify equivalent isolation mechanism exists.
If data stores are NOT isolated:
- •STOP and implement isolation (check config for branch/worktree-based naming)
- •If you cannot figure out how to isolate data stores, STOP and ask partner for guidance before proceeding
- •Do NOT proceed with shared data stores - parallel spikes will conflict
Why critical: Without isolation, parallel spikes will drop each other's tables/collections, and you will waste hours debugging phantom failures that occur only when multiple spikes run simultaneously.
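For projects without a built-in mechanism, branch-based naming might look like the sketch below. It assumes the database name is derived from the spike branch plus the environment, which matches the "Good" examples above; the function names `spike_db_name` and `is_isolated` are hypothetical, not part of any framework.

```python
import re


def branch_slug(branch: str) -> str:
    """Turn a branch name into a database-safe prefix (assumed convention)."""
    return re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_")


def spike_db_name(branch: str, env: str) -> str:
    """Derive a per-spike database name, e.g. for development or test."""
    return f"{branch_slug(branch)}_{env}"


def is_isolated(db_name: str, branch: str) -> bool:
    """True when the database name is prefixed with this spike's branch slug."""
    return db_name.startswith(branch_slug(branch) + "_")


branch = "spike-overlay-data-model-2"
print(spike_db_name(branch, "development"))      # spike_overlay_data_model_2_development
print(is_isolated("myapp_development", branch))  # False: shared database, stop and fix
```

Running a check like `is_isolated` against whatever name the framework reports gives a quick yes/no before any schema is created.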
Phase 2: Load Spike Definition & Choose Approach
- •Read spike-notes-[canonical-name].md from the base spike branch
- •Copy to your worktree if needed
- •Extract approach number from branch name
- •Example: spike-replace-3d-vectors-2 → approach 2
- •If that numbered approach exists in notes: use it
- •If that numbered approach doesn't exist: Create one, document it in spike notes
- •Document your chosen approach details
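Extracting the approach number from the branch name can be sketched as follows, assuming the number is always the final hyphen-separated segment. The function name `approach_number` is illustrative only.

```python
import re


def approach_number(branch: str) -> int:
    """Return the trailing -N of a spike branch name as an integer."""
    # Anchor to the end so digits inside the name (like "3d") are ignored.
    match = re.search(r"-(\d+)$", branch)
    if match is None:
        raise ValueError(f"no approach number at the end of {branch!r}")
    return int(match.group(1))


print(approach_number("spike-replace-3d-vectors-2"))  # 2
```

A branch with no trailing number, such as the base spike branch, raises `ValueError` rather than guessing.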
Phase 3: Autonomous Exploration
Execute independently:
- •Make ALL decisions yourself (library choices, architecture, error handling)
- •Document assumptions in spike notes
- •Quick-and-dirty over clean code
- •Duplication is fine, inconsistent naming is fine, messy code is GOOD
- •Don't stop to validate choices
- •Don't ask for permission
- •Push through minor obstacles with workarounds
Code Quality Expectations for Spikes:
- •✅ GOOD: Duplicated code across 3 places
- •✅ GOOD: Inconsistent naming
- •✅ GOOD: Quick hacks and workarounds
- •✅ GOOD: Copy-pasted code
- •✅ GOOD: Hardcoded values
- •❌ BAD: Spending time refactoring
- •❌ BAD: Extracting shared functions
- •❌ BAD: Consistent abstractions
- •❌ BAD: "Clean" code
The goal is learning speed, not maintainable code.
Phase 4: Proving It Works (Critical)
Your spike MUST actually run and do something.
Minimum requirement: Create executable test script
Create a test file that can be run with a single command:
- •test_spike.rb / test_spike.py / test.sh / npm run spike-test
- •Should test ALL scenarios from spike definition
- •Must print clear output showing pass/fail
Run it and capture output:
- •Don't just write the tests - RUN THEM
- •Copy actual output into your report
- •Output is proof you didn't just write code that "looks right"
Test script should:
- •Set up test data
- •Exercise the spike's core functionality
- •Print results for each scenario
- •Use ✅/❌ or PASS/FAIL markers for clarity
Example test script output:
=== Testing Scenario 1: Base entity ===
✅ Loaded entity: {"name": "Bran", ...}
=== Testing Scenario 2: With overlay ===
✅ Applied overlay, got: {"name": "Bran", "items": ["mace"], ...}
=== Testing Scenario 3: Mutual exclusivity ===
✅ Validation rejected conflicting overlays
Error: "recently-bubbled and 100-years-bubbled are mutually exclusive"
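A test_spike.py that produces output in this shape might be as small as the sketch below. The scenario names echo the example above, but the check functions and their data are placeholders invented for illustration; a real spike would call its own code inside them.

```python
def run_scenarios(scenarios):
    """Run each (name, check) pair, print a marker per scenario, return failure count."""
    failures = 0
    for name, check in scenarios:
        print(f"=== Testing {name} ===")
        try:
            detail = check()  # a check returns a short description on success
            print(f"PASS {detail}")
        except Exception as exc:  # quick-and-dirty: any exception counts as a failure
            failures += 1
            print(f"FAIL {exc!r}")
    return failures


# Placeholder checks; replace the bodies with calls into the spike code.
def check_base_entity():
    entity = {"name": "Bran"}
    assert entity["name"] == "Bran"
    return f"Loaded entity: {entity}"


def check_overlay():
    merged = {**{"name": "Bran"}, **{"items": ["mace"]}}
    assert merged["items"] == ["mace"]
    return f"Applied overlay, got: {merged}"


SCENARIOS = [
    ("Scenario 1: Base entity", check_base_entity),
    ("Scenario 2: With overlay", check_overlay),
]

failures = run_scenarios(SCENARIOS)
print(f"{len(SCENARIOS) - failures}/{len(SCENARIOS)} scenarios passed")
```

Catching every exception is deliberately crude; it keeps one broken scenario from hiding the results of the others.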
Choose fastest validation method:
Quick validation (prefer these):
- •Test script that exercises all scenarios (recommended)
- •Manual testing with documented steps + output
- •Print statements showing data flow
- •Simple integration showing end-to-end works
Automated tests (use if already faster):
- •Integration tests proving happy path
- •Tests as executable documentation
TDD discipline (SKIP THIS):
- •❌ Test-first workflow
- •❌ Comprehensive coverage
- •❌ Testing edge cases exhaustively
- •❌ RED-GREEN-REFACTOR cycle
The rule: Your spike must work - run it and prove it. Use whatever validation is fastest.
Red flags:
- •❌ "The code looks correct" → Run it
- •❌ "I tested it mentally" → Run it
- •❌ "Logic is sound" → Run it
- •❌ Writing report without running code → Stop, run it first
In your report, include:
- •Path to test script
- •Command to run it
- •Full output (or representative sample if very long)
- •Mapping of output to spike test scenarios
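Capturing the command and its real output for the report can be done with the standard library, as in this sketch. The helper name `run_and_capture` is an assumption, and the inline `-c` command stands in for the actual test script.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


# Stand-in for something like [sys.executable, "test_spike.py"].
command = [sys.executable, "-c", "print('PASS Scenario 1')"]
code, output = run_and_capture(command)

# Print the command, exit code, and verbatim output, ready to paste into the report.
print("Command:", " ".join(command))
print("Exit code:", code)
print(output)
```

Pasting the printed text unchanged keeps the report tied to what actually ran.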
Phase 5: Push Until Natural Stop
Stop when:
- •Feature works end-to-end and you've proven it (success!)
- •Hit genuine blocker you can't work around (missing system dependency, fundamental incompatibility)
- •Discovered approach won't work (fundamental design flaw)
- •Reasonable effort expended (~2-3 hours' worth of exploration)
Don't stop when:
- •Code is messy (that's fine - this is exploratory)
- •Hit a minor error (try workaround first)
- •Unsure if approach is "right" (keep going, that's not the spike's purpose)
- •Want to check if design is okay (make the call yourself)
- •Want to refactor (skip it entirely)
- •Tests are incomplete (you're not doing TDD)
Phase 6: Discovery Report
Create a detailed spike report following the standardized template in reference/report-template.md.
Key requirements:
- •12 required sections covering implementation, results, evaluation, and next steps
- •File name: SPIKE_FINDINGS_APPROACH_N.md
- •Evidence-based: Include actual test output, not paraphrases
- •Weighted scoring: Use criteria from spike definition (if provided)
- •Proof of work: Executable test script + actual output demonstrating it works
- •Git workflow: Commit all code and report, don't push unless requested
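The weighted scoring formula is not spelled out here; the report template is the authority. One plausible reading is a weighted average, sketched below. The criteria names, weights, and 1-5 scale are invented for illustration; real values come from the spike definition.

```python
def weighted_score(scores, weights):
    """Weighted average of per-criterion scores (assumed formula)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight


# Hypothetical criteria and weights, not taken from any real spike definition.
weights = {"complexity": 3, "performance": 2, "maintainability": 1}
scores = {"complexity": 4, "performance": 3, "maintainability": 5}

# Show each term so the calculation is visible in the report.
for name in weights:
    print(f"{name}: {scores[name]} x {weights[name]} = {scores[name] * weights[name]}")
print(f"Weighted score: {weighted_score(scores, weights):.2f}")  # (12 + 6 + 5) / 6 = 3.83
```

Printing each term alongside the total shows the calculation, not just the result.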
Critical:
- •No comparisons to other spike approaches (you don't know what they did yet)
- •Include objective criteria: "Works best when X, avoid when Y"
- •Be honest about tradeoffs and limitations
See the full template for detailed structure and examples.
Autonomy: When to Ask vs When to Decide
Ask partner when:
- •Hit genuine blocker (missing system dependency, fundamental incompatibility)
- •Cannot isolate data stores and unsure how to proceed
- •Spike notes file is missing or corrupted
- •Need clarification on spike goal/constraints
Decide independently when:
- •Which library to use → Pick one, document choice
- •How to structure code → Quick-and-dirty wins
- •Whether to refactor messy code → Don't refactor
- •How to handle an error → Try workaround
- •What "good enough" looks like → Working code is enough
- •How to prove it works → Manual test vs automated test vs script
- •Library version conflicts → Use what works, document it
- •Whether to add caching/pooling/metrics → Make the call, document it
- •How thorough to be → Push until natural stop
- •TTL values, configuration, connection settings → Pick reasonable defaults
- •Database naming/isolation strategy → Implement it, document it
- •Test script format → Whatever proves it works fastest
If you're asking "Should I ask about X?" - the answer is: decide and document.
Report format questions:
- •Don't ask "Should I include X in my report?" → Follow the template
- •Don't ask "Is this enough detail?" → Template specifies what's needed
- •Do ask if template section doesn't make sense for your spike type
Red Flags - STOP and Course Correct
If you catch yourself doing these, you're NOT executing a spike correctly:
- •Asking validation questions → "Should I use library X?" → NO, decide and document
- •Refactoring messy code → "This duplication should be cleaned up" → NO, keep pushing
- •Following TDD → "Let me write the test first" → NO, prove it works however is fastest
- •Polishing code → "Let me make this cleaner" → NO, messy is good
- •Not running code → "The logic looks correct" → NO, run it and prove it
- •Seeking permission → "Is it okay to use Docker?" → NO, use it and document
- •Second-guessing scope → "Should I explore additional aspects?" → Push until natural stop
All of these mean: You're applying production standards to exploratory work.
Common Rationalizations to Resist
| Excuse | Reality |
|---|---|
| "The code quality rules are absolute" | Spike context overrides code quality rules |
| "I need permission to deviate from rules" | Spike execution IS permission to be messy |
| "Messy code makes it harder to add features" | That's acceptable for spikes - we're learning, not building |
| "Should refactor before continuing" | NO - refactoring time = lost exploration time |
| "TDD rule says MUST for every feature" | Spikes are not features - they're throwaway exploration |
| "Need permission to skip TDD" | This skill grants that permission explicitly |
| "When in doubt, follow the written rules" | This skill IS the written rules for spikes |
| "Doing it right is better than doing it fast" | For spikes: fast learning beats correctness |
| "Should I check if this approach is okay?" | Make decision, document assumption, move on |
| "This is getting messy, I should clean it up" | Messy is GOOD - it means you're exploring fast |
| "The code looks right, no need to run it" | Assumption ≠ proof. Run it. |
| "I could have been scrappier" | Then BE scrappier - that's what spikes demand |
Completion Verification
Before reporting to your partner that the spike is complete, verify ALL of these:
Copy this verification checklist to ensure nothing was skipped:
Spike Completion Verification:
**Setup:**
- ✅ Data stores are isolated (checked with status command)
- ✅ Working in correct spike worktree
- ✅ Database/state won't conflict with other spikes
**Implementation:**
- ✅ Code actually runs (not just "looks right")
- ✅ Test script exists and executes
- ✅ Test output captured
- ✅ All spike definition scenarios tested
**Report:**
- ✅ Used standardized template (12 required sections)
- ✅ Included weighted scoring with calculation shown
- ✅ Test results map to ALL spike scenarios
- ✅ Time breakdown included
- ✅ Interface/usage design documented (if applicable)
- ✅ Evidence included for every claim
- ✅ Actual test output pasted (not paraphrased)
- ✅ No comparisons to other spike approaches
- ✅ Code quality self-assessment included
**Git:**
- ✅ All work committed
- ✅ Report file committed
- ✅ Commit message follows format
**Red Flags - Stop and Fix:**
- ❌ Report says "it works" but no test output shown
- ❌ Report compares to other approaches ("better than Approach X")
- ❌ Didn't actually run the code
- ❌ Test script doesn't exist or doesn't run
- ❌ Report missing required sections from template
- ❌ No weighted scoring calculation
- ❌ Database isolation not verified
Common New Pitfalls to Avoid
With the updated guidance, watch for these new failure modes:
| Pitfall | Reality |
|---|---|
| "I'll just use shared database, it's simpler" | NO - will break parallel spikes |
| "Report template doesn't fit my spike" | Template is generic - adapt sections, don't skip |
| "Scoring is too subjective" | Show your reasoning - subjective with justification is fine |
| "Test script is too much overhead" | All three spikes created them naturally - it's not overhead |
| "I'll skip the weighted calculation" | Required - makes approaches comparable |
| "My spike doesn't have an interface" | Then write "Not applicable" - don't skip the section |
| "I'll compare to other approaches in my report" | NO - comparison happens after all spikes |
| "Test output is too long to include" | Include representative sample with note about full output |
When NOT to Use This Skill
Don't use for:
- •Production features (use skills/testing/test-driven-development)
- •Well-defined implementations (use skills/collaboration/executing-plans)
- •Code that will be merged as-is (spikes are throwaway exploration)
- •Learning a codebase (use exploration/research skills)
Ask partner: "Is this actually a spike, or should we build this properly with TDD?"
Related Skills
Before spike execution:
- •skills/collaboration/defining-spikes (creates the spike definition)
- •skills/collaboration/using-git-worktrees (sets up isolated workspace)
During exploration:
- •skills/problem-solving/collision-zone-thinking (if stuck in conventional thinking)
After spike:
- •skills/collaboration/requesting-code-review (if approach is viable and will be productionized)
Remember
- •Messy code is GOOD during spikes
- •Make decisions autonomously, document assumptions
- •Prove it works (run it!), don't perfect it
- •Skip TDD discipline, use fastest validation
- •Don't refactor during exploration
- •Stop at natural stopping points
- •Report with evidence ("I ran X, got Y")
- •Use standardized report template for comparability
- •Isolate data stores to avoid parallel spike conflicts