Evolve
The main orchestrator for autonomous site evolution. Intelligently selects and executes tasks based on priority, staleness, and site goals. Automatically replenishes the task queue when empty.
Usage
/evolve [mode]
Modes:
- •(default) - Execute 1 task (highest scored)
- •deep - Execute 2-3 tasks including maintenance
Session Workflow
0. Check Queue Health (Pre-flight)
Before doing anything else, check if the queue needs replenishment:
state = load_state('obsidian/workflow/evolution-state.yaml')
todo = parse_todo('obsidian/workflow/todo.md')
active_tasks = count_active_tasks(todo) # P0-P2 only, P3 doesn't count
needs_replenishment = active_tasks < 3
if needs_replenishment:
    # Run replenishment before proceeding
    invoke('/replenish-queue')
    # Reload todo after replenishment
    todo = parse_todo('obsidian/workflow/todo.md')
Threshold: Replenish when fewer than 3 active tasks (P0-P2) remain. P3 tasks don't count toward this threshold since they require human promotion.
Note: Replenishment runs FIRST, before staleness checks or task selection, to ensure there's always meaningful work available.
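The active-task count behind this check could be sketched as follows. This is a minimal illustration, not the actual `tools/todo/processor.py`: it assumes tasks appear as `### P<n>: <title>` headings under a `## Active Tasks` section (the shape used in the task examples elsewhere in this document), and the regex and constant names are hypothetical.

```python
import re

# Hypothetical parser; the real todo processor may work differently.
TASK_HEADING = re.compile(r"^### P([0-3]):\s*(.+)$")
REPLENISH_THRESHOLD = 3  # threshold stated in the text above


def count_active_tasks(todo_text: str) -> int:
    """Count P0-P2 task headings inside the '## Active Tasks' section."""
    in_active = False
    count = 0
    for line in todo_text.splitlines():
        if line.startswith("## "):
            # A new top-level section starts; note whether it is the active one.
            in_active = line.strip() == "## Active Tasks"
            continue
        match = TASK_HEADING.match(line.strip())
        # P3 tasks are skipped because they require human promotion.
        if in_active and match and int(match.group(1)) <= 2:
            count += 1
    return count


sample = """## Active Tasks
### P1: Write article on hard problem
### P3: Idea needing human promotion
### P2: Review qualia page
## Completed Tasks
### P1: Old task
"""
active = count_active_tasks(sample)
needs_replenishment = active < REPLENISH_THRESHOLD
```

With this sample, only the P1 and P2 entries in the active section are counted, so replenishment would be triggered.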
1. Load State
Read obsidian/workflow/evolution-state.yaml to get:
- •Last run timestamps for maintenance tasks
- •Failed task retry counts
- •Content statistics
2. Check Staleness
For each maintenance task, check if overdue:
| Task | Cadence | Inject when overdue by | Scheduled |
|---|---|---|---|
| validate-all | 24h | 48h | - |
| pessimistic-review | 12h | 24h | - |
| optimistic-review | 12h | 24h | - |
| check-tenets | 48h | 96h | - |
| check-links | 24h | 48h | - |
| deep-review | 4h | 8h | - |
| tune-system | 72h | 144h | - |
| research-voids | 24h | 48h | - |
| coalesce | 8h | 16h | - |
| tweet-highlight | 24h | (scheduled) | @07:00 UTC |
Create synthetic tasks for overdue maintenance.
Special handling for scheduled tasks (those with a time in the Scheduled column): Scheduled tasks use different eligibility logic. Instead of waiting for the overdue threshold, they become eligible when:
- •Current UTC hour >= scheduled hour (e.g., >= 7 for @07:00 UTC)
- •AND the task hasn't run since today's scheduled time (last_run < today at scheduled_hour)
def is_scheduled_task_eligible(last_run, scheduled_hour, now_utc):
    if now_utc.hour < scheduled_hour:
        return False  # Before today's scheduled time
    today_scheduled = now_utc.replace(hour=scheduled_hour, minute=0, second=0, microsecond=0)
    return last_run < today_scheduled  # Hasn't run since today's scheduled time
This ensures tweet-highlight runs once daily at/after 7 AM UTC, regardless of what time it ran yesterday.
For non-scheduled tasks: Use the standard overdue logic (last_run + cadence + overdue_threshold < now).
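The standard overdue rule can be sketched in the same style. This is an illustrative helper only: the function name `is_overdue` is an assumption, and it reads the "Inject when overdue by" column as a threshold added on top of the cadence, exactly as the formula above is written.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_run, cadence_hours, overdue_hours, now_utc):
    """Rule from the text: last_run + cadence + overdue_threshold < now."""
    deadline = last_run + timedelta(hours=cadence_hours + overdue_hours)
    return deadline < now_utc


now = datetime(2026, 1, 8, 12, 0, tzinfo=timezone.utc)

# deep-review: cadence 4h, threshold 8h, last ran 13h ago (12h < 13h).
deep_review_due = is_overdue(now - timedelta(hours=13), 4, 8, now)

# validate-all: cadence 24h, threshold 48h, last ran 30h ago (72h > 30h).
validate_all_due = is_overdue(now - timedelta(hours=30), 24, 48, now)
```

Here `deep_review_due` evaluates to true and `validate_all_due` to false, so only deep-review would get a synthetic task.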
3. Load and Score Tasks
Read obsidian/workflow/todo.md and score all tasks:
SCORE = PRIORITY_BASE + STALENESS_BONUS + URGENCY_MOD - FAILURE_PENALTY

PRIORITY_BASE: P0=400, P1=300, P2=200, P3=100
STALENESS_BONUS: hours_overdue (max 150) - for synthetic maintenance tasks
URGENCY_MOD: +50 critical issue, +30 medium issue, +20 has research
FAILURE_PENALTY: 100 per failure, 500 at 3+ failures
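A minimal sketch of the scoring formula, assuming the urgency modifiers are additive and that the failure penalty is subtracted from the total. The function and flag names are hypothetical and this is not the actual `tools/evolution/scoring.py`.

```python
PRIORITY_BASE = {"P0": 400, "P1": 300, "P2": 200, "P3": 100}
URGENCY_MOD = {"critical_issue": 50, "medium_issue": 30, "has_research": 20}
MAX_STALENESS_BONUS = 150


def score_task(priority, hours_overdue=0, urgency_flags=(), failures=0):
    """Combine the four scoring terms for one task."""
    # Staleness is capped at 150 and never negative.
    staleness = min(max(hours_overdue, 0), MAX_STALENESS_BONUS)
    # Assumption: several urgency flags on one task simply add up.
    urgency = sum(URGENCY_MOD[flag] for flag in urgency_flags)
    # 100 per failure, switching to a flat 500 at three or more failures.
    penalty = 500 if failures >= 3 else 100 * failures
    return PRIORITY_BASE[priority] + staleness + urgency - penalty


p1_with_research = score_task("P1", urgency_flags=["has_research"])
# Hypothetical: a maintenance task treated as P2 and 200 hours overdue.
stale_maintenance = score_task("P2", hours_overdue=200)
twice_failed = score_task("P1", failures=2)
```

The three examples score 320, 350 (the staleness bonus is capped), and 100 respectively.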
4. Select Tasks
Based on mode:
- •Default: Top 1 task
- •Deep: Top 2-3 tasks (at least 1 content + 1 maintenance if overdue)
Never select:
- •Tasks with 3+ failures (these are blocked)
- •P3 tasks (require human promotion)
- •Tasks with unmet dependencies
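The exclusion rules and top-N selection might look like the sketch below. The task dictionary keys (`score`, `failures`, `priority`, `unmet_dependencies`) are assumptions made for illustration, and the deep-mode rule about mixing content and maintenance tasks is deliberately left out.

```python
def select_tasks(tasks, limit=1):
    """Drop ineligible tasks, then return the top `limit` by score."""
    eligible = [
        t for t in tasks
        if t["failures"] < 3                # 3+ failures means blocked
        and t["priority"] != "P3"           # P3 requires human promotion
        and not t["unmet_dependencies"]     # still waiting on other work
    ]
    return sorted(eligible, key=lambda t: t["score"], reverse=True)[:limit]


tasks = [
    {"title": "A", "priority": "P1", "score": 300, "failures": 0, "unmet_dependencies": []},
    {"title": "B", "priority": "P3", "score": 900, "failures": 0, "unmet_dependencies": []},
    {"title": "C", "priority": "P0", "score": 400, "failures": 3, "unmet_dependencies": []},
    {"title": "D", "priority": "P2", "score": 250, "failures": 0, "unmet_dependencies": ["A"]},
    {"title": "E", "priority": "P2", "score": 220, "failures": 0, "unmet_dependencies": []},
]

default_pick = [t["title"] for t in select_tasks(tasks, limit=1)]
deep_pick = [t["title"] for t in select_tasks(tasks, limit=3)]
```

Tasks B, C, and D are filtered out regardless of score, so default mode picks A and a three-task deep run picks A and E.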
5. Execute Tasks
For each selected task:
- •If synthetic (maintenance): invoke the skill directly
  - •/validate-all
  - •/pessimistic-review
  - •/optimistic-review
  - •/check-tenets
  - •/check-links
  - •/deep-review
  - •/tune-system
  - •/research-voids
  - •/coalesce
  - •/tweet-highlight (tweets most recent untweeted highlight)
- •If queue task: invoke based on type
  - •expand-topic → /expand-topic [topic]
  - •research-topic → /research-topic [topic]
  - •refine-draft → /refine-draft [file]
  - •cross-review → /deep-review [target-file] with cross-reference context
- •After each task:
  - •Record outcome (success/failed/partial)
  - •If failed: increment retry count in state
  - •If 3+ failures: move to Blocked section in todo.md
  - •If success: move task to Completed section in todo.md (see Completed Task Format below)
  - •If success AND task type generates chains: update task_chains in state (see Task Chain Recording below)
- •Commit after each task (for easy reversion):
  - •If task modified files, commit immediately
  - •Use task-specific commit message:
    feat(auto): [task-type] - [brief description]

    Task: [task title]
    Session: [session number]
  - •Examples:
    - •feat(auto): expand-topic - retrocausality concept page
    - •feat(auto): research-topic - Libet experiments
    - •chore(auto): validate-all - daily health check
  - •Use AI authorship for commit
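As an illustration of the per-task commit message, a small formatter could assemble the subject and body. The function name and the `kind` parameter are hypothetical, and the blank line between subject and body follows common git convention rather than anything stated in this document.

```python
def build_commit_message(task_type, description, title, session, kind="feat"):
    """Build a task-specific commit message in the format shown above."""
    subject = f"{kind}(auto): {task_type} - {description}"
    # Assumption: subject and body are separated by one blank line.
    return f"{subject}\n\nTask: {title}\nSession: {session}"


message = build_commit_message(
    task_type="expand-topic",
    description="retrocausality concept page",
    title="Write concept page on retrocausality",  # hypothetical task title
    session=12,                                     # hypothetical session number
)
subject_line = message.splitlines()[0]
```

Passing `kind="chore"` would produce the maintenance-style subject used in the validate-all example.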
6. Update State
Update obsidian/workflow/evolution-state.yaml:
- •last_updated timestamp
- •session_count increment
- •last_runs for any maintenance tasks executed
- •failed_tasks counts
- •recent_tasks history
- •content_stats (recount files)
- •progress metrics
7. Calculate Convergence
Compute convergence score (0-100%):
- •50% weight: Content breadth (topics, concepts, arguments written)
- •30% weight: Quality (critical/medium issues from reviews)
- •20% weight: Completeness (no placeholders)
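The weighting can be sketched as a simple weighted sum. How each component is measured is not specified here, so this sketch assumes each one has already been normalized to a fraction between 0 and 1; the function and key names are hypothetical.

```python
WEIGHTS = {"breadth": 0.5, "quality": 0.3, "completeness": 0.2}


def convergence_score(breadth, quality, completeness):
    """Return a 0-100 percentage from three component fractions in [0, 1]."""
    parts = {"breadth": breadth, "quality": quality, "completeness": completeness}
    for name, value in parts.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Weighted sum of the components, scaled to a whole-number percentage.
    return round(100 * sum(WEIGHTS[name] * value for name, value in parts.items()))


example = convergence_score(breadth=0.2, quality=0.5, completeness=0.5)
fully_converged = convergence_score(1.0, 1.0, 1.0)
```

For the example inputs the three terms contribute 10, 15, and 10 points, giving 35%.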
8. Generate Report
Output session summary:
## Evolution Session Complete

**Mode**: default
**Tasks executed**: 1
**Duration**: ~15 minutes

### Tasks
1. ✓ Write article on hard problem - Score: 300

### Metrics
- Convergence: 28% → 32%
- Content: 11 files (+1)
- Quality: 0 critical, 2 medium issues

### Next Session
Recommended tasks:
1. P1: Address circularity concern (Score: 300)
2. validate-all (due tomorrow)
9. Add Highlight (if warranted)
If the session produced highlight-worthy work, add an item to the highlights page:
Highlight-worthy:
- •New article written → highlight with link to article
- •Significant insight from review → highlight the finding
- •Research completed → highlight key discovery
- •Major refinement → highlight what improved
Not highlight-worthy (skip):
- •Routine maintenance (validate-all, check-links, check-tenets)
- •Failed tasks
- •Minor refinements
Use the CLI with the --tweet flag to also post to Twitter:
uv run python scripts/highlights.py add "Title" "Description (max 280 chars)" --type new-article --link "[[article]]" --tweet
Note: If Twitter credentials aren't configured, the highlight is still added without tweeting. The operation does not fail.
The manager enforces a maximum of 1 highlight per day; if one has already been added today, it silently skips the new one.
10. Sync and Validate Site
Before the final commit, sync content to Hugo and verify no broken links:
# Sync Obsidian → Hugo
uv run python scripts/sync.py

# Build site (validates templates and content)
cd hugo && hugo --gc --minify

# Start server and check links
hugo server &
sleep 3
python .claude/skills/check-links/scripts/check_links.py
If broken links are found:
- •Log them to the session report
- •For each broken link, check if:
  - •The target file exists but has draft: true → Fix by setting draft: false
  - •The wikilink is malformed → Fix the source file
  - •The target doesn't exist → Note as issue in report
- •Re-sync after fixes
- •If links still broken after reasonable attempts, note in session report
This step catches issues like:
- •Articles marked as drafts that are linked from published content
- •Wikilink conversion errors
- •Missing pages
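The triage of a single broken link, as described in the checklist above, could be expressed as a small decision function. This is only a sketch: the function name, its boolean inputs, and the returned action strings are all hypothetical, and detecting each condition is left to the caller.

```python
def classify_broken_link(target_exists, target_is_draft, wikilink_well_formed):
    """Map one broken link to the remedy suggested in the checklist above."""
    if target_exists and target_is_draft:
        return "set draft: false on the target"
    if not wikilink_well_formed:
        return "fix the wikilink in the source file"
    if not target_exists:
        return "note missing target in report"
    # None of the listed causes applies; surface it for a human to inspect.
    return "note unresolved link in report"


draft_case = classify_broken_link(True, True, True)
malformed_case = classify_broken_link(True, False, False)
missing_case = classify_broken_link(False, False, True)
```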
11. Final Session Commit
After all tasks complete and site validation passes, commit session-level updates (state file, changelog, todo updates):
chore(auto): Evolution session [N] complete

Tasks executed: [count]
Convergence: [old]% → [new]%
Use AI authorship for the commit.
Note: Individual task outputs are committed in step 5 (see "Commit after each task"). This final commit captures only the session bookkeeping (state, changelog, todo). This separation allows easy reversion of specific tasks without affecting session metadata.
Handling Failures
When a task fails:
- •Increment retry count in failed_tasks
- •Log the failure reason
- •If retry count reaches 3:
  - •Move task to "## Blocked Tasks (Needs Human)" section in todo.md
  - •Add explanation of failure pattern
  - •Do not retry until human intervenes
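A minimal sketch of the failure bookkeeping, assuming `failed_tasks` is a mapping from task title to a small record with a count and a list of reasons. That layout is an assumption for illustration; the real state file may store this differently.

```python
MAX_FAILURES = 3  # retry limit stated above


def record_failure(state, task_title, reason):
    """Increment the retry count; return True once the task should be blocked."""
    failed = state.setdefault("failed_tasks", {})
    entry = failed.setdefault(task_title, {"count": 0, "reasons": []})
    entry["count"] += 1
    # Keep every reason so the failure pattern can be explained later.
    entry["reasons"].append(reason)
    return entry["count"] >= MAX_FAILURES


state = {}
results = [
    record_failure(state, "Write article on hard problem", f"attempt {n} failed")
    for n in range(1, 4)
]
```

Only the third call reports that the task should move to the blocked section; the stored reasons can then feed the "Error pattern" field.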
Completed Task Format
When marking a task as completed:
- •Move the task from the ## Active Tasks section to the top of the ## Completed Tasks section
- •Reformat with completion details:
### ✓ YYYY-MM-DD: [Task title from original]
- **Type**: [original type]
- **Notes**: [ORIGINAL NOTES VERBATIM - do not modify or summarise]
- **Result**: [What was actually done]
- **Output**: [Files created/modified]
- •Trim old completions: If the Completed Tasks section exceeds 50 entries, remove the oldest entries (at the bottom) to keep only 50.
Important: The original Notes field must be preserved exactly as written. This maintains the human's original intent and reasoning. Add execution details in the separate Result and Output fields.
Critical: Tasks must be MOVED to the Completed section, not just modified in place. Leaving completed tasks in Active clutters the queue and makes it hard to see pending work.
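The completed-entry format and the 50-entry trim can be sketched as below. The helper names are hypothetical, and the sketch treats the Completed section as a newest-first list of already formatted entries.

```python
MAX_COMPLETED = 50  # limit stated above


def format_completed(date, title, task_type, notes, result, output):
    """Render one completed-task entry; `notes` is passed through unchanged."""
    return "\n".join([
        f"### ✓ {date}: {title}",
        f"- **Type**: {task_type}",
        f"- **Notes**: {notes}",
        f"- **Result**: {result}",
        f"- **Output**: {output}",
    ])


def add_completed(entries, new_entry):
    """Insert at the top (newest first) and drop the oldest beyond the limit."""
    return ([new_entry] + entries)[:MAX_COMPLETED]


entry = format_completed(
    "2026-01-08", "Write article on hard problem", "expand-topic",
    "Original notes, verbatim", "Article written", "topics/hard-problem-of-consciousness.md",
)
existing = [f"old-{i}" for i in range(50)]
updated = add_completed(existing, entry)
```

After the insert the list still holds 50 entries, with the new one first and the previously oldest entry dropped from the bottom.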
Task Chain Recording
When certain task types complete successfully, record chain information for /replenish-queue:
After research-topic completes:
Add to task_chains.pending_articles in evolution-state.yaml:
pending_articles:
  - "research/panpsychism-consciousness-2026-01-08.md"
This signals that research exists without a corresponding article.
After expand-topic completes:
- •Remove from pending_articles if this article was based on research
- •Add to pending_cross_reviews with related articles:
pending_cross_reviews:
  - new_article: "concepts/panpsychism.md"
    review_targets:
      - "topics/hard-problem-of-consciousness.md"
      - "concepts/qualia.md"
    generated: "2026-01-08T10:00:00+00:00"
Finding review targets: Look at the new article's:
- •related_articles frontmatter
- •Wikilinks in content
- •Topics/concepts that share themes
Maximum 2 review targets per article (configurable in state).
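Building a `pending_cross_reviews` entry with the two-target cap might look like this. The function name is hypothetical, and the sketch assumes the caller has already gathered candidate paths in preference order (frontmatter first, then wikilinks, then thematic matches).

```python
def build_cross_review_chain(new_article, candidates, generated, max_targets=2):
    """Create one pending_cross_reviews entry, capped at `max_targets` targets."""
    targets = []
    for path in candidates:
        # Skip self-references and duplicates while preserving order.
        if path != new_article and path not in targets:
            targets.append(path)
    return {
        "new_article": new_article,
        "review_targets": targets[:max_targets],
        "generated": generated,
    }


chain = build_cross_review_chain(
    new_article="concepts/panpsychism.md",
    candidates=[
        "topics/hard-problem-of-consciousness.md",
        "concepts/panpsychism.md",
        "topics/hard-problem-of-consciousness.md",
        "concepts/qualia.md",
        "concepts/emergence.md",  # hypothetical extra candidate, dropped by the cap
    ],
    generated="2026-01-08T10:00:00+00:00",
)
```

The resulting entry matches the YAML shape shown above, with the hard-problem and qualia pages as the two review targets.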
After cross-review completes:
Remove the completed review from pending_cross_reviews:
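# Iterate over a copy so finished chains can be removed safely
for chain in list(pending_cross_reviews):
    if chain['new_article'] == task.chain_parent:
        # Drop the reviewed target if it is still listed
        if task.target_file in chain['review_targets']:
            chain['review_targets'].remove(task.target_file)
        # Remove the chain once every target has been reviewed
        if not chain['review_targets']:
            pending_cross_reviews.remove(chain)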
Blocked Tasks Section
When a task is blocked, add to todo.md:
## Blocked Tasks (Needs Human)

### P1: [Task title]
- **Failures**: 3
- **Last attempt**: 2026-01-06
- **Notes**: [ORIGINAL NOTES VERBATIM]
- **Error pattern**: [description of what went wrong]
- **Suggested action**: [how human might fix]
State File Location
obsidian/workflow/evolution-state.yaml
Dependencies
This skill uses:
- •tools/evolution/state.py - State loading/saving
- •tools/evolution/scoring.py - Task scoring
- •tools/evolution/staleness.py - Staleness detection
- •tools/todo/processor.py - Todo parsing
Cross-Review Task Handling
When executing a cross-review task:
- •Read the chain parent (the new article that triggered this review)
- •Read the target article (the existing article to review)
- •Analyze for:
  - •Places to add [[new-article]] wikilinks
  - •Arguments that the new content supports or challenges
  - •Terminology consistency
  - •Missing cross-references
- •Make edits if improvements found
- •Log findings even if no edits made (document the review)
Example cross-review task:
### P2: Review hard-problem-of-consciousness.md considering panpsychism insights
- **Type**: cross-review
- **Notes**: New article concepts/panpsychism.md may provide insights relevant to topics/hard-problem-of-consciousness.md.
- **Source**: chain (from panpsychism.md)
Execution:
- •Read concepts/panpsychism.md to understand what's new
- •Read topics/hard-problem-of-consciousness.md looking for:
  - •References to panpsychism that could now link to the concept page
  - •Arguments that panpsychism analysis strengthens or complicates
  - •Missing context that panpsychism article provides
- •Edit hard-problem article if improvements found
- •Mark task complete with summary of changes (or "no changes needed")
Queue Replenishment
The queue is automatically replenished when:
- •Active tasks (P0-P2) drop below 3
- •needs_replenishment: true in state file
Replenishment uses /replenish-queue which generates tasks from:
- •Task chains: Recent completions that should generate follow-ups
- •Unconsumed research: Research notes without corresponding articles
- •Gap analysis: Content areas needing expansion
- •Staleness: Content not reviewed recently
See /replenish-queue skill for full details.
Notes
- •P3 tasks are never auto-selected; human must promote to P2+
- •Synthetic maintenance tasks compete fairly via scoring
- •Each task is committed separately for easy reversion of individual tasks
- •The skill is designed for manual triggering (2-3x per week)
- •Queue replenishment happens automatically when needed (step 0)