Librarian - Documentation Prose Architect
Analyze the prose content of documentation - what it says, how it explains concepts, where ideas are scattered or duplicated - to reorganize by semantic meaning, not just file structure.
Core Philosophy
The librarian examines PROSE, not just code.
The librarian reads documentation like a human reader would, understanding:
- •What each file explains - The concepts, workflows, and ideas described in prose
- •How explanations relate - Where the same idea is explained differently in multiple places
- •Where prose contradicts - Conflicting explanations, not just conflicting numbers
- •What's scattered vs. cohesive - Ideas that belong together split across files
- •What's mixed vs. focused - Unrelated explanations crammed into one file
This is NOT about:
- •Fixing broken links (that's a side effect)
- •Checking operator consistency (that's mechanical)
- •Enforcing line limits (that's a constraint)
- •Validating code blocks (that's syntax)
This IS about:
- •Understanding what each document is trying to explain
- •Finding where explanations are incomplete, scattered, or contradictory
- •Reorganizing prose so concepts flow logically
- •Consolidating duplicate explanations into single authoritative sources
What "Examining Prose" Means
When you read a documentation file, ask:
PROSE ANALYSIS QUESTIONS:
├── UNDERSTANDING:
│ ├── What concept is this file trying to explain?
│ ├── Who is the intended reader?
│ ├── What should the reader know after reading this?
│ └── What does this file ASSUME the reader already knows?
│
├── QUALITY:
│ ├── Is the explanation complete?
│ ├── Is the explanation clear?
│ ├── Are there gaps in the logic?
│ └── Would a reader be confused by anything?
│
├── RELATIONSHIPS:
│ ├── What other concepts does this explanation reference?
│ ├── Are those concepts explained here or elsewhere?
│ ├── If elsewhere, is the reference clear?
│ └── Does this explanation contradict any other?
│
└── ORGANIZATION:
├── Does this explanation belong in this file?
├── Is this file trying to explain too many things?
├── Is this explanation scattered across multiple files?
└── Would reorganizing improve understanding?
Prose Issues vs Mechanical Issues
| Prose Issues (Focus Here) | Mechanical Issues (Secondary) |
|---|---|
| Same concept explained 3 different ways | Broken markdown link |
| Conflicting instructions for same scenario | Missing file reference |
| Workflow explanation split across 5 files | Line count over limit |
| Tutorial mixed with reference material | Inconsistent heading levels |
| Incomplete explanation assumes too much | Duplicate signal format |
| Confusing explanation with unclear logic | Wrong anchor tag |
Always prioritize prose issues. Mechanical issues often fix themselves when prose is reorganized correctly.
Parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
path | No | .claude/docs | Directory to analyze |
line_limit | No | 500 | Maximum lines per file (constraint) |
mode | No | refactor | refactor, audit, or consistency |
Invocation
/librarian # Full refactor of .claude/docs /librarian mode=audit # Report only, no changes /librarian mode=consistency # Check for contradictions only /librarian path=docs line_limit=400 # Custom path and stricter limit
Phase 1: Semantic Analysis (READ THE PROSE)
1.1 Build Concept Map
Read ALL documentation files as prose and extract:
CONCEPT EXTRACTION:
├── For each file, READ IT and identify:
│ ├── Primary concept (what is this file fundamentally ABOUT?)
│ │ → Not "what's in the filename" but "what does it explain"
│ │
│ ├── Key explanations (what does it teach the reader?)
│ │ → Summarize each major section's purpose
│ │
│ ├── Audience assumptions (what does it assume you know?)
│ │ → What would confuse a new reader?
│ │
│ ├── Terminology defined (what terms does it establish?)
│ │ → Look for definitions, not just usage
│ │
│ └── Dependencies (what must you read first?)
│ → What concepts are referenced but not explained?
│
├── Build concept index:
│ ├── concept → [files that EXPLAIN it in prose]
│ ├── concept → [files that MENTION it without explaining]
│ └── concept → [files that ASSUME reader knows it]
│
└── Identify concept clusters:
├── Concepts that form a coherent topic (should be together)
├── Concepts that build on each other (should be ordered)
└── Concepts that conflict (should be reconciled)
1.2 Map Explanation Patterns
Understand HOW concepts are explained:
EXPLANATION PATTERN ANALYSIS:
├── Explanation styles found:
│ ├── Tutorial (step-by-step, "do this then this")
│ ├── Reference (lookup, "X means Y")
│ ├── Conceptual (understanding, "why X matters")
│ └── Example-driven (showing, "here's X in action")
│
├── Mixed styles in same file (potential split):
│ ├── File tries to be both tutorial AND reference
│ ├── File mixes conceptual overview with implementation detail
│ └── File combines quick-start with exhaustive reference
│
├── Incomplete explanations:
│ ├── Concept mentioned but never fully explained
│ ├── Workflow described but steps are vague
│ ├── Term used but never defined
│ └── "See X" but X doesn't exist
│
└── Redundant explanations:
├── Same concept explained in multiple files
├── Same workflow described with different wording
├── Same template shown in multiple places
└── Same rules stated in different contexts
1.3 Detect Prose Fragmentation
Find explanations that are SCATTERED when they should be TOGETHER:
FRAGMENTATION DETECTION:
├── Same concept explained partially in multiple files:
│ Example: "Expert delegation" explained 30% in expert-delegation.md,
│ 20% in escalation-specification.md, 50% in agent-conduct.md
│ → Consolidate into single authoritative explanation
│
├── Workflow explanation split across files:
│ Example: Steps 1-3 in file A, steps 4-6 in file B, step 7 missing
│ → Reunify or clearly link the sequence
│
├── Definition separated from context:
│ Example: Signal format in signals.md, usage rules in workflows.md
│ → Either consolidate or create clear cross-references
│
├── Examples separated from concepts:
│ Example: Concept explained in chapter.md, examples in examples.md
│ → Consider reunifying unless examples are extensive
│
└── Prerequisites scattered:
Example: "You need X" in file A, "X works like..." in file C
→ Reader can't follow without hunting through multiple files
1.4 Detect Prose Mixing
Find explanations that are TOGETHER when they should be SEPARATE:
MIXING DETECTION:
├── Multiple unrelated concepts in one file:
│ Example: File covers state management AND signal formats AND recovery
│ → These serve different readers at different times - separate them
│
├── Mixed audiences:
│ Example: Developer how-to mixed with orchestrator implementation details
│ → Developers don't need orchestrator details - separate them
│
├── Mixed depths:
│ Example: Quick-start overview mixed with exhaustive reference
│ → Reader wants one or the other - separate them
│
├── Mixed purposes:
│ Example: "What is X" (conceptual) mixed with "How to X" (tutorial)
│ → These serve different needs - consider separating
│
└── Mixed time contexts:
Example: Setup (do once) mixed with runtime (do repeatedly)
→ Reader needs these at different times - separate them
Phase 2: Consistency Analysis (CHECK THE PROSE)
2.1 Semantic Contradictions
Check for CONFLICTING EXPLANATIONS (not just conflicting numbers):
CONTRADICTION DETECTION:
├── Same concept explained differently:
│ Example: File A says "agents can delegate to experts"
│ File B says "agents must solve problems themselves"
│ → CRITICAL: Resolve which explanation is correct
│
├── Conflicting instructions:
│ Example: File A says "always signal before exiting"
│ File B says "exit immediately on failure"
│ → CRITICAL: Clarify when each applies
│
├── Conflicting definitions:
│ Example: File A defines "task" as single work item
│ File B uses "task" to mean entire workflow
│ → CRITICAL: Standardize terminology
│
├── Conflicting assumptions:
│ Example: File A assumes reader knows signals
│ File B assumes reader is learning signals
│ → These files serve different audiences - clarify or separate
│
└── Implicit contradictions:
Example: File A implies X is always true
File B describes a case where X is false
→ Make the exception explicit
2.2 Explanation Completeness
Check for GAPS in explanations:
GAP DETECTION:
├── Concepts referenced but never explained:
│ Example: "Use the signal format" - but what IS the signal format?
│ → Add explanation or clear reference
│
├── Workflows with missing steps:
│ Example: "1. Do X, 2. Do Y, 4. Do Z" - what's step 3?
│ → Complete the workflow
│
├── Assumptions stated but not justified:
│ Example: "Always use method A" - why? When is method B appropriate?
│ → Explain the reasoning
│
├── Edge cases ignored:
│ Example: "Process the input" - what if input is invalid?
│ → Address edge cases or state assumptions
│
└── Reader questions unanswered:
Example: After reading, would a reader still wonder "but what about...?"
→ Anticipate and answer those questions
2.3 Referential Consistency
Check that references match reality:
REFERENTIAL CHECKS:
├── Link claims match content:
│ Example: Link says "see signal formats" but target explains workflows
│ → Fix the link or the description
│
├── Cross-references are bidirectional:
│ Example: A links to B, but B doesn't acknowledge A
│ → Add backlink if relationship is important
│
├── Hierarchies are consistent:
│ Example: Index claims file is about X, file is actually about Y
│ → Update index or clarify file's purpose
│
└── Navigation matches content organization:
Example: Navigation implies reading order A→B→C but C should come first
→ Fix navigation to match logical flow
Phase 3: Reorganization Planning (PLAN PROSE CHANGES)
3.1 Consolidation Planning
Plan where to MERGE scattered explanations:
CONSOLIDATION PLANNING:
├── For each fragmented concept:
│ ├── Which file should be the AUTHORITATIVE source?
│ │ → Usually the most complete/accurate version
│ │
│ ├── What content needs to MOVE there?
│ │ → List specific prose sections, not just "merge files"
│ │
│ ├── What should REMAIN in source files?
│ │ → Brief mention + reference, or nothing?
│ │
│ └── How will readers find it?
│ → Update navigation, add redirects
│
└── Consolidation output format:
CONSOLIDATE: "Expert Delegation"
FROM: expert-delegation.md (lines 50-120), escalation-spec.md (lines 30-45)
TO: expert-delegation.md
REASON: Concept explained twice with slight variations
ACTION: Merge prose, pick clearer wording, reference from other file
3.2 Separation Planning
Plan where to SPLIT mixed content:
SEPARATION PLANNING:
├── For each mixed file:
│ ├── What DISTINCT explanations does it contain?
│ │ → List them as separate topics
│ │
│ ├── Can they stand alone?
│ │ → Each new file should make sense independently
│ │
│ ├── How should they be named?
│ │ → Names should reflect the explanation, not arbitrary sections
│ │
│ └── How will they link to each other?
│ → Cross-references for related concepts
│
└── Separation output format:
SEPARATE: "state-management.md"
INTO:
- state/fields.md: "What state fields exist" (lines 10-200)
- state/updates.md: "When state changes" (lines 201-400)
- state/recovery.md: "How to recover state" (lines 401-500)
REASON: Three distinct topics serving different needs
ACTION: Create focused files, add index, cross-reference
3.3 Movement Planning
Plan where to RELOCATE misplaced content:
MOVEMENT PLANNING:
├── Content in wrong file:
│ Example: Signal parsing details in workflow-overview.md
│ → Move to signals.md where it belongs conceptually
│
├── Content at wrong level:
│ Example: Implementation details in quick-start.md
│ → Move to detailed reference
│
├── Content for wrong audience:
│ Example: Orchestrator internals in developer-guide.md
│ → Move to orchestrator documentation
│
└── Movement output format:
MOVE: "Signal parsing implementation"
FROM: task-dispatch.md (lines 200-350)
TO: signals/parsing.md
REASON: Parsing details don't belong in dispatch workflow
ACTION: Move prose, leave brief reference in source
3.4 Line Limit Compliance
AFTER semantic reorganization, check line limits:
LINE LIMIT ANALYSIS (Secondary to semantic concerns): ├── Files over limit AFTER semantic fixes: │ ├── Is the content genuinely cohesive? → Exception may be justified │ ├── Can it be split semantically? → Find natural boundaries in prose │ └── Would splitting harm understanding? → Keep together if so │ ├── Line limits are a CONSTRAINT, not a GOAL: │ ├── Never split mid-explanation to meet a limit │ ├── Never refuse consolidation because result is large │ ├── Semantic organization > line counts │ └── Report files over limit with recommendations
Phase 4: Refactoring Execution
4.1 Execution Order
Execute changes prioritizing semantic correctness:
EXECUTION SEQUENCE: 1. BUILD CHANGE MANIFEST (before any edits) - List all files to be modified - Track all paths/anchors that will change - This enables reference integrity checking later 2. RESOLVE CONTRADICTIONS (highest priority) - Determine correct explanation - Update conflicting files to align - Document the resolution 3. CONSOLIDATE SCATTERED EXPLANATIONS - Merge prose into authoritative source - Replace duplicates with references - Preserve all unique information 4. SEPARATE MIXED CONTENT - Create focused files for distinct topics - Move prose to appropriate locations - Add navigation and cross-references 5. MOVE MISPLACED CONTENT - Relocate prose to correct conceptual homes - Update source to reference new location - Fix navigation paths 6. ADDRESS SIZE (only if needed after above) - Semantic splits only - Never split mid-explanation - Create navigation for split content 7. REFERENCE INTEGRITY (MANDATORY - See Phase 5) - Scan ENTIRE project for references to changed files - Update ALL broken references - Verify zero orphaned references remain - THIS STEP IS NOT OPTIONAL 8. VERIFY CONSISTENCY - Re-check for contradictions - Confirm explanations are complete - Validate all links work
4.2 Prose Preservation Rules
When moving/merging explanations:
PRESERVATION RULES: ├── NEVER lose explanations - all prose must survive reorganization ├── NEVER change meaning - preserve explanatory intent ├── PREFER clearer wording when merging duplicate explanations ├── PRESERVE examples that illustrate concepts ├── ADD context when moving content to new location ├── UPDATE references so readers can find moved content └── DOCUMENT changes - what moved where and why
4.3 Navigation Maintenance
Ensure readers can find reorganized content:
NAVIGATION RULES: ├── Every explanation reachable within 1 level from entry points ├── Index files summarize what each file EXPLAINS ├── Related explanations linked bidirectionally ├── Clear reading paths for different audiences ├── No dead ends (every file leads somewhere) └── Moved content findable from old locations (redirects or references)
Phase 5: Reference Integrity (MANDATORY)
This phase is NOT optional. Every librarian run that modifies files MUST complete this phase before finishing.
5.1 Build Change Manifest
Before any edits, track what will change:
CHANGE MANIFEST (build before editing):
├── Files to be modified:
│ ├── {file_path}: {what changes}
│ └── ...
│
├── Files to be moved/renamed:
│ ├── {old_path} → {new_path}
│ └── ...
│
├── Anchors to be changed:
│ ├── {file}#{old_anchor} → {file}#{new_anchor}
│ └── ...
│
└── Content relocated:
├── "{concept}" from {source} to {target}
└── ...
5.2 Execute Reference Scan
After ALL edits are complete, scan the ENTIRE project for references:
# Scan all markdown files in project for references to modified files # For each file in CHANGE MANIFEST: grep -r "old_filename\.md" .claude/ --include="*.md" grep -r "old_anchor" .claude/ --include="*.md"
Scan locations (MANDATORY - do not skip any):
- •
.claude/docs/- All documentation - •
.claude/agents/- Generated agent files - •
.claude/skills/- Skill definitions - •
.claude/commands/- Command definitions - •
.claude/prompts/- Prompt templates - •
README.md,CLAUDE.md- Root documentation - •Any other
.mdfiles in the project
5.3 Update All References
For each broken reference found:
REFERENCE UPDATE PROCEDURE: 1. Read the file containing the broken reference 2. Identify the reference context (what is it linking to?) 3. Update to new location/anchor 4. Verify the new reference is valid 5. Save the file
Reference patterns to check:
- •
[text](path/to/file.md)- Markdown links - •
[text](path/to/file.md#anchor)- Anchored links - •
See [document](path)- Inline references - •
**Reference**: [Name](path)- Formal references - •
path/to/file.md- Plain path mentions
5.4 Verify Reference Integrity
After updating all references, VERIFY nothing is broken:
VERIFICATION PROCEDURE:
1. RE-SCAN for old paths/anchors
- grep -r "{old_path}" .claude/ --include="*.md"
- Result MUST be empty (zero matches)
2. VALIDATE new references exist
- For each updated reference, confirm target file exists
- For anchored links, confirm anchor exists in target
3. CHECK for orphaned files
- Files that were sources of moved content
- Ensure they either redirect or are deleted
4. REPORT results
- List all references updated
- Confirm zero broken references remain
5.5 Reference Integrity Output
Include in final report:
REFERENCE INTEGRITY CHECK
=========================
Change Manifest:
- {N} files modified
- {N} files moved/renamed
- {N} anchors changed
References Updated: {count}
- {file}: Updated {N} references to {target}
- ...
Verification:
✓ Zero references to old paths remain
✓ All new references validated
✓ No orphaned files
Scanned Locations:
✓ .claude/docs/ ({N} files)
✓ .claude/agents/ ({N} files)
✓ .claude/skills/ ({N} files)
✓ .claude/commands/ ({N} files)
✓ .claude/prompts/ ({N} files)
If ANY broken references remain, the librarian run is INCOMPLETE.
Output Formats
Audit Mode Output
LIBRARIAN AUDIT REPORT
======================
Directory: {path}
Files Analyzed: {count}
PROSE CONTRADICTIONS (must resolve):
------------------------------------
CRITICAL: {concept} explained inconsistently
- {file_a}: "{explanation_a}" (lines {range})
- {file_b}: "{explanation_b}" (lines {range})
Resolution needed: {what needs to be decided}
FRAGMENTED EXPLANATIONS (should consolidate):
---------------------------------------------
Concept "{concept}" explained in pieces:
- {file1}: Explains {aspect1} (lines {range})
- {file2}: Explains {aspect2} (lines {range})
- {file3}: Explains {aspect3} (lines {range})
Recommendation: Consolidate to {target}, reference from others
MIXED CONTENT (should separate):
--------------------------------
File "{file}" mixes unrelated topics:
- {topic_a}: {description} (lines {range})
- {topic_b}: {description} (lines {range})
Recommendation: Separate into {file_a}, {file_b}
INCOMPLETE EXPLANATIONS:
------------------------
- {file}: {concept} mentioned but not explained
- {file}: {workflow} missing steps {which}
- {file}: {term} used but never defined
SIZE ISSUES (after semantic fixes):
-----------------------------------
| File | Lines | Semantic Split Possible? |
|------|-------|--------------------------|
| {file} | {lines} | {yes/no - why} |
Run with mode=refactor to apply changes.
Refactor Mode Output
LIBRARIAN REFACTOR COMPLETE
===========================
CHANGE MANIFEST:
----------------
Files modified: {count}
Files moved/renamed: {count}
Anchors changed: {count}
CONTRADICTIONS RESOLVED: {count}
--------------------------------
1. {concept}: Chose {file}'s explanation as authoritative
Updated: {other_files}
Reason: {why this explanation is correct}
EXPLANATIONS CONSOLIDATED: {count}
----------------------------------
1. "{concept}" unified from {count} sources → {target}
Prose merged: {line_count} lines
Sources now reference: {target}
CONTENT SEPARATED: {count}
--------------------------
1. {source} split into:
- {file1}: Explains {topic1}
- {file2}: Explains {topic2}
CONTENT MOVED: {count}
----------------------
1. {topic} moved from {source} to {target}
Reason: {why it belongs there}
REFERENCE INTEGRITY (MANDATORY):
--------------------------------
Locations scanned:
✓ .claude/docs/ ({N} files)
✓ .claude/agents/ ({N} files)
✓ .claude/skills/ ({N} files)
✓ .claude/commands/ ({N} files)
✓ .claude/prompts/ ({N} files)
References updated: {count}
- {file}: {old_ref} → {new_ref}
- ...
Verification:
✓ Re-scan found ZERO broken references
✓ All new references validated
PROSE VERIFICATION:
-------------------
✓ No contradicting explanations remain
✓ Each concept has single authoritative source
✓ All explanations complete and findable
✓ Navigation paths preserved
Concept Affinity Rules
When deciding what explanations belong together:
High Affinity (same file)
- •Concept definition and its examples
- •Workflow overview and its steps
- •Rule and its exceptions
- •Format specification and its usage
Medium Affinity (same directory, linked)
- •Overview and detailed reference
- •Tutorial and troubleshooting
- •Concept and related concepts
Low Affinity (separate, cross-linked)
- •Different roles' documentation
- •Different phases of a process
- •Conceptual vs implementation
Quality Checks
Before Refactoring
- • All contradicting explanations identified
- • Consolidation preserves all unique information
- • Separations create coherent standalone topics
- • Movements improve conceptual organization
- • Change manifest built (list of all files/paths/anchors to change)
After Refactoring
- • No contradictions remain
- • Each concept explained in ONE place authoritatively
- • Related concepts linked
- • Readers can find what they need
- • All prose preserved
Reference Integrity (MANDATORY)
- • All project locations scanned for references to changed files
- • All broken references updated to new locations
- • Zero orphaned references confirmed via re-scan
- • Verification report included in output
The librarian run is NOT complete until reference integrity is verified.
Error Handling
| Issue | Action |
|---|---|
| Contradictions can't be resolved from prose alone | Flag for human decision |
| Explanation doesn't fit anywhere cleanly | Create new topic category |
| Consolidation would create huge file | Accept if cohesive, flag if not |
| Separation would fragment explanation | Keep together |
Prerequisites
| Requirement | Purpose |
|---|---|
Read access to path | Read and understand documentation prose |
Write access to path | Reorganize files (refactor mode) |
| Task tool | Parallel analysis agents to read files |
| Understanding | Actually READ the prose, don't just scan |