Outline Builder
Convert a taxonomy into a checkable, mappable outline (bullets only).
Bullets should describe what the section must cover, not draft prose.
Role cards (prompt-level guidance)
Use these roles explicitly while drafting the outline. They guide decisions, not phrasing; avoid producing copyable prose sentences.
- •Outline Architect
- •Mission: design a paper-like ToC (few, thick chapters) that a reader would expect.
- •Do: budget H2/H3 counts; ensure each H3 is writeable (has a real comparison lens + evaluation angle).
- •Avoid: H3 explosion (many tiny buckets) and generic axis lists repeated everywhere.
- •Writer Proxy
- •Mission: simulate the downstream writer and ask: “Could I draft this H3 without guessing?”
- •Do: make each H3’s bullets encode tension + contrasts + evaluation anchors + failure modes.
- •Avoid: bullets that sound like narration (“This subsection…”) or slide transitions (“Next, we…”).
- •Scope Guardian
- •Mission: prevent silent scope drift.
- •Do: make in/out scope cues explicit in bullets (especially for boundaries like single-agent vs multi-agent, tool use vs RAG, etc.).
- •Avoid: leaving scope implicit and hoping the writer fixes it in prose.
When to use
- •You have a taxonomy and need an outline for mapping papers and building evidence.
- •You want each subsection to have concrete “coverage requirements” (axes, comparisons, evaluation).
When not to use
- •You already have an approved outline (don’t rewrite for style).
Input
- •outline/taxonomy.yml
- •Optional style references (paper-like section sizing):
- •ref/agent-surveys/STYLE_REPORT.md
- •ref/agent-surveys/text/
Output
- •outline/outline.yml
Workflow (heuristic)
Uses: outline/taxonomy.yml.
Optional style calibration (recommended for paper-like structure):
- •Read ref/agent-surveys/STYLE_REPORT.md to sanity-check top-level section counts and typical subsection sizing.
- •Skim 1–2 examples under ref/agent-surveys/text/ to imitate structure (not wording).
- •Target final ToC: ~6–8 H2 sections.
- •Note: this pipeline appends Discussion + Conclusion as global sections in C5 merge, so keep the outline itself <=6 H2 sections (often 5–6 including Intro+Related).
- •Translate taxonomy nodes into section headings that read like a survey structure.
- •For each H3 subsection, write bullets using the Stage A contract (verifiable, no prose paragraphs).
- •Minimum required bullets (first 4):
- •Intent: what the reader should learn (subsection-specific).
- •RQ: the question this subsection answers (1 line).
- •Evidence needs: what kinds of evidence must appear later (benchmarks/metrics/protocols/failure modes).
- •Expected cites: expected cite density / cite types (avoid placeholders like TBD/TODO).
- •Then add 2–6 subsection-specific bullets (comparisons/axes/eval anchors/failure modes).
- •For each subsection, ensure bullets are:
- •topic-specific (names of mechanisms, tasks, benchmarks, failure modes)
- •checkable (someone can verify whether the subsection covered it)
- •useful for mapping (papers can be assigned to each bullet/axis)
- •Prefer bullets that force synthesis later:
- •“Compare X vs Y along axes A/B/C”
- •“What evaluation setups are standard, and what they miss”
- •“Where methods fail (latency, tool errors, jailbreaks, reward hacking…)”
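As a rough illustration of the Stage A contract described in the workflow, the sketch below splits one subsection's bullets into the four required fields and the subsection-specific remainder. The four prefixes come from this document; the function name split_stage_a, the in-memory list-of-strings representation, and the sample bullets (an invented H3 about tool-using agents) are assumptions made for illustration, not the pipeline's actual schema.

```python
# Hypothetical sketch: only the four prefixes are taken from this document.
STAGE_A_PREFIXES = ("Intent:", "RQ:", "Evidence needs:", "Expected cites:")


def split_stage_a(bullets):
    """Split a subsection's bullets into Stage A fields and the remaining bullets.

    Raises ValueError if the first four bullets do not carry the Stage A
    prefixes in the expected order.
    """
    head = bullets[: len(STAGE_A_PREFIXES)]
    if len(head) < len(STAGE_A_PREFIXES):
        raise ValueError("fewer than four bullets")
    fields = {}
    for prefix, bullet in zip(STAGE_A_PREFIXES, head):
        if not bullet.startswith(prefix):
            raise ValueError(f"expected a bullet starting with {prefix!r}, got {bullet!r}")
        fields[prefix.rstrip(":")] = bullet[len(prefix):].strip()
    return fields, bullets[len(STAGE_A_PREFIXES):]


# Invented example bullets; the topic and wording are illustrative only.
example_bullets = [
    "Intent: how tool-calling agents trade autonomy against tool-error recovery",
    "RQ: which tool-selection strategies stay reliable when tools fail or time out?",
    "Evidence needs: benchmarks with tool failures, latency metrics, recovery protocols",
    "Expected cites: dense method citations plus a few benchmark papers",
    "Compare planner-first vs reactive tool selection along latency / recovery / cost",
    "What evaluation setups are standard, and what they miss (flaky tools)",
    "Where methods fail: tool errors, latency blow-ups, reward hacking",
]

fields, extras = split_stage_a(example_bullets)
```

Here fields holds the four Stage A entries keyed by name and extras holds the comparison, evaluation, and failure-mode bullets that a writer would later map papers onto.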
Quality checklist
- • outline/outline.yml exists and is bullets-only (no paragraphs).
- • Every subsection has the Stage A bullets: Intent: / RQ: / Evidence needs: / Expected cites:.
- • Every subsection has ≥3 additional non-generic bullets after the Stage A fields.
- • Bullets are not copy-pasted templates across subsections.
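The last two checklist items can be approximated mechanically. The following is a minimal sketch, assuming the outline has already been loaded into a plain mapping of subsection title to bullet strings; that shape, the function name checklist_problems, and the whitespace/case normalization used to spot copy-pasted bullets are all assumptions, and the real quality gate may work differently.

```python
from collections import defaultdict

STAGE_A_PREFIXES = ("Intent:", "RQ:", "Evidence needs:", "Expected cites:")
MIN_EXTRA_BULLETS = 3  # ">=3 additional non-generic bullets" from the checklist


def checklist_problems(subsections):
    """Return human-readable problems for a {title: [bullet, ...]} mapping."""
    problems = []
    seen_in = defaultdict(set)  # normalized bullet -> titles that contain it
    for title, bullets in subsections.items():
        # Everything that is not a Stage A field counts as an additional bullet.
        extras = [b for b in bullets if not b.startswith(STAGE_A_PREFIXES)]
        if len(extras) < MIN_EXTRA_BULLETS:
            problems.append(f"{title}: only {len(extras)} additional bullets")
        for bullet in extras:
            # Normalize case and whitespace so trivial edits do not hide a copy.
            seen_in[" ".join(bullet.lower().split())].add(title)
    for bullet, titles in seen_in.items():
        if len(titles) > 1:
            problems.append(f"copy-pasted bullet in {sorted(titles)}: {bullet!r}")
    return problems
```

An empty return value means both checks passed. Whether a bullet is genuinely non-generic still needs a human read, since exact-match detection only catches literal reuse.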
Common failure modes (and fixes)
- •Template bullets everywhere → replace with domain terms + evaluation axes specific to that subsection.
- •Bullets too vague (“Discuss limitations”) → name which limitations and how to test them.
- •Outline too flat/too deep → aim for a paper-like ToC (final ~6–8 H2) with fewer, thicker H3s.
- •Too many H3 subsections → merge adjacent H3s and write fewer, thicker subsections (paper-like default; budget depends on queries.md draft_profile: survey<=10, deep<=12).
- •Missing Stage A fields → add Intent / RQ / Evidence needs / Expected cites bullets so later mapping/evidence drafting can be audited.
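The section budgets mentioned above (at most 6 H2 sections in the outline itself, and at most 10 or 12 H3 subsections depending on draft_profile) lend themselves to a quick count. This is a sketch under assumptions: the outline is represented as a mapping of H2 title to a list of H3 titles, and the function name budget_problems is hypothetical; only the numeric limits come from this document.

```python
# Limits quoted from this document; the data shape below is an assumption.
MAX_OUTLINE_H2 = 6  # Discussion + Conclusion are appended later in the pipeline
MAX_H3_BY_PROFILE = {"survey": 10, "deep": 12}


def budget_problems(outline, draft_profile="survey"):
    """Check H2/H3 counts for an outline given as {h2_title: [h3_title, ...]}."""
    problems = []
    h2_count = len(outline)
    h3_count = sum(len(h3s) for h3s in outline.values())
    if h2_count > MAX_OUTLINE_H2:
        problems.append(f"{h2_count} H2 sections (budget {MAX_OUTLINE_H2})")
    h3_budget = MAX_H3_BY_PROFILE[draft_profile]
    if h3_count > h3_budget:
        problems.append(
            f"{h3_count} H3 subsections (budget {h3_budget} for {draft_profile})"
        )
    return problems
```

When the H3 count is over budget, the fix suggested above is to merge adjacent H3s rather than trim bullets.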
Helper script (optional)
Quick Start
- •python .codex/skills/outline-builder/scripts/run.py --help
- •python .codex/skills/outline-builder/scripts/run.py --workspace <workspace_dir>
All Options
- •See --help (this helper is intentionally minimal)
Examples
- •Generate a baseline bullets-only outline, then refine bullets:
- •Run the helper once, then replace every generic bullet / TODO with topic-specific, checkable bullets.
Notes
- •The script generates a baseline bullets-only outline and never overwrites non-placeholder work.
- •Paper-like default: it inserts Introduction and Related Work as fixed H2 sections before taxonomy-driven chapters.
- •In pipeline.py --strict it will be blocked only if placeholder markers (TODO/TBD/FIXME/(placeholder)) remain.
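Before a strict run, it can help to scan the outline text for the placeholder markers listed in the notes. The sketch below is not the pipeline's own check: the marker list is taken from this document, while the function name find_placeholders and the matching rules (case-sensitive, whole-word for TODO/TBD/FIXME) are assumptions.

```python
import re

# Markers from the notes; case-sensitive whole-word matching is an assumption.
PLACEHOLDER_RE = re.compile(r"\b(?:TODO|TBD|FIXME)\b|\(placeholder\)")


def find_placeholders(text):
    """Return (line_number, stripped_line) pairs that still contain a marker."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PLACEHOLDER_RE.search(line)
    ]
```

Passing the contents of outline/outline.yml as text gives a list of lines to rewrite; an empty list suggests no listed marker remains.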
Refinement marker (recommended; completion signal)
When you are satisfied with the outline (and after C2 approval if applicable), create:
- •outline/outline.refined.ok
This is an explicit "I reviewed/refined this" signal:
- •makes it harder for a scaffold-y outline to silently pass in strict runs
- •documents that bullets were rewritten into subsection-specific, checkable requirements
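Creating the marker can be scripted. This is a minimal sketch, assuming an empty file is enough to serve as the signal and that the marker sits next to outline/outline.yml inside the workspace; the function name mark_outline_refined is hypothetical, and the guard against a missing outline is an added precaution rather than documented behavior.

```python
from pathlib import Path


def mark_outline_refined(workspace):
    """Create outline/outline.refined.ok inside the given workspace directory."""
    outline = Path(workspace) / "outline" / "outline.yml"
    if not outline.exists():
        # Refuse to sign off on an outline that is not there.
        raise FileNotFoundError(f"{outline} not found; nothing to sign off on")
    marker = outline.with_name("outline.refined.ok")
    marker.touch()  # empty file; its presence is the signal (assumption)
    return marker
```

Calling mark_outline_refined with the workspace directory returns the path of the created marker.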
Troubleshooting
Common Issues
Issue: Outline still has TODO / scaffold bullets
Symptom:
- •Quality gate blocks outline_scaffold.
Causes:
- •Helper script generated a scaffold; bullets were not rewritten.
Solutions:
- •Replace every generic bullet with topic-specific, checkable bullets (axes, comparisons, evaluation setups, failure modes).
- •Keep bullets-only (no prose paragraphs).
Issue: Outline bullets are mostly generic templates
Symptom:
- •Quality gate blocks outline_template_bullets.
Causes:
- •Too many “Define problem…/Benchmarks…/Open problems…” template bullets.
Solutions:
- •Add concrete terms, datasets, evaluation metrics, and known failure modes per subsection.
Recovery Checklist
- • Every subsection has ≥3 non-template bullets.
- • No TODO / (placeholder) remains.