Context Optimizer
Optimize data file structure and formatting for Claude's reasoning and information extraction.
Core Principle
Claude's attention follows a recency gradient: strongest at context boundaries (start/end), weakest in the middle. Optimization strategy:
- •Position - Where content appears matters more than format choice
- •Structure - Explicit delimiters enable selective retrieval
- •Metadata - Source attribution enables grounding in evidence
Quick Reference
| Content Type | Optimal Position | Recommended Format |
|---|---|---|
| Reference documents | TOP of context | XML with metadata |
| Instructions | After documents | Markdown/plain text |
| Examples | Between docs & instructions | XML with <examples> |
| Query/task | END of context | Plain text |
| State/status data | Any | JSON |
| Progress notes | Any | Unstructured text |
| Tabular data | TOP | CSV or markdown tables |
Workflow
Step 1: Classify Content
Determine content type and select format:
Structured data (schemas, records, API responses) → JSON Multi-document corpus (reports, research, contracts) → XML with metadata Tabular data (metrics, comparisons) → CSV or markdown tables Configuration (settings, parameters) → YAML Narrative/progress → Unstructured markdown
Step 2: Apply Positioning Rule
┌─────────────────────────────────────┐ │ DOCUMENTS/DATA (position: TOP) │ ← Strongest attention ├─────────────────────────────────────┤ │ EXAMPLES (if applicable) │ ├─────────────────────────────────────┤ │ INSTRUCTIONS │ ├─────────────────────────────────────┤ │ QUERY/TASK (position: END) │ ← Strongest attention └─────────────────────────────────────┘
Placing queries at the end improves response quality by up to 30%.
Step 3: Structure Documents
For multi-document contexts, use the standard wrapper pattern:
<documents>
<document index="1" priority="primary">
<source>filename.pdf</source>
<type>financial_report</type>
<content>
{{DOCUMENT_CONTENT}}
</content>
</document>
</documents>
Always include:
- •
index- For unambiguous reference - •
source- For citation/attribution - •
type- For semantic filtering
See references/format-patterns.md for complete templates.
Step 4: Add Retrieval Instructions
For long documents, add explicit grounding instructions:
Find quotes from the documents that are relevant to [TASK]. Place these in <quotes> tags with document index references. Then, based on these quotes, [PERFORM ANALYSIS].
This helps Claude cut through noise and ground responses in evidence.
Format Selection Guide
When to Use XML
- •Multi-document corpora requiring source attribution
- •Content with hierarchical structure
- •When parseability of output matters
- •Combining with chain-of-thought (
<thinking>,<answer>)
When to Use JSON
- •State tracking (test results, task status)
- •API request/response data
- •When schema enforcement is needed
- •Structured output requirements
When to Use Markdown Tables
- •Comparisons and rankings
- •Tabular data under ~50 rows
- •When human readability matters
When to Use Plain Text
- •Instructions and queries
- •Progress notes and narrative
- •Simple, single-purpose content
Critical Rules
- •Never bury instructions in data - Instructions surrounded by data get attention-diluted
- •Reference documents explicitly - Avoid "this document"; use
document index="3" - •Use consistent tag names - Same vocabulary throughout; reference by name in instructions
- •Nest semantically -
<outer><inner></inner></outer>for hierarchical content - •Include source attribution - Every document wrapper needs
<source>
Common Pitfalls
See references/pitfalls.md for detailed mitigations.
| Pitfall | Quick Fix |
|---|---|
| Lost in the middle | Put critical info at START or END |
| Ambiguous references | Use explicit index numbers |
| Context overflow | Pre-calculate tokens; split strategically |
| Format mismatch | Match structure complexity to data complexity |
Transformation Script
For automated restructuring, use the transform script:
python scripts/transform_context.py input.json --format xml --output optimized.xml python scripts/transform_context.py input.csv --format xml --add-metadata --output optimized.xml
See script help for all options: python scripts/transform_context.py --help