Context Optimizer

Optimize data file structure and formatting for Claude's reasoning and information extraction.

Core Principle

Claude's attention follows a recency gradient: strongest at context boundaries (start/end), weakest in the middle. Optimization strategy:

•Position - Where content appears matters more than format choice
•Structure - Explicit delimiters enable selective retrieval
•Metadata - Source attribution enables grounding in evidence

Quick Reference

Content Type	Optimal Position	Recommended Format
Reference documents	TOP of context	XML with metadata
Instructions	After documents	Markdown/plain text
Examples	Between docs & instructions	XML with `<examples>`
Query/task	END of context	Plain text
State/status data	Any	JSON
Progress notes	Any	Unstructured text
Tabular data	TOP	CSV or markdown tables

Workflow

Step 1: Classify Content

Determine content type and select format:

Structured data (schemas, records, API responses) → JSON Multi-document corpus (reports, research, contracts) → XML with metadata Tabular data (metrics, comparisons) → CSV or markdown tables Configuration (settings, parameters) → YAML Narrative/progress → Unstructured markdown

Step 2: Apply Positioning Rule

code

┌─────────────────────────────────────┐
│  DOCUMENTS/DATA (position: TOP)    │  ← Strongest attention
├─────────────────────────────────────┤
│  EXAMPLES (if applicable)          │
├─────────────────────────────────────┤
│  INSTRUCTIONS                      │
├─────────────────────────────────────┤
│  QUERY/TASK (position: END)        │  ← Strongest attention
└─────────────────────────────────────┘

Placing queries at the end improves response quality by up to 30%.

Step 3: Structure Documents

For multi-document contexts, use the standard wrapper pattern:

xml

<documents>
  <document index="1" priority="primary">
    <source>filename.pdf</source>
    <type>financial_report</type>
    <content>
      {{DOCUMENT_CONTENT}}
    </content>
  </document>
</documents>

Always include:

•index - For unambiguous reference
•source - For citation/attribution
•type - For semantic filtering

See references/format-patterns.md for complete templates.

Step 4: Add Retrieval Instructions

For long documents, add explicit grounding instructions:

code

Find quotes from the documents that are relevant to [TASK].
Place these in <quotes> tags with document index references.
Then, based on these quotes, [PERFORM ANALYSIS].

This helps Claude cut through noise and ground responses in evidence.

Format Selection Guide

When to Use XML

•Multi-document corpora requiring source attribution
•Content with hierarchical structure
•When parseability of output matters
•Combining with chain-of-thought (<thinking>, <answer>)

When to Use JSON

•State tracking (test results, task status)
•API request/response data
•When schema enforcement is needed
•Structured output requirements

When to Use Markdown Tables

•Comparisons and rankings
•Tabular data under ~50 rows
•When human readability matters

When to Use Plain Text

•Instructions and queries
•Progress notes and narrative
•Simple, single-purpose content

Critical Rules

•Never bury instructions in data - Instructions surrounded by data get attention-diluted
•Reference documents explicitly - Avoid "this document"; use document index="3"
•Use consistent tag names - Same vocabulary throughout; reference by name in instructions
•Nest semantically - <outer><inner></inner></outer> for hierarchical content
•Include source attribution - Every document wrapper needs <source>

Common Pitfalls

See references/pitfalls.md for detailed mitigations.

Pitfall	Quick Fix
Lost in the middle	Put critical info at START or END
Ambiguous references	Use explicit index numbers
Context overflow	Pre-calculate tokens; split strategically
Format mismatch	Match structure complexity to data complexity

Transformation Script

For automated restructuring, use the transform script:

bash

python scripts/transform_context.py input.json --format xml --output optimized.xml
python scripts/transform_context.py input.csv --format xml --add-metadata --output optimized.xml

See script help for all options: python scripts/transform_context.py --help