Recursive Long Context Processing Skill

Process arbitrarily long inputs by treating them as an external environment that can be programmatically examined, decomposed, and recursively processed. Based on the Recursive Language Models (RLM) research from MIT CSAIL.

Quick Start

Installation

bash

cd Skills/recursive-long-context
pip install -r requirements.txt

Set API Key

bash

export ANTHROPIC_API_KEY="your-api-key"  # For Anthropic Claude
# OR
export OPENAI_API_KEY="your-api-key"     # For OpenAI

Python Usage

python

from rlm import run_rlm, RecursiveLanguageModel, RLMConfig

# Simple usage
answer = run_rlm(
    query="What are the main findings in this document?",
    context=your_long_document,
    api_provider="anthropic",
    verbose=True
)

# Advanced usage with configuration
config = RLMConfig(
    api_provider="anthropic",
    root_model="claude-sonnet-4-20250514",
    sub_model="claude-sonnet-4-20250514",
    max_iterations=20,
    verbose=True
)

rlm = RecursiveLanguageModel(config)
answer = rlm.run(query, context)

Command Line Usage

bash

python cli.py -q "Summarize the main findings" -f document.txt
python cli.py -q "Find all mentions of pricing" -d ./documents/
python cli.py -q "Count entries by category" -f data.txt --provider anthropic --max-iterations 15

Core Concept

Traditional LLMs struggle with long contexts due to "context rot." RLMs solve this by:

•Treating the prompt as an environment variable - Long input loaded as string variable, not fed directly to model
•Programmatic interaction - Write code to peek, filter, and decompose the context
•Recursive sub-calls - Invoke LLM calls on smaller chunks for semantic analysis
•Result aggregation - Combine findings into comprehensive answer

Task Complexity Guide

Complexity	Description	Strategy
O(1)	Find single piece of info	Keyword filtering + targeted analysis
O(N)	Examine every entry	Chunked processing with aggregation
O(N²)	Compare pairs of entries	Pairwise classification

Higher complexity tasks benefit most from RLM processing.

Information Gathering

Before processing, gather:

Context Info:

•Type of content (documents, code, data, mixed)
•Approximate size (characters, lines, documents)
•Structure (delimiters, headers, format)

Task Info:

•What question needs answered?
•Dense or sparse access required?
•Expected output format

Preferences:

•Speed vs thoroughness priority
•Budget constraints on API calls

Processing Strategies

See code-patterns.md for detailed implementation code.

Strategy 1: Keyword Filtering + Targeted Analysis

Best for: O(1) complexity, finding specific information

•Use code to filter relevant chunks by keyword
•Search for multiple related keywords
•Analyze promising chunks with LLM sub-calls

Strategy 2: Chunked Processing with Aggregation

Best for: O(N) complexity, examining all content

•Determine chunking strategy based on content structure
•Process in batches with LLM sub-calls
•Aggregate all findings into final answer

Strategy 3: Hierarchical Decomposition

Best for: Complex multi-hop reasoning, document collections

•First pass: High-level structure analysis
•Second pass: Targeted deep analysis by section
•Final synthesis across all summaries

Strategy 4: Pairwise Analysis

Best for: O(N²) complexity, finding relationships

•Classify all entries into categories
•Build classification index
•Generate pairs meeting criteria

LLM Sub-Query Best Practices

•Specific prompts: Request exact output format, one category at a time
•Batch queries: Process 50-100 items per LLM call to reduce costs
•Context sizing: Keep chunks under 400K characters for sub-calls
•Filter first: Use code to eliminate irrelevant content before LLM calls

Output Format

Provide:

•Direct Answer: Clear response to original question
•Methodology Summary: Strategy used, batches processed, LLM sub-calls made
•Confidence Assessment: High/Medium/Low with explanation

Cost Optimization

Approach	Quality	Cost	Speed
Every line individually	Highest	Highest	Slowest
Batches of 50-100	High	Medium	Medium
Batches of 500+	Medium	Low	Fast
Keyword filter only	Varies	Lowest	Fastest

Recommendation: Start with batches of 50-100 items for semantic tasks.

References

•Paper: "Recursive Language Models" (arXiv:2512.24601)
•Authors: Alex L. Zhang, Tim Kraska, Omar Khattab (MIT CSAIL)
•Key Insight: Long prompts should be treated as part of the environment that the LLM can symbolically interact with