<Use_When>
- •User says "read this paper", "summarize paper", "review paper", "analyze this research"
- •User provides an arxiv URL, DOI link, or PDF file path
- •User asks "what does this paper say about X?"
- •User wants to extract methodology, findings, or limitations from a specific paper
- •User mentions a paper title and wants structured analysis
- •User wants to add a paper to their knowledge base </Use_When>
<Do_Not_Use_When>
- •Multi-paper comparison or synthesis across papers -- use `lit-review` instead
- •Quick citation lookup or "who wrote X?" -- use research-assistant agent directly
- •General topic research without a specific paper -- use `research` skill instead
- •Reading non-academic documents (blog posts, docs) -- use standard Read tool
- •Data analysis of results from papers -- use `research-analysis` instead </Do_Not_Use_When>
<Why_This_Exists> Researchers need more than summaries. They need structured extraction -- methodology details, statistical evidence quality, specific limitations, and how each paper connects to others they have read. Without systematic extraction, papers are read once and forgotten, connections between findings are missed, and the same paper gets re-read months later. This skill creates a persistent, queryable knowledge base from every paper reviewed. </Why_This_Exists>
<Execution_Policy>
- •Always check memory first for existing entries on the same paper (avoid duplicates)
- •Extract ALL structured sections, not just abstract-level summaries
- •Store entities and relations in the knowledge graph, not just flat text
- •Link new papers to existing papers in the graph when topic overlap exists
- •Preserve exact statistical claims with their confidence intervals and p-values
- •Flag methodological concerns explicitly in the LIMITATION section
- •Default model routing: paper-reader agent at sonnet tier for extraction, memory-curator at haiku for storage </Execution_Policy>
<Steps>
- •Check Existing: Search memory for this paper
  `sc_memory_search(query="<paper title or ID>", category="paper")`
  - •If found: Show existing entry, ask if user wants to update or view
  - •If not found: Proceed with extraction
- •Extract Structure: Delegate to paper-reader agent (sonnet) to extract:
  - •[PAPER] Title, authors, year, venue, DOI/URL
  - •[ABSTRACT] Core claim in 2-3 sentences
  - •[METHOD] Methodology details -- dataset, approach, baselines, evaluation metrics
  - •[FINDING] Key findings with statistical evidence (exact numbers, p-values, CIs)
  - •[LIMITATION] Stated and unstated limitations, threats to validity
  - •[CONTRIBUTION] Novel contributions claimed by authors
  - •[CITATION_KEY] Papers this work builds on (for graph linking)
- •Store in Knowledge Graph: Use memory-curator agent to persist
  - •Create paper entity: `sc_memory_add_entity(name="<title>", type="paper", properties={...})`
  - •Create method relation: `sc_memory_add_relation(from="<paper>", to="<method>", type="uses_method")`
  - •Create finding relations: `sc_memory_add_relation(from="<paper>", to="<finding>", type="reports_finding")`
  - •Store full extraction: `sc_memory_store(content="<full extraction>", category="paper", confidence=0.9)`
- •Connect to Existing Papers: Query graph for related work
  `sc_memory_graph_query(query="papers about <topic>")`
  - •Add `cites`, `builds_on`, `contradicts`, `extends` relations where applicable
  - •Add `same_topic` relations for topically related papers
- •Report to User: Format structured summary
  - •One-paragraph overview
  - •Key findings table with evidence quality ratings
  - •Methodology summary
  - •Limitations and caveats
  - •Connections to previously reviewed papers </Steps>
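
The check, store, and link steps above can be sketched in Python. The `sc_memory_*` tools are agent tool calls rather than Python functions, so this sketch substitutes a hypothetical in-memory `MemoryStub`; the class, its method names, and the shape of the `extraction` dict are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStub:
    """Hypothetical in-memory stand-in for the sc_memory_* tools."""
    entities: dict = field(default_factory=dict)   # name -> {"type", "properties"}
    relations: list = field(default_factory=list)  # (source, relation_type, target)
    documents: list = field(default_factory=list)  # stored full extractions

    def search(self, query, category):
        # Case-insensitive substring match on entity names of one category.
        return [name for name, entity in self.entities.items()
                if entity["type"] == category and query.lower() in name.lower()]

    def add_entity(self, name, entity_type, properties):
        self.entities[name] = {"type": entity_type, "properties": properties}

    def add_relation(self, source, target, relation_type):
        self.relations.append((source, relation_type, target))

    def store(self, content, category, confidence):
        self.documents.append(
            {"content": content, "category": category, "confidence": confidence})


def review_paper(memory, extraction):
    """Check for an existing entry, then store and link one extracted paper."""
    title = extraction["title"]
    if memory.search(query=title, category="paper"):
        return "exists"  # the workflow would now ask the user: update or view?
    memory.add_entity(name=title, entity_type="paper",
                      properties={"year": extraction["year"]})
    for method in extraction["methods"]:
        memory.add_relation(source=title, target=method, relation_type="uses_method")
    for finding in extraction["findings"]:
        memory.add_relation(source=title, target=finding,
                            relation_type="reports_finding")
    memory.store(content=extraction, category="paper", confidence=0.9)
    return "stored"
```

Calling `review_paper` twice with the same extraction returns `"stored"` and then `"exists"`, which is the duplicate check the first step describes.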
<Tool_Usage>
- •`sc_memory_search` -- Check for existing paper entries before extraction
- •`sc_memory_store` -- Store full structured extraction with category="paper"
- •`sc_memory_add_entity` -- Create paper, method, and finding entities in the graph
- •`sc_memory_add_relation` -- Link papers to methods, findings, and other papers
- •`sc_memory_graph_query` -- Find related papers and build citation connections
- •`WebFetch` -- Fetch arxiv abstracts, DOI-resolved pages, open-access PDFs
- •`WebSearch` -- Locate papers by title when no direct URL is provided
- •`Read` -- Read local PDF files or cached paper documents
- •`Grep` -- Search within large paper texts for specific sections or claims </Tool_Usage>
<Escalation_And_Stop_Conditions>
- •If the paper is behind a paywall and cannot be accessed, report the limitation and offer to work from the abstract only (with reduced confidence score)
- •If the paper is in a non-English language, note this and attempt extraction but flag reduced accuracy
- •If the paper is extremely long (>50 pages, e.g., a survey), recommend using the `lit-review` skill instead or ask user which sections to focus on
- •If memory storage fails, report the extraction results directly to the user and retry storage
- •If the paper source cannot be resolved (broken URL, missing PDF), ask the user for an alternative source </Escalation_And_Stop_Conditions>
<Final_Checklist>
- • Paper source identified and content accessed
- • Memory checked for existing entry (no duplicates)
- • All structured sections extracted: [PAPER], [METHOD], [FINDING], [LIMITATION], [CONTRIBUTION]
- • Statistical claims include exact numbers and confidence measures
- • Paper entity created in knowledge graph
- • Method and finding relations added
- • Connections to existing papers established where applicable
- • Formatted summary presented to user
- • Confidence score reflects extraction quality (full text > abstract only) </Final_Checklist>
| Source | Method | Quality |
|---|---|---|
| Arxiv URL | WebFetch abstract + PDF link | High (full text usually available) |
| DOI link | WebFetch resolved URL | Varies (may hit paywall) |
| Local PDF | Read tool | Highest (full text guaranteed) |
| Paper title | WebSearch to locate | Medium (depends on search results) |
| Semantic Scholar URL | WebFetch API | High (structured metadata) |
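
The routing in the table above can be sketched as a small classifier. The regular expressions, the `classify_source` name, the returned labels, and the extra `other_url` fallback are illustrative assumptions; they are not an exhaustive grammar for any of these source types.

```python
import re

# Illustrative patterns only; real references come in more shapes than these.
ARXIV_URL = re.compile(r"^https?://(www\.)?arxiv\.org/(abs|pdf)/", re.IGNORECASE)
DOI_URL = re.compile(r"^https?://(dx\.)?doi\.org/10\.\d{4,9}/\S+$", re.IGNORECASE)
SEMANTIC_SCHOLAR_URL = re.compile(
    r"^https?://(www\.)?semanticscholar\.org/", re.IGNORECASE)
ANY_URL = re.compile(r"^https?://", re.IGNORECASE)


def classify_source(reference: str) -> tuple[str, str]:
    """Return (source kind, tool to use) for a user-supplied paper reference."""
    ref = reference.strip()
    if ARXIV_URL.match(ref):
        return ("arxiv_url", "WebFetch")
    if DOI_URL.match(ref):
        return ("doi_link", "WebFetch")
    if SEMANTIC_SCHOLAR_URL.match(ref):
        return ("semantic_scholar_url", "WebFetch")
    if ANY_URL.match(ref):
        # Not in the table: any other web address is still fetched, not searched.
        return ("other_url", "WebFetch")
    if ref.lower().endswith(".pdf"):
        return ("local_pdf", "Read")
    # Anything else is treated as a title to look up.
    return ("paper_title", "WebSearch")
```

Specific patterns are checked before the generic URL fallback so that an arxiv or DOI link is never misrouted as a plain web page.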
Extraction Accuracy Tips
- •Full text extraction is always preferred over abstract-only
- •For papers with complex tables, explicitly extract table data into structured format
- •Statistical claims should preserve exact notation: "F1=0.847, p<0.001, 95% CI [0.82, 0.87]"
- •When authors state limitations, quote them directly rather than paraphrasing
- •Cross-reference claimed contributions against actual evidence in results
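
As a minimal sketch of preserving exact notation, the function below splits a claim written like the example above into fields while keeping every number as its original string, so no rounding or reformatting occurs. The patterns and the `parse_claim` name are assumptions and only cover this one notation style.

```python
import re

# Patterns cover the notation in the example above; other styles need more rules.
METRIC = re.compile(r"^\s*(?P<name>[A-Za-z][\w-]*)\s*=\s*(?P<value>-?\d*\.?\d+)")
P_VALUE = re.compile(r"\bp\s*(?P<op>[<=>])\s*(?P<value>\d*\.?\d+)")
CONF_INT = re.compile(
    r"(?P<level>\d+(?:\.\d+)?)%\s*CI\s*"
    r"\[\s*(?P<low>-?\d*\.?\d+)\s*,\s*(?P<high>-?\d*\.?\d+)\s*\]"
)


def parse_claim(text: str) -> dict:
    """Split one claim string into fields, keeping numbers as original strings."""
    parsed = {"raw": text}  # the untouched claim is always kept alongside the fields
    for key, pattern in (("metric", METRIC), ("p_value", P_VALUE), ("ci", CONF_INT)):
        match = pattern.search(text)
        if match:
            parsed[key] = match.groupdict()
    return parsed
```

Fields that are absent from the text are simply omitted from the result, which makes a missing p-value or interval easy to flag during review.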
Figure and Table Extraction
When processing PDFs with the Read tool:
- •Tables: Extract as markdown tables with headers preserved
- •Figures: Note figure descriptions and captions (visual content requires vision agent)
- •For figure analysis, delegate to vision agent with screenshot of the figure
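
Table extraction with headers preserved could look like the following sketch. The `to_markdown_table` helper is hypothetical and assumes the table has already been read into a header list and row lists.

```python
def to_markdown_table(headers: list[str], rows: list[list[str]]) -> str:
    """Render extracted table cells as a markdown table, header row first."""

    def format_row(cells):
        # Escape pipes so cell text cannot break the column layout.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    for row in rows:
        if len(row) != len(headers):
            raise ValueError("row width does not match header width")

    lines = [format_row(headers), "|" + "|".join("---" for _ in headers) + "|"]
    lines.extend(format_row(row) for row in rows)
    return "\n".join(lines)
```

Rejecting rows of the wrong width surfaces merged or split cells at extraction time instead of producing a silently misaligned table.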
Dealing with Paywalled Papers
- •Try the DOI through Sci-Hub alternatives or institutional access
- •Check if a preprint exists on arxiv or author's website
- •Fall back to abstract-only extraction with `confidence: 0.5`
- •Note in the extraction that full text was not available
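
The fallback can be sketched as a small helper that picks the confidence and attaches the availability note. The 0.5 value comes from the list above; treating the 0.9 from the `sc_memory_store` example as the full-text value is an assumption, as are the function name and the note wording.

```python
FULL_TEXT_CONFIDENCE = 0.9      # assumed: the value shown in the sc_memory_store example
ABSTRACT_ONLY_CONFIDENCE = 0.5  # the paywall fallback value given above


def extraction_confidence(full_text_available: bool) -> dict:
    """Choose the storage confidence and a note describing what was read."""
    if full_text_available:
        return {"confidence": FULL_TEXT_CONFIDENCE, "note": ""}
    return {
        "confidence": ABSTRACT_ONLY_CONFIDENCE,
        "note": "Full text was not available; extraction is based on the abstract only.",
    }
```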
Knowledge Graph Schema
Entity: Paper
  Properties: title, authors, year, venue, doi, url, abstract
  Relations:
    - uses_method -> Method
    - reports_finding -> Finding
    - has_limitation -> Limitation
    - cites -> Paper
    - builds_on -> Paper
    - contradicts -> Paper
    - extends -> Paper
    - same_topic -> Paper
Entity: Method
  Properties: name, description, category
Entity: Finding
  Properties: claim, evidence, p_value, confidence_interval, effect_size
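
One way to picture this schema is as Python dataclasses plus a relation-to-target map. This is a sketch under assumptions: the field types, the defaults, and the `validate_relation` helper are illustrative choices, and the schema itself only lists property names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paper:
    title: str
    authors: list[str]
    year: int
    venue: str = ""
    doi: str = ""
    url: str = ""
    abstract: str = ""


@dataclass
class Method:
    name: str
    description: str = ""
    category: str = ""


@dataclass
class Finding:
    claim: str
    evidence: str = ""
    # Kept as strings so the exact notation from the paper is preserved.
    p_value: Optional[str] = None
    confidence_interval: Optional[str] = None
    effect_size: Optional[str] = None


# Relation type -> expected target entity, copied from the schema above.
# The schema names a Limitation target but lists no properties for it,
# so no Limitation class is defined in this sketch.
RELATION_TARGETS = {
    "uses_method": "Method",
    "reports_finding": "Finding",
    "has_limitation": "Limitation",
    "cites": "Paper",
    "builds_on": "Paper",
    "contradicts": "Paper",
    "extends": "Paper",
    "same_topic": "Paper",
}


def validate_relation(relation_type: str, target: object) -> bool:
    """Return True when the target's entity type matches the relation's schema."""
    expected = RELATION_TARGETS.get(relation_type)
    return expected is not None and type(target).__name__ == expected
```

A check like this before each `sc_memory_add_relation` call would catch a relation pointed at the wrong kind of entity, such as `cites` aimed at a method.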
Troubleshooting
Paper not found via WebSearch?
- •Try searching with exact title in quotes
- •Try adding author names to the search
- •Try searching on Google Scholar directly
Extraction quality low?
- •Check if full text was accessed or just abstract
- •For dense papers, run extraction twice with different focus areas
- •Consider splitting long papers into sections for targeted extraction
Memory storage failing?
- •Check `sc_memory_stats` for capacity
- •Verify entity names do not conflict with existing entries
- •Try storing with a shorter content field and linking to a file for full text