Paper Notes

Produce consistent, searchable paper notes that later steps (claims, visuals, writing) can reliably synthesize.

This is still NO PROSE: keep notes as bullets / short fields, not narrative paragraphs.

Role cards (prompt-level guidance)

•
Close Reader
- •Mission: extract what is specific and checkable (setup, method, metrics, limits).
- •Do: name concrete tasks/benchmarks and what the paper actually measures.
- •Avoid: generic summary boilerplate that could fit any paper.
•
Results Recorder
- •Mission: capture evaluation anchors that later writing needs.
- •Do: record task + metric + constraints (budget/tool access) whenever available.
- •Avoid: copying numbers without the evaluation setting that makes them meaningful.
•
Limitation Logger
- •Mission: capture the caveats that change interpretation.
- •Do: write paper-specific limitations (protocol mismatch, missing ablations, threat model gaps).
- •Avoid: repeated generic limitations like “may not generalize” without specifics.

•After you have a core set (and ideally a mapping) and need evidence-ready notes.
•Before writing a survey draft.

•papers/core_set.csv
•Optional: outline/mapping.tsv (to prioritize)
•Optional: papers/fulltext_index.jsonl + papers/fulltext/*.txt (if running in fulltext mode)

•papers/paper_notes.jsonl (JSONL; one record per paper)
•papers/evidence_bank.jsonl (JSONL; addressable evidence snippets derived from notes; A150++ target: >=7 items/paper on average)

•If you have extracted text (papers/fulltext/*.txt) → enrich key papers using fulltext snippets and set evidence_level: "fulltext".
•If you only have abstracts (default) → keep long-tail notes abstract-level, but still fully enrich high-priority papers (see below).

Uses: outline/mapping.tsv, papers/fulltext_index.jsonl.

•Ensure coverage: every paper_id in papers/core_set.csv must have one JSONL record.
•
Use mapping to choose high-priority papers:
- •heavily reused across subsections
- •pinned classics (ReAct/Toolformer/Reflexion… if in scope)
•
For high-priority papers, capture:
- •3–6 summary bullets (what’s new, what problem setting, what’s the loop)
- •method (mechanism and architecture; what differs from baselines)
- •key_results (benchmarks/metrics; include numbers if available)
- •limitations (specific assumptions/failure modes; avoid generic boilerplate)
•
For long-tail papers:
- •keep summary bullets short (abstract-derived is OK)
- •still include at least one limitation, but make it specific when possible
•Assign a stable bibkey for each paper for citation generation.

•
Coverage: every paper_id in papers/core_set.csv appears in papers/paper_notes.jsonl.
•
High-priority papers have non-TODO method/results/limitations.
•
Limitations are not copy-pasted across many papers.
•
evidence_level is set correctly (abstract vs fulltext).
•
Evidence bank: papers/evidence_bank.jsonl exists and is dense enough for A150++ (>=7 items/paper on average).

•
Generate notes, then optionally enrich priority=high papers:
- •Run the helper once, then refine papers/paper_notes.jsonl (e.g., add full-text details for key papers and diversify limitations).

•The helper writes deterministic metadata/abstract-level notes and marks key papers with priority=high.
•In pipeline.py --strict it will be blocked if high-priority notes are incomplete (missing method/key_results/limitations) or contain placeholders.

Symptom:

Causes:

Solutions:

•Fully enrich priority=high papers: method, ≥1 key_results, ≥3 summary_bullets, ≥1 concrete limitations.
•If you need full text evidence, run pdf-text-extractor in fulltext mode for key papers.

Symptom:

Causes:

Solutions:

•Replace boilerplate with paper-specific limitations (setup, data, evaluation gaps, failure cases).

• papers/paper_notes.jsonl covers all papers/core_set.csv paper_ids.
• ≥80% of priority=high notes satisfy method/results/limitations completeness.
• No TODO remains in high-priority notes.