
citation-verifier

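Generate and verify BibTeX entries from paper notes, writing `citations/ref.bib` and `citations/verified.jsonl`. **Trigger**: citation, BibTeX, ref.bib, verified.jsonl, references, 引用, 参考文献. **Use when**: you already have `papers/paper_notes.jsonl` and need traceable citations for prose/LaTeX (each citation carries a url/date/title verification record). **Skip if**: paper notes have not been generated yet (or this deliverable needs no citations/references). **Network**: automatic verification usually requires network access; without it, record the information first and mark it as needing manual verification. **Guardrail**: every BibTeX entry must have a matching record in `citations/verified.jsonl`; prose may only use citation keys that already exist in `citations/ref.bib`.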

SKILL.md
--- frontmatter
name: citation-verifier
description: |
  Generate and verify BibTeX entries from paper notes, writing `citations/ref.bib` and `citations/verified.jsonl`.
  **Trigger**: citation, BibTeX, ref.bib, verified.jsonl, references, 引用, 参考文献.
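  **Use when**: you already have `papers/paper_notes.jsonl` and need traceable citations for prose/LaTeX (each with a url/date/title verification record).
  **Skip if**: there are no paper notes yet (or this deliverable needs no citations/references).
  **Network**: automatic verification usually requires network access; without it, record first, then mark as needs manual verification.
  **Guardrail**: every BibTeX entry must have a matching record in `citations/verified.jsonl`; prose may only use citation keys that already exist in `citations/ref.bib`.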

Citation Verifier

Generate citations/ref.bib and ensure every entry has a traceable verification record in citations/verified.jsonl.

When network access is restricted, prefer a “record now, verify later” workflow: keep URLs/titles consistent and leave a clear verification note.

Input

  • papers/paper_notes.jsonl

Outputs

  • citations/ref.bib
  • citations/verified.jsonl

Workflow (heuristic)

  1. Collect bibkey, title, url, year, authors from papers/paper_notes.jsonl.
  2. Write/refresh citations/ref.bib:
    • Prefer arXiv-style fields when arxiv_id / primary_category exist (eprint, archivePrefix, primaryClass).
  3. Write one verification record per BibTeX entry to citations/verified.jsonl with at least:
    • bibkey, title, url, date
  4. If you cannot verify via network, record a clear notes field (e.g., “auto-generated; needs manual verification”) and/or request human confirmation depending on your policy.
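
The workflow above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/run.py: the functions `to_bibtex` and `to_record` are hypothetical, the `@article` entry type is an assumption, `authors` is assumed to be a list of strings, and `date` is assumed to mean the day the record was written.

```python
import json
from datetime import date


def to_bibtex(note: dict) -> str:
    """Render one paper note as a BibTeX entry (entry type and layout are assumptions)."""
    fields = {
        "title": note["title"],
        "author": " and ".join(note.get("authors", [])),
        "year": str(note.get("year", "")),
        "url": note["url"],
    }
    # Prefer arXiv-style fields when the note carries an arXiv id.
    if note.get("arxiv_id"):
        fields["eprint"] = note["arxiv_id"]
        fields["archivePrefix"] = "arXiv"
        if note.get("primary_category"):
            fields["primaryClass"] = note["primary_category"]
    # Skip empty fields so the entry stays clean.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{note['bibkey']},\n{body}\n}}"


def to_record(note: dict, status: str = "offline_generated", notes: str = "") -> dict:
    """Build the verification record that pairs with one BibTeX entry."""
    return {
        "bibkey": note["bibkey"],
        "title": note["title"],
        "url": note["url"],
        "date": date.today().isoformat(),  # assumption: date the record was written
        "verification_status": status,
        "notes": notes,
    }


if __name__ == "__main__":
    demo = {
        "bibkey": "smith2020",
        "title": "A Title",
        "url": "https://arxiv.org/abs/2001.00001",
        "year": 2020,
        "authors": ["A. Smith", "B. Jones"],
        "arxiv_id": "2001.00001",
        "primary_category": "cs.CL",
    }
    print(to_bibtex(demo))
    print(json.dumps(to_record(demo, notes="auto-generated; needs manual verification")))
```

Keeping the two functions driven by the same note makes it hard for a BibTeX entry to exist without its verification record.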

Quality checklist

  • Every BibTeX entry has a corresponding verified.jsonl record.
  • No missing url/date/title in verification records.
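
One way to check both items mechanically is sketched below. The helper `check_citations` is hypothetical and not part of the skill's script; it assumes BibTeX keys can be picked up with a simple regular expression, which holds for ordinary entries but is not a full BibTeX parser.

```python
import json
import re

REQUIRED = ("url", "date", "title")


def check_citations(bib_text: str, jsonl_text: str) -> list[str]:
    """Return human-readable problems; an empty list means the checklist passes."""
    # Keys of entries such as "@article{key," or "@misc{key,".
    bib_keys = set(re.findall(r"@\w+\s*\{\s*([^,\s]+)\s*,", bib_text))
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    by_key = {rec.get("bibkey"): rec for rec in records}

    problems = []
    for key in sorted(bib_keys):
        rec = by_key.get(key)
        if rec is None:
            problems.append(f"{key}: no verified.jsonl record")
            continue
        for field in REQUIRED:
            if not str(rec.get(field) or "").strip():
                problems.append(f"{key}: empty {field}")
    return problems
```

In practice the two arguments would be the contents of citations/ref.bib and citations/verified.jsonl read as text.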

Offline Mode

When network access is restricted, run in offline mode to produce auditable records now, then verify later.

  • Generate offline records: --offline (records are written with verification_status=offline_generated)
  • Verify later (when network is available): --verify-only

verification_status

  • offline_generated: record was generated without network verification (needs later verification)
  • verified_online: URL/title verified successfully by the script
  • verify_failed: verification was attempted but failed (network error or title mismatch)
  • needs_manual_verification: missing/ambiguous fields (e.g., empty url/title)
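
The four statuses could be assigned along the lines below. This is only a sketch: the function `classify`, the precedence of the checks, and the whitespace- and case-insensitive title comparison are assumptions, and the script's actual matching rule may differ.

```python
from typing import Optional


def _norm(text: str) -> str:
    """Collapse whitespace and ignore case before comparing titles."""
    return " ".join(text.split()).casefold()


def classify(url: str, title: str, offline: bool,
             fetched_title: Optional[str] = None) -> str:
    """Pick a verification_status; fetched_title=None models a network error."""
    if not url.strip() or not title.strip():
        return "needs_manual_verification"  # missing/ambiguous fields
    if offline:
        return "offline_generated"          # no verification attempted
    if fetched_title is not None and _norm(fetched_title) == _norm(title):
        return "verified_online"
    return "verify_failed"                  # network error or title mismatch
```

Checking for missing fields first means an incomplete record is never reported as merely offline, which keeps the to-do list for manual review accurate.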

Script

Quick Start

  • python .codex/skills/citation-verifier/scripts/run.py --help
  • Offline (record now, verify later): python .codex/skills/citation-verifier/scripts/run.py --workspace <workspace_dir> --offline

All Options

  • --offline: do not attempt network verification; write verification_status=offline_generated
  • --verify-only: verify existing citations/verified.jsonl records (does not rewrite BibTeX)
  • --verification-note <text>: stored in citations/verified.jsonl notes

Examples

  • Generate BibTeX + offline verification records:
    • python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --offline --verification-note "auto-generated; needs manual verification"
  • Later, verify-only (when network is available):
    • python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --verify-only

Notes

  • Minimal requirement for every verification record: url, date, title.
  • The script sanitizes stray/unbalanced {} in titles to keep BibTeX parsing robust.
  • The script escapes LaTeX special chars in text fields (& % $ # _) and rewrites superscript patterns like X^N or X$^N$ as X\textsuperscript{N} to keep LaTeX builds stable.
  • URLs are kept raw in BibTeX url fields (BibTeX styles wrap them with \url{...}); @misc uses howpublished=\url{...}.
  • In offline mode, records are not truly verified; treat offline_generated as a to-do for human/network verification.
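
The escaping described in these notes might look roughly like the sketch below. The function `latex_safe` and both regular expressions are illustrative assumptions rather than the script's actual code, and brace sanitizing is left out.

```python
import re

# X^N or X$^N$, where X is one alphanumeric character and N is alphanumeric.
_SUP = re.compile(r"([0-9A-Za-z])(?:\$\^([0-9A-Za-z]+)\$|\^([0-9A-Za-z]+))")
# LaTeX special characters that are not already preceded by a backslash.
_SPECIAL = re.compile(r"(?<!\\)([&%$#_])")


def latex_safe(text: str) -> str:
    """Rewrite superscripts first, then escape the remaining special characters."""
    text = _SUP.sub(
        lambda m: f"{m.group(1)}\\textsuperscript{{{m.group(2) or m.group(3)}}}",
        text,
    )
    return _SPECIAL.sub(r"\\\1", text)


if __name__ == "__main__":
    print(latex_safe("R^2 & F_1"))  # R\textsuperscript{2} \& F\_1
```

Superscripts are rewritten before escaping so that the `$` delimiters in `X$^N$` are consumed rather than turned into `\$`.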

Troubleshooting

Common Issues

Issue: Missing bibkey / missing url in notes

Symptom:

  • citations/ref.bib is missing entries, or verified.jsonl has empty url/title.

Causes:

  • papers/paper_notes.jsonl lacks bibkey/url fields.

Solutions:

  • Ensure each core paper note has a stable bibkey and a canonical url.
  • Rerun citation generation after fixing notes.
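
Before rerunning, the notes file can be scanned for incomplete entries. The helper below is a hypothetical sketch that assumes one JSON object per line with top-level `bibkey` and `url` fields.

```python
import json


def find_incomplete_notes(jsonl_text: str, required=("bibkey", "url")):
    """Return (line_number, missing_fields) for each note lacking a required field."""
    issues = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        note = json.loads(line)
        missing = [f for f in required if not str(note.get(f) or "").strip()]
        if missing:
            issues.append((lineno, missing))
    return issues
```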

Issue: verification_status=offline_generated

Symptom:

  • Records exist but are not truly verified.

Causes:

  • --offline was used, or network verification was unavailable.

Solutions:

  • When network is available, run --verify-only to upgrade records.
  • Or manually verify and update citations/verified.jsonl with notes.
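
To see which records still need attention, the file can be grouped by status. This is a small sketch with a hypothetical helper name; records without a `verification_status` field are grouped under "unknown", which is an assumption.

```python
import json
from collections import defaultdict


def group_by_status(jsonl_text: str) -> dict:
    """Map each verification_status to the bibkeys that carry it."""
    groups = defaultdict(list)
    for line in jsonl_text.splitlines():
        if line.strip():
            rec = json.loads(line)
            groups[rec.get("verification_status", "unknown")].append(rec.get("bibkey"))
    return dict(groups)
```

Anything outside the `verified_online` group is a candidate for --verify-only or a manual check.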

Recovery Checklist

  • Every BibTeX entry has a matching citations/verified.jsonl record.
  • Verification records include url, date, title.