Instructions
Overview
This skill validates and repairs LaTeX citation and reference integrity in a multi-file project. It systematically scans all .tex files and the .bib file to identify:
- Broken Citations: \cite{} commands referencing non-existent .bib entries.
- Empty Citations/References: \cite{}, \ref{}, or \autoref{} with empty braces.
- Broken References: \ref{} or \autoref{} commands pointing to non-existent \label{}s.
- Incorrect References: References to labels that exist but are likely typos (e.g., fig:call-api-v0 vs. fig:call-api).
The core logic is implemented in the bundled Python script validate_and_fix.py. You should primarily execute this script, which provides a deterministic, repeatable analysis and repair process.
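To make the issue categories concrete, here is a minimal sketch of how empty and broken citations and references could be detected on in-memory text. It is not the bundled validate_and_fix.py: the regular expressions, the function name find_issues, and the shape of the returned dictionary are all assumptions made for illustration, and typo detection for labels that do exist is left out.

```python
import re

# Hypothetical patterns for illustration; the bundled script may parse differently.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
REF_RE = re.compile(r"\\(?:auto)?ref\{([^}]*)\}")
LABEL_RE = re.compile(r"\\label\{([^}]*)\}")
BIB_KEY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def find_issues(tex: str, bib: str) -> dict:
    """Classify empty and broken \\cite / \\ref / \\autoref commands in one .tex string."""
    bib_keys = set(BIB_KEY_RE.findall(bib))
    labels = set(LABEL_RE.findall(tex))
    issues = {"empty_cite": 0, "broken_cite": [], "empty_ref": 0, "broken_ref": []}

    for group in CITE_RE.findall(tex):
        if not group.strip():
            issues["empty_cite"] += 1
            continue
        # A single \cite may carry several comma-separated keys.
        for key in (k.strip() for k in group.split(",")):
            if key and key not in bib_keys:
                issues["broken_cite"].append(key)

    for target in REF_RE.findall(tex):
        target = target.strip()
        if not target:
            issues["empty_ref"] += 1
        elif target not in labels:
            issues["broken_ref"].append(target)
    return issues


sample_tex = r"""
\section{Intro}\label{sec:intro}
Models learn in-context \citep{}. See \citet{brown2020language, missing2021key}.
\begin{figure}\label{fig:call-api}\end{figure}
See Figure~\ref{fig:call-api-v0}, \autoref{sec:intro}, and \ref{}.
"""
sample_bib = "@article{brown2020language,\n  title={Example}\n}\n"
report = find_issues(sample_tex, sample_bib)
```

In a multi-file project the label set would need to be collected across every .tex file before references are checked, since a \ref{} in one file may point to a \label{} in another.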
Step-by-Step Procedure
- Initial Exploration: First, understand the workspace structure. List the allowed directories and examine the directory tree to locate all .tex and .bib files.
- Run the Validation Script: Execute the primary script /workspace/latex-citation-reference-validator/scripts/validate_and_fix.py. This script will:
  - Recursively find all .tex files in the workspace.
  - Read the .bib file to extract all citation keys.
  - Parse all .tex files for \cite{}, \label{}, \ref{}, and \autoref{} commands.
  - Perform a comprehensive analysis, identifying all four categories of issues listed above.
  - Automatically fix the identified issues by editing the source .tex files in place.
  - Generate a detailed summary report of all changes made.
- Review and Confirm: After the script runs, review its output summary. The summary will detail every file modified and the specific changes applied (e.g., "Fixed empty citation \citep{} -> \citep{brown2020language}"). You should verify that the fixes are correct and appropriate for the context of the paper.
- Manual Verification (Optional): For complex projects or if you suspect edge cases, you may manually spot-check a few fixed locations in the .tex files to ensure the corrections are contextually accurate (e.g., the added citation key brown2020language is indeed the correct paper for the phrase "learn in-context").
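Because the script edits .tex files in place, the review step is easier with a before/after comparison. The sketch below is one possible approach, not part of the bundled resources: the helper names snapshot_tex and diff_against_snapshot and the idea of a separate backup directory are assumptions.

```python
import difflib
import shutil
from pathlib import Path


def snapshot_tex(root: Path, backup: Path) -> list:
    """Copy every .tex/.bib file under root into backup, keeping relative paths."""
    copied = []
    # sorted() materializes the listing before any file is written.
    for src in sorted(root.rglob("*")):
        if not src.is_file() or src.suffix not in {".tex", ".bib"}:
            continue
        if backup in src.parents:  # skip the backup itself if it lives under root
            continue
        rel = src.relative_to(root)
        dst = backup / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel)
    return copied


def diff_against_snapshot(root: Path, backup: Path, files: list) -> str:
    """Return a unified diff of the current files versus their snapshot copies."""
    chunks = []
    for rel in files:
        before = (backup / rel).read_text(encoding="utf-8").splitlines(keepends=True)
        after = (root / rel).read_text(encoding="utf-8").splitlines(keepends=True)
        chunks.extend(difflib.unified_diff(before, after, f"a/{rel}", f"b/{rel}"))
    return "".join(chunks)
```

A typical use would be to call snapshot_tex on the workspace, run the validation script, and then read the output of diff_against_snapshot alongside the script's own summary report.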
Key Decisions & Heuristics
- Empty Citation Fixing: The script uses a heuristic lookup table (CITATION_FIX_MAP in the script) to map common phrases to probable citation keys (e.g., "learn in-context" -> brown2020language, "SimCSE retriever" -> gao2021simcse). You must review these automatic fixes for accuracy. If the heuristic fails, you will need to manually determine and apply the correct citation key.
- Broken Reference Resolution: The script identifies the closest matching existing label for a broken reference using string similarity. For example, fig:call-api-v0 will be corrected to fig:call-api. You must verify that the suggested correction is semantically correct (i.e., references the intended figure/table/section).
- Scope: The script operates on all .tex files within the /workspace directory. Ensure your target paper files are located there.
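The two heuristics above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: EXAMPLE_FIX_MAP is a hypothetical stand-in for CITATION_FIX_MAP containing only the two examples quoted above, the source does not say which similarity measure the script uses (difflib from the standard library is substituted here), and the 0.8 cutoff is an arbitrary choice.

```python
import difflib
import re

# Hypothetical stand-in for the script's CITATION_FIX_MAP; the real table's
# contents and structure may differ.
EXAMPLE_FIX_MAP = {
    "learn in-context": "brown2020language",
    "SimCSE retriever": "gao2021simcse",
}

EMPTY_CITE_RE = re.compile(r"\\cite[a-zA-Z]*\{\s*\}")


def suggest_citation(line: str, fix_map: dict = EXAMPLE_FIX_MAP):
    """Return a probable key when a known phrase shares a line with an empty cite."""
    if not EMPTY_CITE_RE.search(line):
        return None
    lowered = line.lower()
    for phrase, key in fix_map.items():
        if phrase.lower() in lowered:
            return key
    return None  # heuristic failed; the key must be determined manually


def suggest_label(broken: str, labels, cutoff: float = 0.8):
    """Return the most similar existing label, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(broken, list(labels), n=1, cutoff=cutoff)
    return matches[0] if matches else None


labels = ["fig:call-api", "tab:results", "sec:intro"]
best = suggest_label("fig:call-api-v0", labels)
key = suggest_citation(r"Large models learn in-context \citep{}.")
```

Both functions return None rather than guessing when no candidate is found, which matches the requirement that automatic suggestions be reviewed and that unresolved cases be handled manually.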
Triggers
Use this skill when the user request involves:
- "Check my LaTeX citations and references."
- "Fix broken \ref{} commands in my paper."
- "Validate that all \cite{} commands match my .bib file."
- "Help me find missing labels or citations."
- Working with file extensions .tex and .bib.
Bundled Resources
- scripts/validate_and_fix.py: The main analysis and repair script. Run this.
- references/common_citations.md: A reference list of common NLP/ML paper citation keys and the contexts they typically appear in. Consult this if you need to manually determine a correct citation key not covered by the script's heuristics.