PDF Figure & Table Extraction Skill

Extract figures and tables from academic PDF papers as tightly cropped images. The Qwen3-VL vision-language model locates each figure or table on the rendered page, and a multi-round quality assessment refines the bounding box before the image is cropped.

Quick Start

bash

# Extract a complete figure (including title, legend, notes)
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "<pdf_path>" "<figure_name>"

# Extract main content only (no title, legend, notes)
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "<pdf_path>" "<figure_name>" --no-extras

# Batch extraction with output directory
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "<pdf_path>" --batch "Figure 1,Figure 2,Table 1" -d "output/asset/"

Features

1. Extraction Modes

Mode	Flag	Description
Complete	(default)	Includes figure number, caption, legend, notes, and main content
Content Only	`--no-extras`	Only the chart/diagram/table data, without surrounding text

2. Multi-Round Quality Assessment

The tool uses an iterative refinement process:

•Round 1: Initial figure detection and bounding box estimation
•Round 2+: Quality assessment with visual feedback (red bounding box overlay)
•Refinement: If quality score < 8/10, coordinates are automatically adjusted
•Termination: Stops when quality is satisfactory or max rounds reached (default: 3)
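The refinement process above can be sketched as a small loop. This is only an illustration under stated assumptions, not the script's actual code: `detect`, `assess`, and `refine` are hypothetical callables standing in for the model calls, and the sketch assumes that the initial detection counts as round 1. The threshold of 8 and the default of 3 rounds come from the description above.

```python
from typing import Callable, Optional, Tuple

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def refine_bbox(
    detect: Callable[[], BBox],
    assess: Callable[[BBox], int],
    refine: Callable[[BBox, int], BBox],
    max_rounds: int = 3,
    threshold: int = 8,
) -> Tuple[BBox, Optional[int], int]:
    """Return (bbox, last_score, rounds_used) after iterative refinement."""
    bbox = detect()  # Round 1: initial detection (hypothetical callable)
    score: Optional[int] = None
    rounds_used = 1
    for round_no in range(2, max_rounds + 1):
        rounds_used = round_no
        score = assess(bbox)  # Round 2+: score the current box out of 10
        if score >= threshold:
            break  # quality is satisfactory, stop early
        bbox = refine(bbox, score)  # adjust coordinates and try again
    return bbox, score, rounds_used


# Stubbed example: the first assessment scores 5, the second scores 9.
scores = iter([5, 9])
result = refine_bbox(
    detect=lambda: (0, 0, 100, 100),
    assess=lambda box: next(scores),
    refine=lambda box, s: (box[0] + 5, box[1] + 5, box[2] - 5, box[3] - 5),
)
print(result)  # ((5, 5, 95, 95), 9, 3)
```

Passing the model calls in as callables keeps the control flow testable without network access; the real tool may structure this differently.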

3. Supported Figure Types

•Figure 1, Figure 2, etc.
•Table 1, Table 2, etc.
•Fig. 1, Fig 1 (abbreviated)
•Figure 1(a), Figure 1 (a), Figure 1a (sub-figures)
•Table 1(a) (sub-tables)
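As an illustration of the name forms listed above, the following sketch parses a label into its kind, number, and optional sub-label. The function `parse_figure_name` and its regular expression are hypothetical and are not taken from the script; in particular, treating the match as case-sensitive is an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern covering the forms listed above:
# "Figure 1", "Fig. 1", "Fig 1", "Table 1", "Figure 1(a)", "Figure 1 (a)", "Figure 1a".
_NAME = re.compile(r"^(Figure|Fig\.?|Table)\s*(\d+)\s*(?:\(([a-z])\)|([a-z]))?$")


def parse_figure_name(name: str) -> Optional[Tuple[str, int, Optional[str]]]:
    """Return (kind, number, sub_label), or None if the name is not recognized."""
    match = _NAME.match(name.strip())
    if match is None:
        return None
    # Normalize the abbreviated forms "Fig." and "Fig" to "Figure".
    kind = "Table" if match.group(1) == "Table" else "Figure"
    sub_label = match.group(3) or match.group(4)
    return kind, int(match.group(2)), sub_label


for label in ["Figure 1", "Fig. 1", "Figure 1 (a)", "Figure 1a", "Table 1(a)"]:
    print(label, "->", parse_figure_name(label))
```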

Usage Examples

Single Figure Extraction

bash

# Complete figure with all elements
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" "Figure 1"

# Only the chart content (no caption)
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" "Figure 1" --no-extras

# With custom output path
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" "Table 2" -o "output/asset/Table_2.png"

# Higher resolution
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" "Figure 3" --dpi 400

Sub-Figure Extraction

bash

# Extract only Figure 1(a) from a composite figure
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" "Figure 1(a)"

# Sub-figure with custom output
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" "Figure 2(b)" -o "output/asset/Figure_2b.png"

Batch Extraction

bash

# Extract multiple figures/tables at once
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" --batch "Figure 1,Figure 2,Figure 3,Table 1,Table 2" -d "output/asset/"

# Batch without extras
python .claude/skills/extract-pdf-figure/scripts/extract_figures.py "PDFs/paper.pdf" --batch "Figure 1,Figure 2" --no-extras -d "output/asset/"

Command Line Options

Option	Short	Description
`pdf_path`		PDF file path (required)
`figure_name`		Figure/Table name (required for single extraction)
`--output`	`-o`	Output image path
`--dpi`		Rendering resolution (default: 300)
`--batch`	`-b`	Comma-separated figure names for batch extraction
`--output-dir`	`-d`	Output directory for batch extraction
`--no-extras`		Exclude title, legend, notes (extract main content only)
`--max-rounds`		Max quality refinement rounds (default: 3)
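For readers who want to see how such an interface could be declared, here is a minimal `argparse` sketch that mirrors the table above. It is not the script's actual parser: `build_parser` and `requested_names` are hypothetical helpers, and making `figure_name` optional so that `--batch` can be used without it is an assumption.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Declare the options from the table above (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(description="Extract figures and tables from a PDF")
    parser.add_argument("pdf_path", help="PDF file path")
    parser.add_argument("figure_name", nargs="?", help="Figure/Table name (single extraction)")
    parser.add_argument("--output", "-o", help="Output image path")
    parser.add_argument("--dpi", type=int, default=300, help="Rendering resolution")
    parser.add_argument("--batch", "-b", help="Comma-separated figure names")
    parser.add_argument("--output-dir", "-d", help="Output directory for batch extraction")
    parser.add_argument("--no-extras", action="store_true", help="Main content only")
    parser.add_argument("--max-rounds", type=int, default=3, help="Max refinement rounds")
    return parser


def requested_names(args: argparse.Namespace) -> List[str]:
    """Return the list of figure/table names to extract."""
    if args.batch:
        # Split on commas and drop surrounding whitespace and empty entries.
        return [name.strip() for name in args.batch.split(",") if name.strip()]
    if args.figure_name:
        return [args.figure_name]
    raise ValueError("either figure_name or --batch is required")


args = build_parser().parse_args(["paper.pdf", "--batch", "Figure 1, Table 1", "-d", "out/"])
print(requested_names(args), args.output_dir, args.dpi)
```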

Output

•Format: PNG images
•Default location: extracted_figures/ directory next to the PDF
•Naming: <pdf_name>_<figure_name>.png
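The default location and naming rule can be illustrated with a short helper. This is a sketch under assumptions: `default_output_path` is a hypothetical function, `<pdf_name>` is taken to be the PDF file name without its extension, and replacing spaces and parentheses in the figure name with underscores is a guess at how the name is made file-safe, not documented behavior.

```python
import re
from pathlib import Path


def default_output_path(pdf_path: str, figure_name: str) -> Path:
    """Build <pdf_dir>/extracted_figures/<pdf_name>_<figure_name>.png."""
    pdf = Path(pdf_path)
    # Assumption: collapse every run of non-alphanumeric characters into "_".
    safe_name = re.sub(r"[^A-Za-z0-9]+", "_", figure_name).strip("_")
    return pdf.parent / "extracted_figures" / f"{pdf.stem}_{safe_name}.png"


print(default_output_path("PDFs/paper.pdf", "Figure 1").as_posix())
# PDFs/extracted_figures/paper_Figure_1.png
```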

Dependencies

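Dependencies are defined in the project's requirements.txt; see README.md for installation instructions.

Main dependencies:

•PyMuPDF - PDF parsing
•Pillow - image processing
•openai - API client
•python-dotenv - environment variable loading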

How It Works

•PDF to Image: Each PDF page is rendered at high resolution (300 DPI default)
•AI Detection: Qwen3-VL model locates the target figure and returns bounding box coordinates
•Quality Check: The extraction is visually assessed; if needed, coordinates are refined
•Precise Cropping: The image is cropped with minimal padding to capture the exact figure
•Output: Saved as an optimized PNG file
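The rendering and cropping arithmetic behind these steps can be sketched in plain Python. PDF page sizes are expressed in points (1/72 inch by default), so rendering at a given DPI scales each dimension by `dpi / 72`. The helper names below are hypothetical, and the padding of 4 pixels is an assumed value, since the text only says "minimal padding".

```python
from typing import Tuple

POINTS_PER_INCH = 72  # default PDF user-space unit


def page_pixels(width_pt: float, height_pt: float, dpi: int = 300) -> Tuple[int, int]:
    """Pixel size of a page rendered at the given DPI."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


def padded_crop_box(
    bbox: Tuple[int, int, int, int],
    image_size: Tuple[int, int],
    padding: int = 4,  # assumed value, not taken from the script
) -> Tuple[int, int, int, int]:
    """Expand (left, top, right, bottom) by `padding`, clamped to the image bounds."""
    left, top, right, bottom = bbox
    width, height = image_size
    return (
        max(0, left - padding),
        max(0, top - padding),
        min(width, right + padding),
        min(height, bottom + padding),
    )


# A US Letter page is 612 x 792 points.
size_300 = page_pixels(612, 792, dpi=300)
size_400 = page_pixels(612, 792, dpi=400)
print(size_300, size_400)  # (2550, 3300) (3400, 4400)
print(padded_crop_box((100, 200, 900, 700), size_300, padding=10))  # (90, 190, 910, 710)
```

The resulting tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed to that method directly.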

Tips for Best Results

•Figure Names: Use the label exactly as it is printed in the paper, including capitalization (e.g., "Figure 1", not "fig 1")
•Composite Figures:
  •Use "Figure 1" to get all sub-figures with the main caption
  •Use "Figure 1(a)" to get only that specific sub-figure
•Tables: A table's caption is typically above the table body, so the tool searches upward for it
•Quality: Use --dpi 400 for very detailed figures
•Content Only: Use --no-extras when you only need the visual data without surrounding text

Additional Resources

•Script location: scripts/extract_figures.py