Datasheet Intelligence
Objective
- Produce evidence-grounded hardware answers and code from PDF/DOCX/XLSX datasheets.
- Prefer fast mode by default; use `--structured` only when table/header fidelity is required.
Context Policy
- Keep `SKILL.md` minimal and procedural.
- Run `scripts/toc.py`, `scripts/search.py`, `scripts/read.py` directly before loading extra references.
- Load `references/usage.md` only for detailed flags or format-specific examples.
Prerequisites
Use uv with this skill's pyproject.toml and uv.lock.
Do not rely on PEP 723 inline script metadata.
```bash
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Use one of these execution contexts:
- Recommended (works from any directory): `uv run --project skills/datasheet-intelligence ...`
- Alternative: `cd skills/datasheet-intelligence && uv run ...`
All command examples below use the recommended --project style.
Mandatory Execution Loop
- MUST identify candidate pages first with `scripts/toc.py` or `scripts/search.py` before large reads.
- MUST read targeted ranges with `scripts/read.py --pages` and expand iteratively.
- MUST verify every critical claim (address, bit position, reset value, formula) with source file + page/section.
- MUST rerun extraction on mismatch or ambiguity (`search -> read -> search`).
- MUST follow `Tip:`/`Try:` guidance from script errors, then rerun.
- MUST NOT finalize the answer until critical code settings are mapped to citations.
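The search -> read -> search cycle above can be sketched in Python. This is only an illustration of the control flow: `evidence_loop`, `search`, and `read` are hypothetical names, and the two callables stand in for whatever wraps `scripts/search.py` and `scripts/read.py`; their return shapes are assumptions, not the scripts' actual output formats.

```python
def evidence_loop(seed_keywords, search, read, max_rounds=5):
    """Iterate search -> read -> search until no new keywords turn up.

    search(keyword) -> list of page numbers      (hypothetical wrapper)
    read(page)      -> (text, discovered_keywords)  (hypothetical wrapper)
    """
    pending = list(seed_keywords)
    seen_keywords = set()
    pages_read = {}
    for _ in range(max_rounds):
        if not pending:
            break
        next_pending = []
        for keyword in pending:
            if keyword in seen_keywords:
                continue
            seen_keywords.add(keyword)
            for page in search(keyword):
                if page in pages_read:
                    continue  # never re-read a page already collected
                text, discovered = read(page)
                pages_read[page] = text
                # Queue keywords found on this page for the next search round.
                next_pending.extend(k for k in discovered if k not in seen_keywords)
        pending = next_pending
    return pages_read
```

The `max_rounds` cap is a design choice of this sketch so that a document with endless cross-references cannot keep the loop running indefinitely.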
Workflow
PDF Datasheets
Choose the strategy by document size.
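The size thresholds used in the subsections that follow can be captured in a small selector. This is a minimal sketch: `pick_strategy` is a hypothetical helper, not part of the skill's scripts, and it assumes a document of exactly 150 pages follows the `>= 150` rule and counts as large.

```python
def pick_strategy(page_count: int) -> str:
    """Map a PDF page count to one of the three workflow sizes."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    if page_count < 50:
        return "small"
    if page_count < 150:
        return "medium"
    # Assumption: exactly 150 pages is treated as large.
    return "large"
```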
Small (< 50 pages)
- Run `scripts/toc.py` (use `--structured` if bookmarks are missing).
- Run `scripts/read.py --pages` for relevant sections.
- Add `--structured` if tables are broken.
Medium (50-149 pages)
- Run `scripts/toc.py --filter` to narrow sections.
- Run `scripts/search.py` to locate exact pages.
- Run `scripts/read.py --pages` for focused ranges (use `--structured` for table-heavy ranges).
Large (>= 150 pages)
Never read the whole document at once.
- Run `scripts/toc.py` to map sections and page ranges.
- If `scripts/toc.py` reports no bookmarks, switch immediately to `scripts/search.py` (search-first) instead of full `--structured` TOC.
- Skip low-value sections (Legal, Revision History, Ordering Info, Package Drawing).
- Run `scripts/search.py` for exact keyword locations (`--unique-pages` recommended for long documents).
- Run `scripts/read.py --pages` in 10-30 page chunks.
- Iterate: read -> discover new keywords -> search again -> read again.
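Splitting a long section into 10-30 page reads can be planned up front. The sketch below is illustrative: `chunk_page_ranges` is a hypothetical helper, and it assumes `--pages` takes a `start-end` string like the `464-470` form used in this skill's examples.

```python
def chunk_page_ranges(first: int, last: int, chunk: int = 20) -> list:
    """Split an inclusive page range into 'start-end' strings of at most `chunk` pages."""
    if not 10 <= chunk <= 30:
        raise ValueError("chunk should stay within the 10-30 page guideline")
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    ranges = []
    start = first
    while start <= last:
        end = min(start + chunk - 1, last)
        ranges.append(f"{start}-{end}")
        start = end + 1
    return ranges
```

For example, `chunk_page_ranges(440, 490)` yields `["440-459", "460-479", "480-490"]`, and each string can be passed to a separate `scripts/read.py --pages` call.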
High-priority large-PDF sections:
| Priority | Section | Why |
|---|---|---|
| High | Register Map / List | Addresses, bit fields, reset values |
| High | Address Map | Base addresses, memory map |
| Medium | Pin Description / GPIO | Pin functions, function select |
| Medium | Electrical Characteristics | Voltage/current constraints |
| Medium | Clock / Timing | Timing formulas, divider rules |
| Low | Reset Controller | Reset release sequence |
| Lowest | Legal / Ordering / Revision | Usually not needed |
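The priority table can be applied to TOC titles mechanically. The following is a rough sketch, not part of the skill: `section_priority` and `rank_sections` are hypothetical helpers, the keyword prefixes are guesses derived from the table, `package` is added from the skip list for large PDFs, and unmatched titles are assumed to rank as Low.

```python
import re

# (word prefix, priority): 0 = High, 1 = Medium, 2 = Low, 3 = Lowest.
RULES = [
    ("register", 0), ("address", 0),
    ("pin", 1), ("gpio", 1), ("electrical", 1), ("clock", 1), ("timing", 1),
    ("reset", 2),
    ("legal", 3), ("ordering", 3), ("revision", 3), ("package", 3),
]

def section_priority(title: str) -> int:
    """Return the priority of the first rule whose prefix starts any word in the title."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    for prefix, priority in RULES:
        if any(word.startswith(prefix) for word in words):
            return priority
    return 2  # assumption: unknown sections are treated as Low

def rank_sections(titles):
    """Order titles by priority (stable within a level) and drop Lowest entries."""
    ranked = sorted((section_priority(t), i, t) for i, t in enumerate(titles))
    return [title for priority, _, title in ranked if priority < 3]
```

Matching on word prefixes rather than raw substrings avoids false hits such as `pin` inside "Shipping".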
DOCX / XLSX
- Run `scripts/search.py` first to find candidate paragraphs/rows.
- Run `scripts/read.py` for targeted reading.
- Use `scripts/read.py --structured` when layout/table structure is critical.
- If no hits, expand keywords and retry search before full reading.
Quick Commands
Use --structured only when table/header fidelity is required.
```bash
SKILL_DIR="skills/datasheet-intelligence"

# 1) Find candidate pages first
uv run --project "$SKILL_DIR" "$SKILL_DIR/scripts/search.py" docs/rp2040.pdf "IC_CON" "I2C0_BASE" --unique-pages

# 2) Read only selected ranges
uv run --project "$SKILL_DIR" "$SKILL_DIR/scripts/read.py" docs/rp2040.pdf --pages 464-470

# 3) Switch to structured mode only if layout fidelity is critical
uv run --project "$SKILL_DIR" --with docling "$SKILL_DIR/scripts/read.py" docs/rp2040.pdf --pages 464-470 --structured
```
For full flags and format-specific examples, read references/usage.md.
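When these commands are driven from Python, building the argument list explicitly keeps the `--project` context in every call. This is a minimal sketch: `build_read_command` is a hypothetical helper, and it uses only the flags that appear in the quick commands for this skill (`--project`, `--with docling`, `--pages`, `--structured`).

```python
SKILL_DIR = "skills/datasheet-intelligence"

def build_read_command(doc, pages, structured=False, skill_dir=SKILL_DIR):
    """Build the argv list for a targeted scripts/read.py call."""
    cmd = ["uv", "run", "--project", skill_dir]
    if structured:
        # Structured mode pulls in docling, as in the quick commands.
        cmd += ["--with", "docling"]
    cmd += [f"{skill_dir}/scripts/read.py", doc, "--pages", pages]
    if structured:
        cmd.append("--structured")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; passing a list instead of a shell string avoids quoting problems with document paths that contain spaces.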
Operational Rules
- Start with TOC for PDF workflows.
- If bookmarks are missing, switch to search-first flow and avoid full structured TOC for very large PDFs.
- Keep explicit project context in every command (`uv run --project ...`).
- Read enough neighboring context to avoid missing table headers/footnotes.
- Cross-check register values against Address Map / Register List sections.
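The cross-check rule amounts to recomputing each absolute address as base plus offset and comparing it with the value printed in the register list. The sketch below shows one way to do that; `cross_check` is a hypothetical helper, and every name and number in the usage example is a made-up placeholder, not a value from any datasheet.

```python
def cross_check(base_addresses, register_offsets, listed_addresses):
    """Return (register, computed, listed) for every address that disagrees.

    base_addresses:   {block: base address from the Address Map}
    register_offsets: {register: (block, offset from the register description)}
    listed_addresses: {register: absolute address from the Register List}
    """
    conflicts = []
    for register, (block, offset) in register_offsets.items():
        computed = base_addresses[block] + offset
        listed = listed_addresses.get(register)
        if listed is not None and listed != computed:
            conflicts.append((register, computed, listed))
    return conflicts

# Placeholder values for illustration only.
bases = {"BLOCK_A": 0x1000}
offsets = {"REG_X": ("BLOCK_A", 0x04), "REG_Y": ("BLOCK_A", 0x08)}
listed = {"REG_X": 0x1004, "REG_Y": 0x100C}
conflicts = cross_check(bases, offsets, listed)  # REG_Y disagrees
```

Any entry in `conflicts` is a candidate for rerunning extraction on the affected pages before the value is used in code.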
Output Contract
- MUST provide evidence for each critical claim (address, bit position, reset value, formula): source file + page/section.
- MUST map important code settings to evidence locations.
- MUST mark unverifiable values as unverified.
- MUST report table/prose conflicts and separate uncertain items.
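One way to keep this contract honest is to carry the citation alongside every value and derive the "unverified" label from its absence. The following is only a sketch: `Claim` and `render_claims` are hypothetical names, the output format is an assumption, and the values in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    name: str
    value: str
    source: Optional[str] = None    # source file, e.g. the datasheet filename
    location: Optional[str] = None  # page or section within that file

    @property
    def verified(self) -> bool:
        # A claim counts as verified only when both parts of the citation exist.
        return bool(self.source and self.location)

def render_claims(claims):
    """Format each claim with its citation, or flag it as unverified."""
    lines = []
    for claim in claims:
        if claim.verified:
            lines.append(f"{claim.name} = {claim.value} [{claim.source}, {claim.location}]")
        else:
            lines.append(f"{claim.name} = {claim.value} [UNVERIFIED]")
    return lines
```

For instance, `Claim("REG_X address", "0x1004", "example.pdf", "p. 12")` renders with its citation, while the same claim without a page renders as `[UNVERIFIED]`.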
Resources
| Script | Role | Formats | `--structured` |
|---|---|---|---|
| `scripts/toc.py` | TOC extraction | PDF, DOCX | Yes |
| `scripts/read.py` | Targeted reading | PDF, DOCX, XLSX | Yes |
| `scripts/search.py` | Keyword search | PDF, DOCX, XLSX | Yes |
See `references/usage.md` for detailed examples.