homework-grading-workflow

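Process scanned homework PDFs by extracting student names via vision, matching to roster, creating individual PDFs per student, and updating completion spreadsheets. Triggers: "process homework PDF", "organize by student", "create student files from scan", "update homework checklist", "who submitted", "sort assignments", "grade papers", "split PDF by student", "track completion", "missing assignments". Handles: batched scanned worksheets, handwritten name recognition, fuzzy roster matching, session resume for large batches (99-image limit), teacher grading automation.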

SKILL.md
---
name: homework-grading-workflow
description: >-
  Process scanned homework PDFs by extracting student names via vision, matching to roster,
  creating individual PDFs per student, and updating completion spreadsheets.
  Triggers: "process homework PDF", "organize by student", "create student files from scan",
  "update homework checklist", "who submitted", "sort assignments", "grade papers",
  "split PDF by student", "track completion", "missing assignments".
  Handles: batched scanned worksheets, handwritten name recognition, fuzzy roster matching,
  session resume for large batches (99-image limit), teacher grading automation.
---

Homework Grading Workflow

Process scanned homework: extract names from pages using vision → match to roster → create individual PDFs → update completion spreadsheets.

Core accuracy rules: Read EVERY page individually (no batching) | Focus on Name field at TOP | Verify EACH PDF after creation

Required inputs: Scanned homework PDF + Student roster spreadsheet (with period sheets) + Completion spreadsheet (optional)

Workflow Overview

```
Phase 0: Planning (BEFORE ANYTHING ELSE)
├── Count pages in PDF (fitz.open → len(doc))
├── CREATE tracking file with batch plan
├── Calculate sessions needed: ceil(total_pages / 99)
├── Display plan to user:
│   "150 pages = 2 sessions (99 + 51)"
└── User confirms before proceeding

Phase 1: Setup
├── Load roster from xlsx skill
├── Extract PDF pages as images
└── Tracking file already exists from Phase 0

Phase 2: Page Analysis (CRITICAL)
├── Check tracking file for current_session and start_page
├── Read EACH page individually (max 99 per session)
├── Find Name field at top
├── Match to roster (fuzzy matching)
├── Save to tracking file after EACH page
└── Stop at session limit, inform user to resume

Phase 3: User Verification
├── Display uncertain pages
├── User confirms/corrects names
└── All pages must be assigned

Phase 4: Create PDFs
├── Group pages by student
├── Create individual PDFs
└── Skip Unknown pages

Phase 5: Update Spreadsheet
├── Add assignment column
├── Mark submissions with X
└── Update Total formula

Phase 6: Verification
├── Check each PDF has correct pages
├── Cross-reference spreadsheet
└── Cleanup temp files
```
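The Phase 0 batch plan is simple arithmetic, so a minimal sketch may help. The function names `plan_sessions` and `describe_plan` are hypothetical (they are not part of the skill's scripts), and the page count is assumed to come from `len(doc)` after `fitz.open`, as noted in the overview.

```python
import math
from typing import List

SESSION_IMAGE_LIMIT = 99  # per-session image limit stated by this skill


def plan_sessions(total_pages: int, limit: int = SESSION_IMAGE_LIMIT) -> List[int]:
    """Return how many pages each session will analyze (hypothetical helper)."""
    if total_pages < 0 or limit <= 0:
        raise ValueError("total_pages must be >= 0 and limit must be > 0")
    sizes = [limit] * (total_pages // limit)  # full sessions
    if total_pages % limit:
        sizes.append(total_pages % limit)     # final partial session
    # Sanity check against the formula in the overview: ceil(total_pages / 99)
    assert len(sizes) == math.ceil(total_pages / limit)
    return sizes


def describe_plan(total_pages: int, limit: int = SESSION_IMAGE_LIMIT) -> str:
    """Format the plan line shown to the user before they confirm."""
    sizes = plan_sessions(total_pages, limit)
    noun = "session" if len(sizes) == 1 else "sessions"
    breakdown = " + ".join(str(size) for size in sizes)
    return f"{total_pages} pages = {len(sizes)} {noun} ({breakdown})"


print(describe_plan(150))  # 150 pages = 2 sessions (99 + 51)
```

Returning the per-session sizes rather than only the session count lets the same result feed both the displayed plan and the batch plan written to the tracking file.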

Quick Reference

| Phase | Reference File | Script/Template |
|---|---|---|
| Decision Flow | workflow-flowchart.md | - |
| Page Analysis | page-analysis.md | scripts/extract_pages.py |
| Status Tracking | status-tracking.md | scripts/update_status.py, templates/homework-grading-status.yaml |
| PDF Creation | pdf-creation.md | scripts/create_student_pdfs.py |
| Spreadsheet | spreadsheet-update.md | - |
| Verification | verification.md | - |
| Troubleshooting | troubleshooting.md | - |
| Multi-Model Analysis | clink-integration.md | PAL MCP clink tool |

Required Skills Integration

Always invoke these skills:

```
Skill: document-skills:xlsx  # For spreadsheet operations
Skill: document-skills:pdf   # For PDF manipulation
```

Critical Rules

Phase 0 (BEFORE ANYTHING ELSE):

  • Create tracking file FIRST → Calculate batch plan (ceil(pages/99)) → Display plan → Get user confirmation

Page Analysis:

  • ONE page per Read call (never batch) | Focus on "Name:" at TOP | Save to tracking file after EACH page | Stop at 99-image limit

Verification:

  • Open and verify EACH student PDF | User MUST resolve uncertain pages | Cross-check spreadsheet X marks

Red Flags → STOP

| If you see... | Do this instead |
|---|---|
| No tracking file created | Create tracking file FIRST with batch plan |
| Batching pages | Read ONE page per Read call |
| Unknown pages being processed | User MUST confirm names first |
| Skipping PDF verification | Open and check EACH PDF |
| Not saving after each page | Save to tracking file after EVERY page |

Confidence levels: high (exact match) → proceed | medium (fuzzy match) → proceed with note | low/unknown → flag for user review. See page-analysis.md for details. For difficult handwriting, use clink to get a second opinion from another model; see clink-integration.md.
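One way to map a name read from a page onto these confidence levels is sketched below using the standard library's `difflib.SequenceMatcher`. This is an illustration only: the actual matching logic lives in page-analysis.md, the function name `match_to_roster` is hypothetical, and the 0.8 similarity cutoff is an assumption that this skill does not specify.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

FUZZY_THRESHOLD = 0.8  # assumed cutoff; not specified by this skill


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so comparisons ignore spacing and case."""
    return " ".join(name.lower().split())


def match_to_roster(
    extracted: str, roster: List[str], threshold: float = FUZZY_THRESHOLD
) -> Tuple[Optional[str], str]:
    """Return (roster_name_or_None, confidence) for a name read from a page."""
    target = normalize(extracted)
    if not target:
        return None, "unknown"  # no legible name: flag for user review

    # high: exact match after normalization
    for student in roster:
        if normalize(student) == target:
            return student, "high"

    # medium: best fuzzy match at or above the assumed threshold
    best_name, best_score = None, 0.0
    for student in roster:
        score = SequenceMatcher(None, target, normalize(student)).ratio()
        if score > best_score:
            best_name, best_score = student, score
    if best_name is not None and best_score >= threshold:
        return best_name, "medium"

    # low: nothing close enough, so the user must confirm the name
    return None, "low"


roster = ["Maria Garcia", "John Smith", "Aisha Khan"]
print(match_to_roster("Jon Smith", roster))
```

A stricter threshold sends more pages to user review; a looser one risks assigning a page to the wrong student, so the cutoff is worth tuning against real handwriting samples.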

Session Resume

Each session is limited to 99 images. The tracking file ({output_folder}/homework-grading-status.yaml) enables automatic resume. See status-tracking.md for the schema and resume workflow.
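The resume logic can be sketched on an in-memory dict, as below. The real schema is defined in status-tracking.md, so every key other than `current_session` and `start_page` (which the workflow overview mentions) is an assumption, and the helper names are hypothetical. In practice the dict would be loaded from and written back to the YAML tracking file after each page, for example with PyYAML's `yaml.safe_load` and `yaml.safe_dump`.

```python
from typing import Any, Dict

SESSION_IMAGE_LIMIT = 99  # per-session image limit stated by this skill


def next_batch(state: Dict[str, Any], limit: int = SESSION_IMAGE_LIMIT) -> range:
    """Return the 1-based page numbers to analyze in the session that starts now."""
    start = state["start_page"]
    end = min(start + limit - 1, state["total_pages"])  # "total_pages" is an assumed key
    return range(start, end + 1)


def record_page(state: Dict[str, Any], page: int, student: str, confidence: str) -> None:
    """Store one page result and advance the resume point (save to disk after this)."""
    pages = state.setdefault("pages", {})  # "pages" is an assumed key
    pages[page] = {"student": student, "confidence": confidence}
    state["start_page"] = page + 1


def finish_session(state: Dict[str, Any]) -> None:
    """Mark the session as done so the next run starts a fresh 99-image budget."""
    state["current_session"] += 1


state = {"total_pages": 150, "current_session": 1, "start_page": 1}
for page in next_batch(state):  # pages 1..99
    record_page(state, page, "Unknown", "low")
finish_session(state)
print(state["current_session"], state["start_page"])  # 2 100
```

Because `start_page` advances after every page, a session interrupted midway resumes at the first unanalyzed page rather than repeating the whole batch.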

Output Structure

```
{output_folder}/
├── homework-grading-status.yaml
├── Student Individual Files/{Student}.pdf
└── {completion_spreadsheet} (updated)
```
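The grouping step that precedes PDF creation, and the mapping onto this folder layout, can be sketched as follows. The helper names are hypothetical and the sketch stops short of writing PDFs; the skill's own implementation is scripts/create_student_pdfs.py. The input is assumed to be a mapping from 1-based page number to the confirmed student name, with unresolved pages labeled "Unknown".

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

UNKNOWN = "Unknown"  # assumed label for pages the user has not resolved


def group_pages_by_student(assignments: Dict[int, str]) -> Dict[str, List[int]]:
    """Group page numbers by student in page order, skipping Unknown pages."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for page in sorted(assignments):
        student = assignments[page]
        if student == UNKNOWN:
            continue  # Unknown pages are never written to a student PDF
        grouped[student].append(page)
    return dict(grouped)


def student_pdf_path(output_folder: str, student: str) -> Path:
    """Build the per-student output path shown in the structure above."""
    return Path(output_folder) / "Student Individual Files" / f"{student}.pdf"


assignments = {1: "Maria Garcia", 2: "John Smith", 3: "Maria Garcia", 4: UNKNOWN}
for student, pages in group_pages_by_student(assignments).items():
    print(student_pdf_path("out", student), pages)
```

Keeping the page lists sorted means each student's PDF preserves the original scan order, which makes the per-PDF verification step easier to do by eye.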

Prerequisites

```bash
pip install PyMuPDF pandas openpyxl pyyaml filelock
```