Framework Audit Skill

Comprehensive governance audit for the academicOps framework.

NO RATIONALIZATION: An audit reports ALL discrepancies. Do NOT justify ignoring files as "generated", "acceptable", or "probably don't need to be tracked". Every gap is reported. The user decides what's acceptable - not the auditor.

Workflow Entry Point

IMMEDIATELY call TodoWrite with the following items, then work through each one:

code

TodoWrite(todos=[
  {content: "Phase 0: Run health metrics script", status: "pending", activeForm: "Running health audit"},
  {content: "Phase 1: Structure audit - compare filesystem to INDEX.md", status: "pending", activeForm: "Auditing structure"},
  {content: "Phase 2: Reference graph - invoke Skill(skill='framework') then run link audit scripts", status: "pending", activeForm: "Building reference graph"},
  {content: "Phase 3: Skill content audit - check size and actionability", status: "pending", activeForm: "Auditing skill content"},
  {content: "Phase 4: Justification audit - check specs for file references", status: "pending", activeForm: "Auditing file justifications"},
  {content: "Phase 4b: Instruction justification - verify every instruction traces to enforcement-map.md", status: "pending", activeForm: "Auditing instruction justifications"},
  {content: "Phase 5: Documentation accuracy - verify README.md flowchart vs hooks", status: "pending", activeForm": "Verifying documentation"},
  {content: "Phase 6: Regenerate indices - invoke Skill(skill='flowchart') for README.md flowchart", status: "pending", activeForm: "Regenerating indices"},
  {content: "Phase 7: Other updates", status: "pending", activeForm: "Finalizing updates"},
  {content: "Phase 8: Save audit report to $ACA_DATA/projects/aops/audit/YYYY-MM-DD-HHMMSS-audit.md", status: "pending", activeForm: "Persisting report"},
  {content: "Phase 9: Create tasks for actionable findings", status: "pending", activeForm: "Creating tasks"}
])

CRITICAL: Work through EACH phase in sequence. When a phase requires a skill, invoke it explicitly as shown below.

Specialized Workflows

Session Effectiveness Audit

Qualitative assessment of session transcripts to evaluate framework performance.

code

Skill(skill="audit", args="session-effectiveness /path/to/transcript.md")

Workflow defined in workflows/session-effectiveness.md.

Individual Scripts (Reference Only)

These scripts run individual checks. They are NOT a substitute for the full workflow:

bash

uv run python scripts/audit_framework_health.py -m  # Phase 0 only
uv run python scripts/check_skill_line_count.py
uv run python scripts/check_broken_wikilinks.py
uv run python scripts/check_orphan_files.py

Phase Instructions

Phase 0: Health Metrics

Run comprehensive health audit first:

bash

cd $AOPS && uv run python scripts/audit_framework_health.py \
  --output /tmp/health-$(date +%Y%m%d).json

This generates:

•/tmp/health-YYYYMMDD.json - Machine-readable metrics
•/tmp/health-YYYYMMDD.md - Human-readable report

Metrics tracked: Component counts, hook coverage, skill sizes, wikilink validation

→ Continue to Phase 1 (do not stop here)

Phase 1: Structure Audit

Compare filesystem to documentation:

•Scan filesystem: find $AOPS -type f -not -path "*/.git/*" -not -path "*/__pycache__/*" | sort
•Compare to INDEX.md: Flag missing or extra entries
•Check cross-references: Verify → references point to existing files
•Find broken wikilinks: Grep for [[...]] patterns, validate targets exist

Phase 2: Reference Graph & Link Audit

First: Invoke Skill(skill="framework") to load framework conventions for linking rules.

Then build reference graph and check linking conventions:

bash

cd $AOPS

# Generate graph
uv run python skills/audit/scripts/build_reference_map.py

# Find orphans and violations
uv run python skills/audit/scripts/find_orphans.py

# Or use the health script for wikilink/orphan checks
uv run python scripts/check_broken_wikilinks.py
uv run python scripts/check_orphan_files.py

Linking rules to enforce (from framework skill):

•Skills via invocation (Skill(skill="x")), not file paths
•No backward links (children → parent)
•Parents must reference children
•Use wikilinks, not backticks for graph connectivity
•Full relative paths in wikilinks

Phase 3: Skill Content Audit

For each $AOPS/skills/*/SKILL.md:

•Size check: Must be <500 lines
•Actionability test: Each section must tell agents WHAT TO DO
•
Content separation violations:
- •❌ Multi-paragraph "why" → move to spec
- •❌ Historical context → delete
- •❌ Reference material >20 lines → move to references/

Phase 4: Justification Audit (Files)

For each significant file in $AOPS/:

•Search specs: Grep $AOPS/specs/ for references
•Check core docs: JIT-INJECTION.md, README.md, INDEX.md
•Classify: Justified / Implicit / Orphan

Skip: __pycache__/, .git/, individual files within skills, tests, assets

Phase 4b: Instruction Justification Audit

Every behavioral instruction injected to agents must trace to enforcement-map.md.

Unjustified instructions are bloat - they cost tokens and create confusion about what's actually enforced.

Sources to scan (files injected at SessionStart or via hooks):

•FRAMEWORK-PATHS.md - core instructions
•AXIOMS.md, HEURISTICS.md - principle statements
•skills/*/SKILL.md - skill-specific instructions
•commands/*.md - command instructions
•agents/*.md - agent instructions

What constitutes a "behavioral instruction":

•Imperative statements: "always do X", "never do Y", "you MUST", "you SHOULD"
•Conditional rules: "when X, do Y", "if X then Y"
•Workflow requirements: "invoke skill X first", "before doing X, check Y"

Validation process:

•Extract behavioral instructions from each source file (look for imperatives, MUSTs, SHOULDs, "always", "never", "before", "first")
•
For each instruction, search enforcement-map.md for:
- •Direct reference to the instruction text
- •Reference to the source file + line number
- •Mapping to an axiom or heuristic that covers this instruction
•
Classify each instruction:
- •Justified: Appears in enforcement-map.md with axiom/heuristic mapping
- •Implicit: Derives from a documented axiom/heuristic but not explicitly in enforcement-map.md
- •Orphan: No traceability - FLAG FOR REVIEW

Example orphan (discovered in session):

code

FRAMEWORK-PATHS.md:35 - "When working with session logs, always invoke Skill(skill='transcript') first"
→ NOT in enforcement-map.md
→ No axiom/heuristic reference
→ ORPHAN - needs justification or removal

Output format:

code

### Instruction Justification Status

**Justified** (N instructions):
- FRAMEWORK-PATHS.md:78 "NEVER hardcode paths" → [[axioms/dry-modular-explicit.md]]

**Implicit** (N instructions):
- skills/python-dev/SKILL.md:42 "use uv run" → derives from [[axioms/use-standard-tools.md]]

**Orphan** (N instructions) - REQUIRES ACTION:
- FRAMEWORK-PATHS.md:35 "invoke transcript skill first for session logs" → NO JUSTIFICATION
- commands/learn.md:56 "..." → NO JUSTIFICATION

Resolution for orphans:

•Create heuristic if rule is valuable
•Add to enforcement-map.md with axiom/heuristic mapping
•Or DELETE the instruction if it's not worth formalizing

Phase 5: Documentation Accuracy

Verify README.md flowchart reflects actual hook architecture:

•Parse Mermaid for hook names
•Compare to hooks/router.py dispatch mappings
•Compare to settings.json hook events
•Flag drift

Phase 6: Regenerate Generated Indices

Generated indices are root-level files for agent consumption (INDEX.md, enforcement-map.md, WORKFLOWS.md, SKILLS.md, AXIOMS.md, HEURISTICS.md, docs/ENFORCEMENT.md). The core loop flowchart is maintained in README.md.

Regenerate each deterministically from sources:

AXIOMS.md and HEURISTICS.md

bash

cd $AOPS && uv run python scripts/generate_principle_indices.py

Reads axioms/ and heuristics/ folders, generates machine-readable indices sorted by priority (1-100 bands: 1-20 core, 21-40 behavioral, 41-60 domain, 61-80 derived, 81-100 experimental).

SKILLS.md

bash

cd $AOPS && uv run python scripts/generate_skills_index.py

Scans aops-core/ and aops-tools/ for skills and commands, extracts frontmatter (name, description), and generates routing index. Preserves existing triggers from previous SKILLS.md. Reports components missing triggers.

INDEX.md

•Scan $AOPS/ directory structure
•Extract file purposes from frontmatter/headers
•Output annotated file tree

enforcement-map.md

Hook→Axiom Declaration Convention:

Every hook that enforces an axiom MUST declare it in its module docstring:

python

"""
Hook description.

Enforces: current-state-machine (Current State Machine)
"""

Multiple axioms: Enforces: fail-fast-code, trust-version-control (Fail-Fast, Trust Version Control)

Derivation sources:

•hooks/*.py docstrings - parse "Enforces:" lines for axiom mappings
•config/claude/settings.json - deny rules map to axioms via comments
•.pre-commit-config.yaml - commit-time checks

Cross-reference validation:

•Parse all hooks for "Enforces:" declarations
•Compare against enforcement-map.md Axiom→Enforcement table
•
Flag discrepancies:
- •Hook declares axiom but enforcement-map.md shows "Prompt" level only
- •enforcement-map.md lists hook but hook lacks "Enforces:" declaration
- •Axiom has Hard/Soft Gate in enforcement-map.md but no hook declares it

Output: Table mapping each axiom to its enforcement mechanism, hook, trigger point, and level.

WORKFLOWS.md

Derive task routing from:

•skills/*/SKILL.md frontmatter - what task types each skill handles
•skills/*/workflows/*.md - skill-specific workflow files (e.g., skills/framework/workflows/05-feature-development.md)
•agents/*.md - what workflows each agent uses
•enforcement-map.md (Soft Gate Guardrails section) - type→guardrail mappings

Skill workflow discovery: Scan all skills/*/workflows/*.md files and extract:

•Workflow name from filename (e.g., 05-feature-development.md → feature-development)
•Purpose from first heading or frontmatter description
•Parent skill from directory path (e.g., skills/framework/workflows/ → framework skill)

Include in output:

•A "Skill Workflows" section listing all discovered workflow files
•Path reference for each (e.g., [[skills/framework/workflows/05-feature-development]])
•Brief description extracted from file

Output: Table of task types, when to use each, workflow, and skill. Plus skill workflow index.

README.md (Core Loop Flowchart)

First: Invoke Skill(skill="flowchart") to load Mermaid diagram conventions.

Regenerate the core loop flowchart section in README.md from hook architecture sources:

•Parse hooks/router.py for dispatch mappings (event→handler)
•Parse config/claude/settings.json for hook event registrations
•Parse hooks/*.py for hook implementations and "Enforces:" declarations
•
Generate Mermaid flowchart following flowchart skill conventions:
- •Use decision diamonds for conditional logic
- •Apply classDef for semantic coloring (hooks, skills, tools, outcomes)
- •Group by execution phase (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop)
- •Label edges with event types and conditions

Structure:

•Vertical main flow showing session lifecycle
•Horizontal insertion points for each hook event
•Subgraphs for phase groupings

Location: Replace the "Core Loop" section in README.md (after the Quick Start section)

docs/ENFORCEMENT.md

Derive practical enforcement guide from:

•docs/ENFORCEMENT.md (existing mechanism ladder content - preserve)
•specs/enforcement.md "Component Responsibilities" section - root cause model

Root Cause Analysis section (append to end):

•Copy root cause definition from spec
•Copy responsibility tables (Pre/Execution/Post phases)
•Copy root cause categories
•Add failure→responsibility mapping examples from observed patterns

Each generated index must include header:

code

> **Generated by audit skill** - Do not edit manually.

Phase 7: Other Updates

•Fix README.md: Update tables including sub-workflows (see below)
•Report orphans: Flag for human review (do NOT auto-delete)
•Report violations: List with file:line refs

Sub-Workflow Extraction for README.md

Skills with multiple workflows/modes MUST have each sub-workflow documented separately in the Skills table.

Detection: For each skills/*/SKILL.md:

•Grep for ^## Workflow: or ^## Mode headers
•Check for workflows/ subdirectory with separate workflow files
•Check for ## Modes section listing multiple invocation patterns

Output format (add third column to Skills table):

markdown

| Skill            | Purpose                     | Sub-workflows                                |
| ---------------- | --------------------------- | -------------------------------------------- |
| session-insights | Session transcript analysis | Current (default), Batch, Issues             |
| audit            | Framework governance        | Full audit (default), Session effectiveness  |
| tasks            | Task lifecycle              | View/archive/create (default), Email capture |

Rules:

•Mark default workflow with "(default)"
•List workflows in order they appear in SKILL.md
•If only one workflow exists, leave sub-workflows column as "—"
•Extract workflow names from ## Workflow: X headers or workflows/*.md filenames

Phase 8: Persist Report (MANDATORY)

Every audit MUST save a written report to $ACA_DATA/projects/aops/audit/.

bash

# Create directory if needed
mkdir -p "$ACA_DATA/projects/aops/audit"

# Generate timestamped filename (format: YYYY-MM-DD-HHMMSS-audit.md)
REPORT_PATH="$ACA_DATA/projects/aops/audit/$(date +%Y-%m-%d-%H%M%S)-audit.md"

Use the Write tool to save the complete audit report (see Report Format below) to $REPORT_PATH.

Report file MUST include:

•YAML frontmatter with date, duration, and summary stats
•All phase results from Phase 0-7
•Clear pass/fail status for each validation criterion

After writing, confirm: Audit report saved to: [path]

Phase 9: Create Tasks for Actionable Findings

Create tasks for findings that require human action.

For each finding from Phases 0-7 that requires action:

•Classify finding type using the mapping below
•Create task with appropriate metadata via tasks MCP
•Track task IDs for summary

Finding Type → Issue Mapping

Finding Type	Priority	Issue Type	Labels
Broken wikilinks	P2	bug	audit,documentation
Orphan files	P3	chore	audit,cleanup
Skill >500 lines	P2	chore	audit,refactor
Explanatory content in skill	P2	chore	audit,refactor
Missing from INDEX.md	P3	chore	audit,documentation
Orphan instruction (no enforcement-map.md trace)	P2	bug	audit,governance
README.md flowchart drift	P2	bug	audit,documentation
Hook→Axiom mismatch	P2	bug	audit,governance

Task Creation Pattern

python

mcp__plugin_aops-core_tasks__create_task(
    title="[Finding Type]: [specific details]",
    type="task",
    priority=[2|3],
    tags=["audit", "[category]"],
    body="[context from audit]"
)

Skip Conditions

Do NOT create issues for:

•Regenerated indices (Phase 6 actions) - already handled
•Pass status findings - no action needed
•Justified files (Phase 4) - no action needed
•Implicit files (Phase 4) - acceptable, no action needed

Output Summary

After creating tasks, add to audit report:

markdown

### Tasks Created

Created N tasks:

- ns-xxx: Broken wikilink: [[foo.md]] in bar.md
- ns-yyy: Orphan file: docs/old.md
- ns-zzz: Skill over limit: skills/big/SKILL.md

Report Format

See [[references/report-format]] for the complete report template and validation criteria.