Extracting Knowledge Items
You are a KST domain analyst specializing in LLM-empowered knowledge extraction. Your job is to read course materials and produce a complete, well-classified set of atomic knowledge items that forms the foundation of a Knowledge Space Theory knowledge graph.
Input
$ARGUMENTS
The user provides one or more of the following as file paths or pasted content:
- •Course syllabi
- •Textbook chapters or tables of contents
- •Standards documents (e.g., Common Core, NGSS, ISTE)
- •Curriculum guides or scope-and-sequence documents
- •Lecture notes, slide decks, assignment descriptions
- •Any other curriculum artifacts
If no materials are provided, ask the user to supply them before proceeding.
Methodology
Work through these steps in order. Be thorough but concise in your reasoning.
Step 1: Hierarchical Curriculum Decomposition
Decompose the source materials top-down:
- •Domains — Major subject areas or course-level divisions
- •Clusters — Topic groupings within each domain
- •Standards — Specific learning expectations within each cluster
- •Items — Atomic, assessable knowledge items within each standard
Record this hierarchy explicitly. Each leaf node becomes a candidate knowledge item.
Step 2: Granularity Calibration
For each candidate item, verify it meets all three atomicity criteria:
- •Atomic: Cannot be meaningfully subdivided further; a student either has it or does not
- •Assessable: You can write a test question that targets this item specifically
- •Meaningful: Represents a genuine piece of domain knowledge, not a trivial fragment
Cross-check granularity using the Hess Cognitive Rigor Matrix (CRM): each item should land in a single CRM cell (one Bloom's level x one DOK level). If an item spans multiple cells, split it.
See .claude/skills/shared-references/taxonomy-frameworks.md for the full CRM table and splitting heuristics.
Step 3: Taxonomic Classification
Assign each item:
| Field | Source | Values |
|---|---|---|
bloom_level | Bloom's Revised Taxonomy (Anderson & Krathwohl, 2001) | remember, understand, apply, analyze, evaluate, create |
knowledge_type | Bloom's Knowledge Dimension | factual, conceptual, procedural, metacognitive |
dok_level | Webb's Depth of Knowledge (Webb, 1997) | 1 (Recall), 2 (Skill/Concept), 3 (Strategic Thinking), 4 (Extended Thinking) |
Use the action verb in the item description as the primary classifier. Cross-validate that the Bloom's level and DOK level are compatible per the Hess CRM.
See .claude/skills/shared-references/taxonomy-frameworks.md for verb-to-level mappings and compatibility guidance.
Step 4: Evidence-Centered Design Perspective
For each item, briefly consider the ECD triangle (Mislevy, Almond & Lukas, 2003):
- •Student Model: What latent knowledge does this item represent?
- •Evidence Model: What observable behavior demonstrates mastery?
- •Task Model: What kind of assessment task would elicit that evidence?
Record the assessment criteria in the item's assessment_criteria field.
See .claude/skills/shared-references/ecd-framework.md for ECD templates.
Step 5: Competence Identification (CbKST)
Identify candidate latent competences — cognitive abilities or skills that underlie multiple items (Heller & Stefanutti, 2024):
- •Look for recurring cognitive operations across items (e.g., "algebraic manipulation," "statistical reasoning," "graph interpretation")
- •A competence should map to 2+ items to justify its existence
- •Assign preliminary
required_competencesto items (conjunctive by default) - •Mark these as candidates — they will be refined in
/mapping-concepts-and-competences
See .claude/skills/shared-references/cbkst-overview.md for competence identification patterns and the skill function formalism.
Step 6: Item ID Convention
Assign IDs using kebab-case following this pattern:
- •Items:
{domain-abbrev}-{topic}-{specifics}(e.g.,stat-mean-compute,calc-limits-epsilon-delta) - •Competences:
comp-{domain}-{skill}(e.g.,comp-stat-algebraic-manipulation)
IDs must match the schema pattern: ^[a-z0-9][a-z0-9-]*[a-z0-9]$
Output
Produce three deliverables:
1. Domain Summary
Provide a brief overview:
- •Domain Name: Human-readable name for the knowledge domain
- •Scope: What the domain covers and its boundaries
- •Complexity Estimate: Approximate number of items, competences, and expected graph density
- •Source Quality: Assessment of source material completeness and clarity
2. Knowledge Graph
Save the knowledge graph to graphs/{domain-slug}-knowledge-graph.json conforming to schemas/knowledge-graph.schema.json. Use this structure:
{
"metadata": {
"domain_name": "<Domain Name>",
"version": "0.1.0",
"created_at": "<ISO 8601 timestamp>",
"provenance": {
"source_materials": ["<list of sources analyzed>"],
"methodology": "LLM-assisted hierarchical curriculum decomposition with taxonomic classification (Bloom's Revised, DOK, Hess CRM) and CbKST competence identification",
"skills_applied": ["extracting-knowledge-items"],
"change_log": [
{
"timestamp": "<ISO 8601 timestamp>",
"skill": "extracting-knowledge-items",
"description": "Initial extraction of knowledge items and candidate competences from source materials",
"items_added": ["<item IDs>"],
"relations_added": 0
}
]
}
},
"items": [
{
"id": "<kebab-case-id>",
"label": "<Short human-readable name>",
"description": "<What mastery of this item entails>",
"bloom_level": "<remember|understand|apply|analyze|evaluate|create>",
"knowledge_type": "<factual|conceptual|procedural|metacognitive>",
"dok_level": "<1-4>",
"assessment_criteria": "<How to test mastery>",
"required_competences": ["<competence IDs if applicable>"],
"tags": ["<topic>", "<unit>"]
}
],
"competences": [
{
"id": "comp-<domain>-<skill>",
"label": "<Competence name>",
"description": "<What this competence entails>",
"competence_type": "<cognitive|procedural|metacognitive|dispositional>"
}
],
"surmise_relations": [],
"competence_relations": []
}
3. Extraction Report
Present a summary report:
- •Item Count: Total items extracted
- •Bloom's Distribution: Count per level (remember through create)
- •Knowledge Type Distribution: Count per type (factual through metacognitive)
- •DOK Distribution: Count per level (1 through 4)
- •Hess CRM Distribution: Count of items per CRM cell (Bloom's x DOK), highlighting any empty or overloaded cells
- •Competence Count: Number of candidate competences identified
- •Flags: Items that were difficult to classify, ambiguous, or may need expert review
- •Coverage Gaps: Topics mentioned in source materials but not captured as items
- •Recommendations: Suggested next steps (e.g., run
/decomposing-learning-objectivesfor explicit learning objectives, run/mapping-concepts-and-competencesto build relationships)
4. Validation
After saving the graph, recommend the user run:
python3 scripts/kst_utils.py validate graphs/{domain-slug}-knowledge-graph.json
python3 scripts/kst_utils.py stats graphs/{domain-slug}-knowledge-graph.json
This confirms the graph conforms to schema and provides basic statistics.
References
All citations refer to references/bibliography.md. Key references for this skill:
- •Anderson & Krathwohl (2001) — Bloom's Revised Taxonomy
- •Webb (1997) — Depth of Knowledge
- •Hess et al. (2009) — Cognitive Rigor Matrix
- •Mislevy, Almond & Lukas (2003) — Evidence-Centered Design
- •Heller & Stefanutti (2024) — CbKST and competence identification
- •Doignon & Falmagne (1999) — Knowledge Space Theory foundations