Foundation: Building Fundamental Understanding
Build deep foundational understanding of a domain. Goes beyond surface definitions to explore meaning, connectivity, and historical development of concepts.
Position in Learning Pipeline
domain-vocab → foundation    → trace/frontier/paper-flow
(WHAT)         (WHY/HOW)       (WHERE)
Concepts     → Understanding → Research Origins
When to Use
- •"Build foundation for [domain]"
- •"I know the terms but don't understand how they connect"
- •"Foundation [domain]"
- •"[domain] fundamentals deeply"
- •"How did [domain] develop?"
- •"What's the big picture of [domain]?"
- •After domain-vocab, before diving into papers
When NOT to Use
- •Just need term definitions → use domain-vocab
- •Want specific paper lineage → use trace
- •Want latest research → use frontier
- •Want full concept + paper treatment → use deep-dive
Core Value
Knowing terms is vocabulary. Understanding how they connect, why they exist, and how they evolved is fluency.
Foundation transforms scattered concepts into a coherent mental map.
Workflow
Phase 1: Deep Meaning Extraction
Objective: Go beyond dictionary definitions to understand what terms really mean.
Actions:
- •For each key concept, extract:
  - •Literal meaning: What the term denotes
  - •Connotative meaning: What practitioners imply when using it
  - •Historical meaning: How the meaning evolved over time
  - •Contextual meaning: How meaning shifts in different sub-domains
- •Identify semantic layers:
  - •Surface level (beginner understanding)
  - •Working level (practitioner understanding)
  - •Deep level (expert/researcher understanding)
Output:
concept: "Gradient Descent"
meanings:
  literal: "Iteratively moving in the direction of steepest decrease"
  connotative: "The workhorse optimization method; implies iterative refinement"
  historical:
    - 1847: Cauchy introduces it for solving systems of equations
    - 1960s: Applied to neural networks
    - 2010s: Becomes synonymous with "training"
  contextual:
    optimization: "A first-order method"
    deep_learning: "The training algorithm"
    practitioner: "Just run .backward() and .step()"
semantic_layers:
  surface: "Go downhill to find minimum"
  working: "Compute gradients, update parameters with learning rate"
  deep: "Navigating loss landscape geometry, escaping saddles"
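The working-level meaning ("compute gradients, update parameters with learning rate") can be made concrete with a few lines of Python. This is a minimal sketch and not part of the skill itself: the function name `gradient_descent`, the quadratic objective, and the default learning rate and step count are all assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Minimize a 1-D function given its gradient function `grad`.

    All names and defaults here are illustrative assumptions.
    """
    x = x0
    for _ in range(steps):
        # Move against the gradient, scaled by the learning rate.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(minimum, 4))  # → 3.0
```

With this objective, each step shrinks the distance to the minimum by a constant factor, which is why the surface-level picture ("go downhill") works so well here. On non-convex loss landscapes the same update rule can stall at saddle points, which is the deep-level concern.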
Phase 2: Connectivity Mapping
Objective: Reveal how concepts relate, depend, and interact.
Actions:
- •Build Prerequisite Graph:
  - •What must you understand before concept X?
  - •What concepts assume X as given?
- •Build Synergy Map:
  - •Which concepts amplify each other?
  - •Which are frequently used together?
- •Build Tension Map:
  - •Which concepts compete or conflict?
  - •What are the fundamental tradeoffs?
- •Identify Bridge Concepts:
  - •Concepts that connect different sub-domains
  - •Concepts imported from other fields
Output:
connectivity:
  prerequisites:
    "Neural Network":
      requires: ["Linear Algebra", "Calculus", "Probability"]
      enables: ["Deep Learning", "Backpropagation", "Architectures"]
  synergies:
    - concepts: ["Attention", "Transformer"]
      relationship: "Attention enables Transformer's parallelism"
    - concepts: ["Dropout", "Regularization"]
      relationship: "Dropout implements stochastic regularization"
  tensions:
    - concepts: ["Bias", "Variance"]
      tradeoff: "Reducing one typically increases the other"
    - concepts: ["Interpretability", "Accuracy"]
      tradeoff: "Complex models harder to explain"
  bridges:
    - concept: "Information Theory"
      connects: ["ML Theory", "Compression", "Generalization"]
      imported_from: "Communication Theory (Shannon)"
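One way to put a prerequisite graph to work is to turn it into a learning order in which every concept appears after the concepts it requires. The sketch below uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+). The `requires` mapping is hypothetical sample data shaped like the `prerequisites` output, and `learning_order` is an illustrative name, not something the skill defines.

```python
from graphlib import TopologicalSorter

# Hypothetical sample data: each concept maps to the concepts it requires.
requires = {
    "Neural Network": {"Linear Algebra", "Calculus", "Probability"},
    "Backpropagation": {"Neural Network", "Calculus"},
    "Deep Learning": {"Neural Network", "Backpropagation"},
}


def learning_order(requires):
    """Return concepts ordered so that every prerequisite comes first.

    TopologicalSorter raises graphlib.CycleError if the graph is circular,
    which would signal a mistake in the prerequisite map.
    """
    return list(TopologicalSorter(requires).static_order())


order = learning_order(requires)
print(order)
```

The exact order among independent concepts (for example "Linear Algebra" and "Probability") is not guaranteed. Only the prerequisite constraints are.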
Phase 3: Historical Flow
Objective: Trace the development arc of the field.
Actions:
- •Identify Era Boundaries:
  - •What paradigm shifts divided the field's history?
  - •What changed "before" vs "after" key moments?
- •Map Evolution Threads:
  - •How did major ideas evolve?
  - •What parallel paths merged or diverged?
- •Extract Key Inflection Points:
  - •Breakthroughs that changed everything
  - •Failures that redirected the field
  - •External events that accelerated progress
- •Capture Conceptual Archaeology:
  - •Ideas that died and were resurrected
  - •Terms that changed meaning over time
  - •Approaches that fell out of favor and why
Output:
## Historical Flow: Machine Learning
### Era Map
| Era | Period | Paradigm | Key Development |
|-----|--------|----------|-----------------|
| Symbolic | 1950-1980 | Logic & rules | Expert systems, LISP |
| Connectionist Winter | 1970-1986 | Criticism of neural nets | Minsky and Papert's Perceptrons critique (1969) |
| Revival | 1986-2006 | Backpropagation | Rumelhart, Hinton, Williams |
| Deep Learning | 2006-2017 | Deep architectures | GPU training, ImageNet |
| Foundation Models | 2017-now | Scale + attention | Transformers, GPT, BERT |
### Inflection Points
- **1986**: Backpropagation popularized → Neural networks become trainable
- **2012**: AlexNet wins ImageNet → Deep learning proven at scale
- **2017**: "Attention Is All You Need" → Transformer architecture
- **2020**: GPT-3 → Few-shot learning without fine-tuning
### Resurrection Stories
- **Neural Networks**: Dismissed in the 1970s → Revived in 1986 and again in the 2000s
- **Reinforcement Learning**: Academic curiosity → AlphaGo 2016
- **Perceptrons**: Single-layer models "can't learn XOR" → Multilayer networks overcome this
Phase 4: Structural Understanding
Objective: Build the "big picture" mental model of the field.
Actions:
- •Identify Core Pillars:
  - •What are the 3-5 fundamental ideas the field rests on?
  - •What would collapse if removed?
- •Map Sub-domain Architecture:
  - •How does the field divide into areas?
  - •What are the boundaries and overlaps?
- •Extract Governing Principles:
  - •What laws, theorems, or heuristics guide the field?
  - •What are the "physics" of this domain?
- •Identify Open Questions:
  - •What does the field not yet understand?
  - •Where are the active frontiers?
Output:
structure:
  core_pillars:
    - "Optimization": "Finding parameters that minimize loss"
    - "Generalization": "Performance on unseen data"
    - "Representation": "Learning useful features"
    - "Architecture": "Structure of the model"
  sub_domains:
    supervised:
      includes: ["Classification", "Regression"]
      key_assumption: "Labeled training data available"
    unsupervised:
      includes: ["Clustering", "Dimensionality Reduction", "Generative"]
      key_assumption: "Find structure without labels"
    reinforcement:
      includes: ["Policy Learning", "Value Functions", "Exploration"]
      key_assumption: "Learn from interaction with environment"
  governing_principles:
    - "No Free Lunch": No universally best algorithm
    - "Bias-Variance Tradeoff": Fundamental tension in learning
    - "Occam's Razor": Prefer simpler models
    - "Universal Approximation": Sufficiently wide neural nets can approximate any continuous function on a compact domain
  open_questions:
    - "Why do overparameterized models generalize?"
    - "How to achieve robust out-of-distribution performance?"
    - "What is the right inductive bias for reasoning?"
Phase 5: Misconception Clearing
Objective: Identify and correct common misunderstandings.
Actions:
- •Catalog Beginner Misconceptions:
  - •What do newcomers typically get wrong?
  - •What intuitions from other fields mislead?
- •Identify Expert Blind Spots:
  - •What do experts assume everyone knows?
  - •What jargon obscures understanding?
- •Debunk Persistent Myths:
  - •What false beliefs persist despite evidence?
  - •What oversimplifications are dangerous?
Output:
misconceptions:
  beginner:
    - myth: "More data always improves performance"
      reality: "Diminishing returns; data quality often matters more than quantity"
    - myth: "Deep learning requires massive datasets"
      reality: "Transfer learning, augmentation, and pretraining reduce data needs"
    - myth: "Higher accuracy means better model"
      reality: "Depends on task; calibration, fairness, robustness matter"
  expert_blind_spots:
    - assumption: "Everyone knows what a tensor is"
      clarification: "In ML practice, an n-dimensional array; scalars, vectors, and matrices are the 0-, 1-, and 2-dimensional cases"
    - jargon: "The model is overfitting"
      clarification: "Training performance >> test performance; memorizing not learning"
  persistent_myths:
    - myth: "Neural networks are black boxes"
      reality: "Interpretability methods exist: attention analysis, gradient attribution, probing"
    - myth: "AI will be generally intelligent soon"
      reality: "Current systems are narrow; AGI timeline highly uncertain"
Output Formats
Format A: Foundation Report (Default)
# [Domain] Foundation Report
## Quick Orientation
{3-4 sentences positioning this field}
## Core Pillars
{The 3-5 fundamental ideas}
## Concept Deep Dives
{Detailed meaning exploration for key concepts}
## How It Connects
{Connectivity map: prerequisites, synergies, tensions}
## Historical Arc
{Timeline with eras and inflection points}
## Mental Models
{Frameworks for thinking about this domain}
## Common Misconceptions
{What to unlearn}
## Before You Continue
{Prerequisites checklist}
{Recommended next steps}
Format B: Visual Concept Map
Mermaid diagrams showing:
- •Prerequisite chains
- •Era timeline
- •Sub-domain relationships
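A prerequisite chain can be rendered as Mermaid flowchart text with a small helper. The following is a sketch under stated assumptions: `to_mermaid` is a hypothetical function name, the input is a plain mapping from each concept to the concepts it requires, and node ids (`n0`, `n1`, ...) are generated so that concept names containing spaces stay valid.

```python
def to_mermaid(requires):
    """Render a {concept: [prerequisites]} mapping as Mermaid flowchart text."""
    # Collect every concept mentioned, as a key or as a prerequisite.
    names = sorted(set(requires) | {p for deps in requires.values() for p in deps})
    ids = {name: f"n{i}" for i, name in enumerate(names)}

    lines = ["graph TD"]
    # Declare each node once with a quoted label.
    for name in names:
        lines.append(f'    {ids[name]}["{name}"]')
    # Draw an edge from each prerequisite to the concept that needs it.
    for concept in sorted(requires):
        for prereq in sorted(requires[concept]):
            lines.append(f"    {ids[prereq]} --> {ids[concept]}")
    return "\n".join(lines)


# Hypothetical sample input.
diagram = to_mermaid({"Neural Network": ["Linear Algebra", "Calculus"]})
print(diagram)
```

Sorting the names keeps the output stable between runs, which makes the generated diagram easier to diff when the concept map changes.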
Format C: Study Guide
Structured learning path with:
- •Ordered concept sequence
- •Checkpoint questions
- •Practical exercises
Example Session
Input: "Build foundation for reinforcement learning"
Output Summary:
# Reinforcement Learning Foundation
## Quick Orientation
RL is learning through interaction: an agent takes actions in an environment, receives rewards, and learns to maximize cumulative reward. Unlike supervised learning (given correct answers) or unsupervised learning (finding patterns), RL discovers behavior through trial and error.
## Core Pillars
1. **Agent-Environment Loop**: Action → State → Reward → Action
2. **Value Functions**: Predicting future cumulative reward
3. **Policy**: Mapping states to actions
4. **Exploration vs Exploitation**: Try new vs use known
## Deep Meaning: "Reward"
- Literal: Scalar signal from environment
- Connotative: "What we want the agent to do"
- Deeper: Reward shaping is hard; sparse rewards make credit assignment difficult
- Historical: From behaviorist psychology (Skinner, operant conditioning)
## Connectivity
- Prerequisites: Probability, Markov Chains, Dynamic Programming
- Synergies: Value + Policy → Actor-Critic
- Tensions: Exploration ↔ Exploitation, Sample Efficiency ↔ Asymptotic Performance
## Historical Arc
- 1950s: Dynamic programming (Bellman)
- 1989: Q-learning (Watkins)
- 1992: TD-Gammon (Tesauro) - backgammon via self-play
- 2013: DQN (Mnih) - Atari from pixels
- 2016: AlphaGo - RL-based system defeats a Go world champion
- 2020+: RLHF - RL for language model alignment
## Misconceptions
❌ "Reward must be designed perfectly" → Reward shaping, inverse RL help
❌ "RL is just trial and error" → Planning, model-based methods exist
❌ "RL needs millions of samples" → Offline RL, model-based reduce this
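The phrase "cumulative reward" in the example session is usually made precise as a discounted return, G = r_0 + γ·r_1 + γ²·r_2 + ..., which a value function then tries to predict. The sketch below computes it for a finite reward sequence. The discount factor `gamma` and the function name `discounted_return` are assumptions for illustration and do not appear in the session output.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum rewards, weighting the reward at step t by gamma ** t.

    `gamma` (the discount factor) is an assumed parameter; gamma=1.0
    gives the plain undiscounted sum.
    """
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical sparse-reward episode: the only reward arrives at the last step.
print(discounted_return([0, 0, 1]))
```

In the sparse episode, the final reward is discounted twice before it reaches the first step, which gives a small numeric feel for why credit assignment is hard when rewards arrive late.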
Integration with Other Skills
| Flow | Description |
|---|---|
| domain-vocab → foundation | Concepts identified → now understand deeply |
| foundation → trace | Understand field → trace specific concept origins |
| foundation → frontier | Understand history → see where it's heading |
| foundation → deep-dive | Optional: combine with paper genealogy |
Error Handling
| Situation | Recovery |
|---|---|
| Domain too broad | Focus on sub-domain, ask user preference |
| Insufficient historical data | Note gaps, focus on structural understanding |
| Conflicting sources | Present multiple perspectives with context |
| Highly interdisciplinary | Map connections to source fields |