Diagram Content Analysis

The quality of a diagram is determined before any visual decisions are made. If you identify the wrong dimensions, tell the wrong story, or write unclear content, no amount of visual encoding or graphic design can fix it. This skill is about the editorial judgment that separates a diagram that illuminates from one that merely decorates.

When This Skill Runs

This skill produces the content specification — the "what to show" — that feeds into the visual encoding skill (which decides how to show it) and eventually the graphic design skill (which makes it look right). Think of it as the editorial layer: a good editor decides what story to tell and what to cut before the designer touches it.

Inputs

This skill works with three kinds of source material:

•Documents (most common), such as book chapters, papers, reports, and technical specs. The challenge is compression: a 5,000-word chapter becomes ~200 words of diagram text plus visual encodings. The skill guides that compression.
•Datasets, such as CSV files, tables, and metrics. The challenge is finding the story in the numbers: what pattern or comparison matters most?
•Concepts described in conversation, where the user explains a system, process, or idea. The challenge is extracting structure from informal description.

Core Workflow

Phase A: Understand the Source

Read the source material thoroughly before extracting anything. The goal is to understand it well enough to explain it to someone else — not just to identify keywords.

For documents, look for:

•The author's thesis or central argument
•The logical structure (how concepts build on each other)
•Which concepts are foundational vs. derived
•Where the author spends the most space (that's usually what matters most)
•What the author assumes the reader already knows

For datasets, look for:

•What dimensions exist (columns, categories, time series)
•What varies and what's constant
•Where the interesting patterns are (outliers, trends, clusters, gaps)
•What comparison the viewer would find most useful

For conversational descriptions, look for:

•What the user emphasizes or repeats
•What they seem to find most important or surprising
•Implicit structure they haven't articulated (sequences, hierarchies, trade-offs)
•Gaps in the description that need filling

Adaptive approach: If the user's intent is clear from context — they've described the audience, purpose, and key message — proceed directly to extraction. If the intent is ambiguous, ask targeted questions before analyzing:

•Who will look at this diagram? (audience expertise level)
•What should they do or understand differently after seeing it? (purpose)
•Is there a specific message or argument, or is this more exploratory? (story)

Keep the interview short: 2–3 questions at most. The goal is to remove ambiguity, not to run a requirements-gathering session.
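
The adaptive approach above can be sketched as a small check that asks only about what is still unknown. This is an illustrative sketch rather than part of the skill: the `Intent` class and `questions_to_ask` function are hypothetical names, and the question wording is taken from the list above.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical container for what is already known about the user's intent.
@dataclass
class Intent:
    audience: Optional[str] = None
    purpose: Optional[str] = None
    story: Optional[str] = None


# One question per unknown, keyed by the field it resolves.
QUESTIONS = {
    "audience": "Who will look at this diagram?",
    "purpose": "What should they do or understand differently after seeing it?",
    "story": "Is there a specific message or argument, or is this more exploratory?",
}


def questions_to_ask(intent: Intent) -> list[str]:
    """Return only the questions whose answers are still missing."""
    return [q for name, q in QUESTIONS.items() if getattr(intent, name) is None]
```

An empty result means the intent is clear enough to proceed directly to extraction; because there are only three fields, the interview can never exceed three questions.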

Phase B: Find the Story

Every effective diagram has a story — a throughline that organizes everything else. The story is the answer to "so what?" It's the reason the diagram exists.

Three types of diagram stories:

Explanatory — "Here's how this works." The story is a mental model the viewer doesn't yet have. The diagram teaches a concept, process, or system. Success: the viewer can explain the concept to someone else after seeing the diagram.

•Common for: book chapters, educational materials, onboarding docs, technical architecture
•The challenge: deciding what level of detail serves understanding vs. overwhelms it

Analytical — "Here's what the data shows." The story is a pattern, trend, or comparison the viewer should notice. The diagram makes a pattern visible that was hidden in the raw numbers. Success: the viewer sees the pattern immediately.

•Common for: dashboards, reports, data presentations, research findings
•The challenge: not just showing data but encoding the interesting comparison

Persuasive — "Here's why this matters." The story makes an argument. The diagram is evidence in service of a conclusion. Success: the viewer is more convinced of the argument after seeing the diagram.

•Common for: proposals, pitch decks, policy briefs, executive summaries
•The challenge: being honest — the visual encoding should illuminate, not mislead

Write the story as a single sentence. If you can't compress it to one sentence, you haven't found the story yet — you've found the topic. "Data pipelines" is a topic. "Data volume drops 70% between ingestion and serving because most raw events are noise" is a story.

Phase C: Extract and Classify Dimensions

List every dimension in the source material. A dimension is anything that varies and could potentially be shown. Classify each:

Quantitative — has magnitude, can be compared numerically. Volume, count, rate, cost, duration, ratio, percentage. These can be encoded with position, length, area, or color intensity.

Categorical — distinct groups, no inherent order. Type, role, category, department, platform. These can be encoded with spatial grouping, color hue, or shape.

Relational — connections between things. Flow (A feeds B), hierarchy (A contains B), dependency (A requires B), association (A relates to B). These can be encoded with lines, containment, or proximity.

Ordinal — categories with meaningful order. Stages, phases, priority levels, maturity levels. These can be encoded with spatial position or color gradients.

Conceptual — abstract ideas that need definition and examples. Frameworks, principles, mental models, taxonomies. These can't be "encoded" in the visual-channel sense — they need text, and the diagram's job is to organize and relate them spatially. This is the dimension type that the existing information design literature handles least well, because it doesn't map neatly to Cleveland & McGill. Conceptual diagrams succeed when the spatial arrangement itself teaches the relationships.

Be exhaustive in listing dimensions — you'll cut aggressively in the next phase. It's easier to cut from a complete list than to discover you missed something important after you've committed to a visual structure.
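
As a rough illustration, the five dimension types and their candidate encodings could be recorded like this while building the inventory. The class and variable names are hypothetical, and the example dimensions describe an imagined data-pipeline source; only the type names and encoding options come from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class DimensionType(Enum):
    QUANTITATIVE = "quantitative"
    CATEGORICAL = "categorical"
    RELATIONAL = "relational"
    ORDINAL = "ordinal"
    CONCEPTUAL = "conceptual"


# Candidate encodings per type, copied from the descriptions above.
ENCODING_HINTS = {
    DimensionType.QUANTITATIVE: ["position", "length", "area", "color intensity"],
    DimensionType.CATEGORICAL: ["spatial grouping", "color hue", "shape"],
    DimensionType.RELATIONAL: ["lines", "containment", "proximity"],
    DimensionType.ORDINAL: ["spatial position", "color gradient"],
    # Conceptual dimensions are carried by text, arranged spatially.
    DimensionType.CONCEPTUAL: ["text organized spatially"],
}


@dataclass
class Dimension:
    name: str
    type: DimensionType

    def encoding_hints(self) -> list[str]:
        """Look up the encodings that suit this dimension's type."""
        return ENCODING_HINTS[self.type]


# Hypothetical inventory for a data-pipeline source.
inventory = [
    Dimension("event volume", DimensionType.QUANTITATIVE),
    Dimension("pipeline stage", DimensionType.ORDINAL),
    Dimension("stage feeds stage", DimensionType.RELATIONAL),
]
```

Keeping the inventory as plain data makes it easy to list everything first and defer the decision about what to keep.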

Phase D: Prioritize Ruthlessly

A diagram can effectively encode 2–3 dimensions visually (4 at most, if a subtle tertiary encoding is included). Trying to encode 5 or more creates noise. Everything else either becomes annotation text or gets cut.

Assign each dimension to a tier:

Primary (1 dimension) — What the viewer grasps in the first 3 seconds. This is the story made visible. If the diagram is about flow, the primary encoding IS the flow. If it's about comparison, the primary encoding IS the comparison. The primary dimension gets the most powerful visual channel available.

Secondary (1–2 dimensions) — What the viewer notices on second look. Adds depth to the story without competing with the primary. Must be visually independent from the primary — the viewer should be able to read each encoding separately.

Tertiary (0–1 dimensions) — Rewards closer inspection. Subtle encoding (color intensity, small position differences) for viewers who spend time with the diagram. Optional — many diagrams are better with just primary and secondary.

Annotation — Important details that appear as text labels, footnotes, or callouts but aren't visually encoded. Technology names, exact numbers, dates, caveats. These are "read" not "seen."

Cut — Doesn't appear in the diagram at all. This is the hardest editorial decision and the most important. The most common diagram failure mode is trying to show everything. Every dimension you cut makes the remaining dimensions clearer.

How to decide what to cut:

•Does removing it change the story? If not, cut it.
•Is it something the viewer already knows? Cut it.
•Is it a detail that matters for implementation but not understanding? Cut it.
•Would a footnote or separate document serve it better? Move it there.
•Does it only matter to a subset of the audience? Cut it from the diagram, mention it in accompanying text.
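
The tier limits above lend themselves to a mechanical check. The following is a minimal sketch, assuming assignments are kept as a plain mapping from dimension name to tier name; `check_tiers` and the lowercase tier labels are hypothetical, not part of the skill.

```python
from collections import Counter

# Allowed number of dimensions per visually encoded tier, as (min, max),
# following the tier definitions above. Annotation and cut are unbounded.
TIER_LIMITS = {"primary": (1, 1), "secondary": (1, 2), "tertiary": (0, 1)}
UNBOUNDED_TIERS = {"annotation", "cut"}


def check_tiers(assignments: dict[str, str]) -> list[str]:
    """Return a list of problems found in a {dimension: tier} assignment."""
    problems = []
    for name, tier in assignments.items():
        if tier not in TIER_LIMITS and tier not in UNBOUNDED_TIERS:
            problems.append(f"{name}: unknown tier '{tier}'")
    counts = Counter(assignments.values())
    for tier, (low, high) in TIER_LIMITS.items():
        n = counts.get(tier, 0)
        if not low <= n <= high:
            problems.append(f"{tier}: expected {low}-{high} dimensions, got {n}")
    return problems


# Hypothetical assignment for a data-pipeline diagram.
assignments = {
    "flow": "primary",
    "volume": "secondary",
    "team": "secondary",
    "vendor": "annotation",
    "history": "cut",
}
```

An empty list means the assignment respects the limits. The check says nothing about whether the right dimension was chosen as primary; that remains an editorial judgment.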

Phase E: Write Content (for standalone/educational diagrams)

When the diagram must stand on its own — someone who hasn't read the source material should understand it — the text content needs as much care as the dimensional analysis.

Read references/content-writing.md for detailed guidance. Key principles:

Define every term of art. If the diagram uses vocabulary like "normalization," "chokepoint," or "frame," define it the first time it appears. Don't remove the jargon — teaching the vocabulary IS part of the point — but pair each term with a plain-language definition and a concrete example.

Use examples from different domains. When presenting multiple related concepts, each example should come from a distinctly different domain. If all examples come from the same domain, the concepts blur together. If you're explaining four types of cognitive bias, use one example from medicine, one from investing, one from cooking, one from sports — not four examples from investing.

Write for the card, not the page. Each card/node has limited space. Three lines for a definition-with-example is enough. Cut every word that doesn't earn its place.

Layer the content. Structure text in progressive disclosure layers:

•Term or question (bold) — what is this concept?
•Definition + example (regular) — how does it work?
•Key insight (accent) — why does it matter?
•Action levers (secondary) — what can you do with it?

The viewer can stop at any layer and still get value.
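
One way to picture the layering is as a card whose text can be truncated at any layer and still read as complete. This is a sketch under assumptions: the `Card` class is hypothetical, and the example definition and insight are illustrative wording written for this sketch, not content from a real source.

```python
from dataclasses import dataclass

# Layer order follows the list above: term, definition + example, insight, levers.
LAYERS = ("term", "definition", "insight", "levers")


@dataclass
class Card:
    term: str
    definition: str  # definition plus a concrete example
    insight: str = ""
    levers: str = ""

    def render(self, depth: int = len(LAYERS)) -> list[str]:
        """Return the first `depth` non-empty layers, in reading order."""
        lines = [getattr(self, layer) for layer in LAYERS[:depth]]
        return [line for line in lines if line]


# Illustrative card text (written for this sketch).
card = Card(
    term="Chokepoint",
    definition="The narrowest point in a flow. Example: one open checkout lane in a busy store.",
    insight="Widening anything else changes nothing.",
)
```

Calling `card.render(1)` yields only the term, while `card.render()` yields every layer that has content, which mirrors a viewer stopping early or reading on.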

Phase F: Produce the Specification

The output has two parts: a human-readable brief for review, and a structured spec that the visual encoding skill can consume.

Human-readable brief (present this to the user for approval):

## Story
[One-sentence story]

## Audience
[Who, what they know, what they should take away]

## Dimensions (ranked)
- PRIMARY: [dimension] — [why this is the main thing]
- SECONDARY: [dimension] — [what depth this adds]
- ANNOTATION: [list of text-only details]
- CUT: [what was deliberately excluded and why]

## Content
[For each concept/node, the text that will appear — definition,
example, insight, levers. Written at diagram length, not document length.]

## Relationships
[How concepts/dimensions connect — flows, hierarchies, dependencies]

## Open Questions
[Anything the user should weigh in on before visual design begins]

Structured spec (append after the brief, for the visual encoding skill):

Write the spec in YAML within a fenced code block. See references/output-format.md for the full schema. The spec must capture:

•story: type (explanatory/analytical/persuasive), primary message, audience
•dimensions: each with name, type, values/range, priority tier, and encoding hints
•relationships: typed connections between dimensions or concepts
•concepts: for educational diagrams — term, definition, example, insight, levers
•content: title, subtitle, footnotes, source attribution
•constraints: page size if known, whether it must stand alone, print vs. screen
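
Before serializing a spec to YAML, it can help to confirm that the sections listed above are present. The sketch below is only a pre-flight check and does not reproduce the schema in references/output-format.md: the top-level section names come from the list above, while the nested field names, the example audience, and the title are assumptions made for illustration.

```python
# Top-level sections named in the list above. "concepts" is treated as
# required only for educational diagrams.
REQUIRED_SECTIONS = ("story", "dimensions", "relationships", "content", "constraints")


def missing_sections(spec: dict, educational: bool = False) -> list[str]:
    """Return the expected top-level sections that are absent from a spec."""
    required = REQUIRED_SECTIONS + (("concepts",) if educational else ())
    return [key for key in required if key not in spec]


# Illustrative draft; field names below the top level are guesses,
# not the real schema.
draft = {
    "story": {
        "type": "analytical",
        "message": "Data volume drops 70% between ingestion and serving "
                   "because most raw events are noise",
        "audience": "data engineers",
    },
    "dimensions": [
        {"name": "event volume", "type": "quantitative", "tier": "primary"},
    ],
    "relationships": [],
    "content": {"title": "Where the data goes"},
}
```

Here `missing_sections(draft)` reports that `constraints` has not been filled in yet, which is a prompt to ask about page size, standalone use, and print versus screen before handing off.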

Common Failure Modes

Trying to show everything. The most frequent failure. A book chapter has 20 concepts; the diagram tries to include all 20, and none of them is clear. The fix is always to cut more aggressively. A diagram that shows 5 things clearly is far more valuable than one that shows 20 things illegibly.

Missing the story. Dimensions are identified correctly but there's no narrative throughline. The diagram is technically accurate but doesn't illuminate anything. The fix is to write the one-sentence story and let it drive every prioritization decision.

Wrong audience assumption. Too much jargon for a general audience, or too simplified for experts. The fix is to explicitly state the audience's existing knowledge and calibrate definitions and detail level to that.

Same-domain examples. Four concepts, four examples from software engineering. The viewer can't tell which example goes with which concept because they all sound similar. The fix is deliberate domain diversity.

Weak content compression. Card text is a shortened version of the source document rather than a rewrite. It reads like a summary rather than a standalone explanation. The fix is to write the card text from scratch, using the source as reference, not as a template to abbreviate.

References

•references/content-writing.md — Detailed guidance on writing diagram content: progressive disclosure, term definition, example selection, compression techniques.
•references/output-format.md — The YAML spec schema that the visual encoding skill consumes, with annotated examples.