Skill: Compare Datasets
Purpose
Compare metrics, findings, and patterns across two or more connected datasets. Helps identify cross-dataset patterns (e.g., "conversion funnel behavior is similar across both product lines") and dataset-specific anomalies.
When to Use
- •User says /compare-datasets or "compare across datasets"
- •After analyzing multiple datasets, to find commonalities
- •When the user asks "is this pattern unique to this dataset?"
Invocation
/compare-datasets — compare active dataset with all others
/compare-datasets {id1} {id2} — compare two specific datasets
/compare-datasets metric={name} — compare a specific metric across datasets
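The three invocation forms above can be told apart with a small parser. This is a minimal sketch, not the skill's actual implementation: `CompareRequest` and `parse_invocation` are hypothetical names, and it assumes dataset IDs contain no spaces.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CompareRequest:
    # Empty list means "compare the active dataset with all others".
    dataset_ids: List[str] = field(default_factory=list)
    # Set only when the user passed metric={name}.
    metric: Optional[str] = None


def parse_invocation(command: str) -> CompareRequest:
    """Split a /compare-datasets command into dataset IDs and an optional metric."""
    tokens = command.split()
    if not tokens or tokens[0] != "/compare-datasets":
        raise ValueError("not a /compare-datasets invocation")
    request = CompareRequest()
    for token in tokens[1:]:
        if token.startswith("metric="):
            request.metric = token[len("metric="):]
        else:
            request.dataset_ids.append(token)
    return request
```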
Instructions
Step 1: Identify Datasets to Compare
- •Read .knowledge/datasets/ to enumerate all connected datasets.
- •If specific datasets are named, validate that they exist.
- •If no datasets are specified, use the active dataset plus all others.
- •Require at least 2 datasets. If only 1 exists: "Only one dataset connected. Use /connect-data to add another."
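The selection rules in Step 1 could be sketched as follows. This assumes the connected dataset IDs have already been enumerated into a list; `select_datasets` and its parameters are hypothetical names, and the wording of the error messages other than the "Only one dataset connected" text is an assumption.

```python
ONLY_ONE_MESSAGE = "Only one dataset connected. Use /connect-data to add another."


def select_datasets(connected, active, requested=None):
    """Return the dataset IDs to compare, or raise ValueError with a user-facing message."""
    if len(connected) < 2:
        raise ValueError(ONLY_ONE_MESSAGE)
    if requested:
        # Validate that every named dataset is actually connected.
        missing = [d for d in requested if d not in connected]
        if missing:
            raise ValueError("Unknown dataset(s): " + ", ".join(missing))
        if len(set(requested)) < 2:
            raise ValueError("Name at least two distinct datasets to compare.")
        return list(requested)
    # Default: the active dataset first, then every other connected dataset.
    return [active] + [d for d in connected if d != active]
```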
Step 2: Load Metric Dictionaries
For each dataset:
- •Read .knowledge/datasets/{id}/metrics/index.yaml
- •Build a union of all metric IDs across datasets
- •Identify shared metrics (same ID or same name) vs. dataset-specific metrics
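One way to split the union into shared and dataset-specific metrics is sketched below. It assumes each index.yaml has already been parsed into a list of entries with `id` and `name` keys (the actual index layout is not specified here), and it treats names as matching when they are equal after trimming and lowercasing. `partition_metrics` is a hypothetical helper name.

```python
from collections import defaultdict


def partition_metrics(indexes):
    """indexes maps dataset ID -> list of {"id": ..., "name": ...} entries.

    Returns (all_ids, shared_ids, specific), where specific maps
    dataset ID -> set of metric IDs found only in that dataset.
    """
    by_id = defaultdict(set)    # metric ID -> datasets that define it
    by_name = defaultdict(set)  # normalized name -> datasets that define it
    for dataset_id, metrics in indexes.items():
        for m in metrics:
            by_id[m["id"]].add(dataset_id)
            by_name[m["name"].strip().lower()].add(dataset_id)

    all_ids = set(by_id)
    shared, specific = set(), defaultdict(set)
    for dataset_id, metrics in indexes.items():
        for m in metrics:
            # Shared if the ID or the name also appears in another dataset.
            owners = by_id[m["id"]] | by_name[m["name"].strip().lower()]
            if len(owners) >= 2:
                shared.add(m["id"])
            else:
                specific[dataset_id].add(m["id"])
    return all_ids, shared, dict(specific)
```

Matching on name as well as ID means a metric called "Conversion Rate" is still paired up when one dataset uses the ID `cr` and another uses `conversion_rate`.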
Step 3: Compare Shared Metrics
For each metric that exists in 2+ datasets:
- •Load the metric YAML from each dataset
- •Compare: definition match? (same formula, same unit)
- •Compare: typical range overlap? (do the datasets have similar baselines?)
- •Compare: guardrails alignment? (are thresholds consistent?)
- •Flag discrepancies: "conversion_rate is defined differently in {dataset_a} vs {dataset_b}"
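The three checks in Step 3 can be expressed as a single comparison function. This is a sketch under stated assumptions: the field names `formula`, `unit`, `typical_range` (a [low, high] pair), and `guardrails` are hypothetical, since the metric YAML schema is not given here, and `compare_metric` is an illustrative name.

```python
def compare_metric(metric_id, name_a, metric_a, name_b, metric_b):
    """Return a list of human-readable discrepancies between two metric definitions."""
    issues = []

    # Definition match: same formula and same unit.
    if (metric_a.get("formula") != metric_b.get("formula")
            or metric_a.get("unit") != metric_b.get("unit")):
        issues.append(f"{metric_id} is defined differently in {name_a} vs {name_b}")

    # Typical range overlap: two intervals overlap unless one ends before the other starts.
    range_a, range_b = metric_a.get("typical_range"), metric_b.get("typical_range")
    if range_a and range_b and (range_a[1] < range_b[0] or range_b[1] < range_a[0]):
        issues.append(f"{metric_id} typical ranges do not overlap in {name_a} vs {name_b}")

    # Guardrails alignment: thresholds should be identical.
    if metric_a.get("guardrails") != metric_b.get("guardrails"):
        issues.append(f"{metric_id} guardrails differ in {name_a} vs {name_b}")

    return issues
```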
Step 4: Compare Analysis History
For each dataset:
- •Read .knowledge/analyses/index.yaml
- •Extract key findings from recent analyses
- •Look for cross-dataset patterns:
  - •Same finding appearing in multiple datasets
  - •Opposite findings (metric up in one, down in another)
  - •Same root cause identified independently
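The pattern search in Step 4 might look like the sketch below. It assumes each key finding has been reduced to a record with `metric`, `direction` ("up" or "down"), and an optional `root_cause`; that record shape and the name `cross_dataset_patterns` are assumptions for illustration, not the format of the analysis index.

```python
from itertools import combinations


def cross_dataset_patterns(findings):
    """findings maps dataset ID -> list of {"metric", "direction", "root_cause"} records."""
    same, opposite, common_causes = [], [], []
    # Compare every pair of datasets once.
    for (ds_a, list_a), (ds_b, list_b) in combinations(findings.items(), 2):
        for x in list_a:
            for y in list_b:
                if x["metric"] == y["metric"]:
                    if x["direction"] == y["direction"]:
                        same.append((x["metric"], x["direction"], ds_a, ds_b))
                    else:
                        opposite.append(
                            (x["metric"], ds_a, x["direction"], ds_b, y["direction"])
                        )
                # Same root cause reported independently, regardless of metric.
                if x.get("root_cause") and x.get("root_cause") == y.get("root_cause"):
                    common_causes.append((x["root_cause"], ds_a, ds_b))
    return {"same": same, "opposite": opposite, "common_causes": common_causes}
```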
Step 5: Generate Cross-Dataset Observations
Write findings to .knowledge/global/cross_dataset_observations.yaml:
- •Shared patterns: behaviors that appear across datasets
- •Divergences: where datasets behave differently
- •Metric alignment: which metrics are consistently defined
- •Suggested investigations: questions raised by the comparison
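The four sections above map naturally onto a mapping of lists. The sketch below shows one possible shape and a tiny serializer; the key names (`shared_patterns`, `divergences`, `metric_alignment`, `suggested_investigations`) are assumptions, and in practice a YAML library would likely be used instead of the hand-written `to_yaml` helper, which only handles flat lists of strings.

```python
import json


def to_yaml(observations):
    """Serialize a dict of string lists as YAML.

    Each item is written with json.dumps, since a JSON string is also a
    valid YAML double-quoted scalar.
    """
    lines = []
    for key, items in observations.items():
        if not items:
            lines.append(f"{key}: []")
            continue
        lines.append(f"{key}:")
        for item in items:
            lines.append(f"  - {json.dumps(item)}")
    return "\n".join(lines) + "\n"


# Hypothetical example content, for illustration only.
observations = {
    "shared_patterns": ["weekend dip in sessions"],
    "divergences": ["conversion_rate is down in eu but up in us"],
    "metric_alignment": [],
    "suggested_investigations": ["Why does conversion_rate diverge?"],
}
yaml_text = to_yaml(observations)
```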
Step 6: Present Results
Display a comparison table:
Cross-Dataset Comparison: {dataset_a} vs {dataset_b}
Shared Metrics: {N} ({M} with matching definitions)
Metric Discrepancies: {list}
Shared Patterns:
- {pattern description} (seen in both datasets)
Divergences:
- {metric} is {direction} in {dataset_a} but {direction} in {dataset_b}
Suggested Next:
- "Investigate why {pattern} differs between datasets"
- "Align {metric} definitions across datasets"
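Filling the template above could be done with a small rendering function. This is a sketch only: `render_report` and its parameters are hypothetical, divergences are assumed to arrive as (metric, direction_a, direction_b) tuples, and the "Investigate" suggestions are derived from divergent metrics rather than from free-form pattern descriptions.

```python
def render_report(dataset_a, dataset_b, shared_count, matching_count,
                  discrepancies, patterns, divergences):
    """Build the comparison report as plain text, one template line per entry."""
    lines = [
        f"Cross-Dataset Comparison: {dataset_a} vs {dataset_b}",
        f"Shared Metrics: {shared_count} ({matching_count} with matching definitions)",
        f"Metric Discrepancies: {', '.join(discrepancies) or 'none'}",
        "Shared Patterns:",
    ]
    lines += [f"- {p} (seen in both datasets)" for p in patterns]
    lines.append("Divergences:")
    lines += [
        f"- {metric} is {dir_a} in {dataset_a} but {dir_b} in {dataset_b}"
        for metric, dir_a, dir_b in divergences
    ]
    lines.append("Suggested Next:")
    lines += [
        f'- "Investigate why {metric} differs between datasets"'
        for metric, _, _ in divergences
    ]
    lines += [f'- "Align {m} definitions across datasets"' for m in discrepancies]
    return "\n".join(lines)
```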
Edge Cases
- •Only 1 dataset: Cannot compare — suggest connecting another
- •No shared metrics: Report this — datasets may serve different purposes
- •No analysis history: Compare schemas and metric definitions only
- •Many datasets (>5): Compare pairwise with the active dataset only
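The first and last edge cases can be folded into the planning of comparison pairs. In this sketch, `plan_pairs` is a hypothetical helper; comparing all pairs when there are 5 or fewer datasets is an assumption, since only the behavior above 5 is stated.

```python
from itertools import combinations


def plan_pairs(datasets, active, max_full=5):
    """Return the (dataset, dataset) pairs to compare."""
    if len(datasets) < 2:
        # Only 1 dataset: nothing to compare.
        return []
    if len(datasets) > max_full:
        # Many datasets: compare pairwise with the active dataset only.
        return [(active, d) for d in datasets if d != active]
    # Assumption: with few datasets, compare every pair.
    return list(combinations(datasets, 2))
```

Restricting to the active dataset keeps the number of comparisons linear in the number of datasets, rather than quadratic as with all pairs.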