Data Viz Insight
This skill enables interactive data exploration, visualization planning, and automated chart generation in marimo notebooks.
Prerequisites
Marimo server must be running with MCP support:
uv run marimo edit main.py --mcp --no-token
Workflow
Follow this 5-step interactive process:
Step 1: Data Input
If the user hasn't provided a data file path, ask for it:
- "Which data file would you like to visualize?"
- Accept: CSV, Excel (.xlsx), Parquet (.parquet), JSON files
Verify the file exists before proceeding.
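The existence check above can be sketched with the standard library alone. This is a minimal illustration, not part of the skill's scripts: the function name `check_data_file` and the `SUPPORTED_SUFFIXES` set are hypothetical, and the `.csv` and `.json` suffixes are assumed from the accepted formats listed above.

```python
from pathlib import Path

# Suffixes assumed from the accepted formats: CSV, Excel, Parquet, JSON.
SUPPORTED_SUFFIXES = {".csv", ".xlsx", ".parquet", ".json"}


def check_data_file(path_str: str) -> Path:
    """Return the resolved path if it is an existing file in a supported format."""
    path = Path(path_str).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Data file not found: {path}")
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"Unsupported format '{path.suffix}'; expected one of "
            f"{sorted(SUPPORTED_SUFFIXES)}"
        )
    return path
```

Raising distinct exceptions lets the caller tell a missing file apart from an unsupported format and ask the user a more specific follow-up question.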
Step 2: Auto-Explore Data
Option 1: Use the exploration script (recommended for comprehensive analysis)
uv run python .claude/skills/data-viz-insight/scripts/explore_data.py data.csv
The script provides:
- Data shape and schema
- Summary statistics for numeric columns
- Value counts for categorical columns
- Missing data analysis
- Date ranges for temporal columns
- Sample rows
- Visualization recommendations
Option 2: Use Polars directly for custom exploration
import polars as pl
# Read data (choose the reader that matches the file format)
df = pl.read_csv("data.csv")  # or pl.read_excel, pl.read_parquet, pl.read_json
# Gather key information
schema = df.schema # Column names and types
shape = (df.height, df.width) # Rows, columns
stats = df.describe() # Summary statistics
nulls = df.null_count() # Missing values
Present findings in a structured summary:
- Data shape (rows × columns)
- Column types breakdown (numeric, categorical, temporal)
- Key statistics for numeric columns
- Unique values for categorical columns
- Date ranges for temporal data
- Notable patterns or data quality issues
For detailed Polars patterns, see references/polars-patterns.md.
Step 3: Understand User Interest
After sharing initial insights, explicitly ask what aspects interest the user:
- "What aspects of this data would you like to explore?"
- "Are you interested in trends over time, category breakdowns, or relationships between variables?"
- "Which specific fields or patterns caught your attention?"
Listen for specific interests like:
- Time-based trends
- Category comparisons
- Distribution analysis
- Correlation between variables
- Metric vs metric comparisons
- Top/bottom performers
- Anomaly detection
Step 4: Propose Visualizations
Based on data characteristics and user interests, propose 3-5 specific charts with rationale:
Example proposal format:
Based on your data analysis, I propose these visualizations:
1. **[Metric] by [Category] (Bar Chart)** - Compare values across different groups
2. **[Metric] Over Time (Line Chart)** - Show trends and patterns
3. **[Category] Distribution (Pie Chart)** - Visualize proportions of the whole
4. **[Value] Distribution (Histogram)** - Understand the spread of values
5. **[Variable A] vs [Variable B] (Scatter)** - Explore relationships
Would you like me to create these visualizations?
Adapt to data type:
- Sales data: "Revenue by Region", "Monthly Sales Trend", "Product Mix"
- Web analytics: "Traffic by Source", "Daily Visitors", "Bounce Rate Distribution"
- Scientific data: "Measurements by Condition", "Temperature Over Time", "Correlation Matrix"
- Financial data: "Spending by Category", "Transaction Trend", "Amount Distribution"
For chart type selection guidance, see references/plotly-charts.md.
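One way to turn the column breakdown from Step 2 into the proposal list is a small rule table. The sketch below is illustrative only: the function `propose_charts` and its rules are assumptions that mirror the example proposal format, not logic defined by the skill.

```python
def propose_charts(
    numeric: list[str], categorical: list[str], temporal: list[str]
) -> list[str]:
    """Build chart proposals from the kinds of columns that are available."""
    proposals: list[str] = []
    if numeric and categorical:
        proposals.append(f"{numeric[0]} by {categorical[0]} (Bar Chart)")
    if numeric and temporal:
        proposals.append(f"{numeric[0]} Over Time (Line Chart)")
    if categorical:
        proposals.append(f"{categorical[0]} Distribution (Pie Chart)")
    if numeric:
        proposals.append(f"{numeric[0]} Distribution (Histogram)")
    if len(numeric) >= 2:
        proposals.append(f"{numeric[0]} vs {numeric[1]} (Scatter)")
    return proposals
```

In practice the first column of each kind is rarely the best choice; the interests gathered in Step 3 should decide which columns fill each slot.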
Step 5: Execute & Conclude
Once approved, create visualizations in marimo notebook and write conclusions.
Adding Visualization Cells
Use marimo MCP tools to inspect the notebook:
- mcp__marimo__get_active_notebooks - Get session ID
- mcp__marimo__get_lightweight_cell_map - View structure
Add cells directly to the marimo file using the Edit tool. Each chart follows this pattern:
@app.cell
def _(df, go, pl):
    # go and pl come from the imports cell; re-importing them here would
    # define the same name in two cells, which marimo rejects.

    # Group data for visualization
    category_totals = df.group_by("category").agg(
        pl.col("amount").sum().alias("total")
    ).sort("total", descending=True)

    # Create chart using Graph Objects
    fig = go.Figure(data=[
        go.Bar(
            x=category_totals["category"].to_list(),
            y=category_totals["total"].to_list(),
            marker=dict(
                color=category_totals["total"].to_list(),
                colorscale='Blues',
                showscale=False
            )
        )
    ])
    fig.update_layout(
        title="Spending by Category",
        xaxis_title="Category",
        yaxis_title="Total Amount (TWD)",
        showlegend=False
    )
    return fig
Cell guidelines:
- ALWAYS use Plotly Graph Objects (plotly.graph_objects), NOT Plotly Express
- Import go from plotly.graph_objects in the imports cell
- Reference data from previous cells (e.g., df)
- Convert Polars columns to lists using .to_list() before passing to Plotly
- Return the figure object
- Use descriptive titles and axis labels with update_layout()
- One visualization per cell for reactivity
Why Graph Objects over Express:
- No NumPy dependency required
- More control over chart customization
- Explicit data handling with .to_list()
- Better performance with Polars DataFrames
Writing Conclusions
Add a conclusion cell summarizing key findings:
@app.cell
def _():
    import marimo as mo
    mo.md("""
    ## Data Analysis Summary

    **Key Findings:**
    - [Finding 1: e.g., "Category A accounts for 45% of total (12,450 units)"]
    - [Finding 2: e.g., "Peak activity occurs on [day/time] - 2.3x above average"]
    - [Finding 3: e.g., "Largest value: 3,196 in [category] on [date]"]

    **Insights:**
    - [Pattern or trend observed from the data]
    - [Notable anomaly or outlier identified]
    """)
    return
Keep conclusions:
- Brief (3-5 bullet points for findings + 2-3 insights)
- Data-driven (include specific numbers from analysis)
- Actionable (suggest patterns or next steps when relevant)
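To keep findings data-driven, the numbers can be computed rather than typed by hand. The following is a minimal sketch assuming the category totals have already been collected into a plain dict (for example from the grouped DataFrame via `.to_list()`); the helper `format_top_finding` is hypothetical.

```python
def format_top_finding(totals: dict[str, float], unit: str = "units") -> str:
    """Describe the largest category and its share of the grand total."""
    if not totals:
        raise ValueError("totals must contain at least one category")
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("grand total is zero; a share cannot be computed")
    top_name, top_value = max(totals.items(), key=lambda item: item[1])
    share = top_value / grand_total * 100
    return f"- {top_name} accounts for {share:.0f}% of total ({top_value:,.0f} {unit})"
```

The returned line can be pasted into the Key Findings list of the conclusion cell, so the text always matches the figures shown in the charts.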
After creating visualizations, use MCP tools to verify:
- mcp__marimo__get_cell_outputs - View rendered charts
- mcp__marimo__lint_notebook - Validate notebook structure
Marimo MCP Tools Reference
When marimo runs with MCP enabled, these tools are available:
- mcp__marimo__get_marimo_rules - Get marimo best practices
- mcp__marimo__get_active_notebooks - List active sessions and file paths
- mcp__marimo__get_lightweight_cell_map - Preview notebook structure
- mcp__marimo__get_tables_and_variables - Inspect data in session
- mcp__marimo__get_cell_outputs - View visualization results
- mcp__marimo__lint_notebook - Validate changes
Data Format Support
Polars supports these formats natively:
CSV:
df = pl.read_csv("data.csv")
Excel:
df = pl.read_excel("data.xlsx", sheet_name="Sheet1")
Parquet:
df = pl.read_parquet("data.parquet")
JSON:
df = pl.read_json("data.json")
Visualization Selection Guide
Quick reference for choosing chart types:
Categorical comparisons → Bar chart, horizontal bar
Proportions → Pie chart (<7 categories), treemap
Time series → Line chart, area chart
Distributions → Histogram, box plot, violin plot
Relationships → Scatter plot, bubble chart
Correlations → Heatmap
Multi-category → Grouped bar, stacked bar
For detailed examples and customization patterns, see references/plotly-charts.md.
Resources
This skill includes reference documentation for detailed patterns:
- references/polars-patterns.md - Complete Polars data exploration patterns, filtering, transformations, and date operations
- references/plotly-charts.md - Chart type examples, customization patterns, and interactive features