Querying Session Data
Use ${CLAUDE_PLUGIN_ROOT}/bin/cc-query to analyze Claude Code sessions with SQL (DuckDB).
Query Planning (Read This First)
Each bash call costs ~1-2 seconds (Node.js startup + DuckDB init + JSONL parsing). A session with 20 separate queries wastes 20-40 seconds on overhead alone.
Before running any query:
- •List all questions you need answered
- •Combine them into 1-2 heredoc batches
- •Store results mentally - never re-run the same query
Standard pattern:
cat << 'EOF' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query
-- Query 1
SELECT ...;
-- Query 2
SELECT ...;
-- Query 3
SELECT ...;
EOF
Quick Start
${CLAUDE_PLUGIN_ROOT}/bin/cc-query # All projects
${CLAUDE_PLUGIN_ROOT}/bin/cc-query . # Current project
${CLAUDE_PLUGIN_ROOT}/bin/cc-query ~/code/myproject # Specific project
IMPORTANT: Always use heredoc (not echo). This enables batching:
cat << 'EOF' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query
SELECT count(*) FROM messages;
EOF
Use Cases
Analyzing a specific session: See session-analysis.md for templates and output format.
Reference
Schema details: Run echo '.schema' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query if you need the full schema.
JSON queries: See json-queries.md for working with the message JSON field.
Advanced patterns: See advanced-queries.md for complex analysis query ideas.
Views
Base Views
| View | Description | Has message JSON |
|---|---|---|
messages | All messages (user, assistant, system) | ✓ |
user_messages | User messages with tool results, todos | ✓ |
assistant_messages | Assistant messages with API response data | ✓ |
system_messages | System messages (hooks, retries, tool output) | ✗ (has content) |
human_messages | Only human-typed messages (no tool results) | ✗ (has content) |
raw_messages | Raw JSON (uuid + full JSON string) | ✗ (has raw) |
Convenience Views (pre-extracted, no JSON needed)
| View | Key Fields |
|---|---|
tool_uses | tool_name, tool_id, tool_input (JSON) |
tool_results | tool_use_id, is_error, result_content, duration_ms |
token_usage | input_tokens, output_tokens, cache_read_tokens, model |
bash_commands | command, description, timeout |
file_operations | tool_name, file_path, pattern |
Key Fields
Common: uuid, timestamp, sessionId, message (JSON), type, cwd, version
Derived: isAgent, agentId, project, file, rownum
User-specific: toolUseResult, sourceToolAssistantUUID, todos, isMeta
human_messages only: content (VARCHAR) - extracted text, not JSON
Example Queries
Recent human messages (what did we work on)
SELECT timestamp, left(content, 100) as message FROM human_messages ORDER BY timestamp DESC LIMIT 10;
Recent sessions
SELECT sessionId, min(timestamp) as started, max(timestamp) as ended, count(*) as msgs FROM messages GROUP BY sessionId ORDER BY started DESC LIMIT 5;
Message counts
SELECT type, count(*) as cnt FROM messages GROUP BY type ORDER BY cnt DESC;
Token usage
-- Using convenience view (recommended)
SELECT sum(input_tokens) as input, sum(output_tokens) as output, sum(cache_read_tokens) as cached
FROM token_usage;
-- Or with raw JSON
SELECT sum(CAST(message->'usage'->>'input_tokens' AS BIGINT)) as input,
sum(CAST(message->'usage'->>'output_tokens' AS BIGINT)) as output,
sum(CAST(message->'usage'->>'cache_read_input_tokens' AS BIGINT)) as cache_hits
FROM assistant_messages;
Most used tools
-- Using convenience view (recommended) SELECT tool_name, count(*) as uses FROM tool_uses GROUP BY tool_name ORDER BY uses DESC LIMIT 10; -- Or with raw JSON (json_extract_string required for mixed message types) SELECT json_extract_string(message, '$.content[0].name') as tool, count(*) as uses FROM messages WHERE type = 'assistant' AND json_extract_string(message, '$.content[0].type') = 'tool_use' GROUP BY tool ORDER BY uses DESC LIMIT 10;
Agent vs main session
SELECT isAgent, count(*) as messages, count(DISTINCT agentId) as agents FROM messages GROUP BY isAgent;
Project comparison
SELECT project, count(*) as messages FROM messages GROUP BY project ORDER BY messages DESC;
Tool durations
-- Using convenience view (recommended)
SELECT tool_use_id, duration_ms, left(result_content, 50) as preview
FROM tool_results WHERE duration_ms IS NOT NULL ORDER BY duration_ms DESC LIMIT 10;
-- Or with raw JSON
SELECT CAST(toolUseResult->>'durationMs' AS INTEGER) as duration_ms,
left(message->'content'->0->>'content', 50) as result_preview
FROM user_messages WHERE toolUseResult IS NOT NULL ORDER BY duration_ms DESC LIMIT 10;
Critical Notes
Use convenience views for tool analysis: The tool_uses, tool_results, bash_commands, file_operations, and token_usage views pre-extract common fields so you don't need JSON functions:
SELECT tool_name, count(*) FROM tool_uses GROUP BY tool_name; -- Simple!
Column names are camelCase: Use sessionId, agentId, toolUseResult (NOT session_id, agent_id)
Arrow operators fail on mixed data: When using base views (assistant_messages, messages), arrow operators like message->'content'->0->>'name' cause "Conversion Error" because rows have different JSON structures. Use json_extract_string() or convenience views:
-- FAILS: message->'content'->0->>'name' -- WORKS: json_extract_string(message, '$.content[0].name') -- BEST: SELECT tool_name FROM tool_uses
JSON Access
- •
json_extract_string(message, '$.field')- preferred, handles mixed types - •
message->'field'returns JSON (use only when all rows have same structure) - •
message->>'field'returns string (same caveat) - •
message->'content'->0array access (0-indexed) - •
UNNEST(CAST(json_array AS JSON[]))expands arrays - •Precedence gotcha: Use parentheses with
IS NULL:(message->'usage') IS NOT NULL
Efficiency Guidelines
Target: 1-2 tool calls per analysis. Each bash call = ~1-2 seconds overhead.
| Calls | Overhead | Verdict |
|---|---|---|
| 1-2 | 2-4s | Good |
| 5-10 | 10-20s | Wasteful |
| 20+ | 40s+ | Unacceptable |
Two-Call Pattern for Unknown Sessions
Call 1: Discovery - Find what you're looking for:
cat << 'EOF' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query
-- Recent sessions
SELECT sessionId, project, min(timestamp) as started, max(timestamp) as ended, count(*) as msgs
FROM messages GROUP BY sessionId, project ORDER BY started DESC LIMIT 10;
EOF
Call 2: Deep Analysis - Query the specific session(s):
cat << 'EOF' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query
-- Replace SESSION_ID with ID from Call 1
SELECT count(*) as msgs, count(DISTINCT agentId) as agents FROM messages WHERE sessionId = 'SESSION_ID';
SELECT tool_name, count(*) FROM tool_uses WHERE sessionId = 'SESSION_ID' GROUP BY tool_name ORDER BY count(*) DESC;
SELECT timestamp, left(content, 300) FROM human_messages WHERE sessionId = 'SESSION_ID' ORDER BY timestamp;
EOF
Combine keyword searches with OR
-- BAD: 5 separate queries for each keyword -- GOOD: One query with all patterns SELECT timestamp, left(content, 300) FROM human_messages WHERE lower(content) LIKE '%fix%' OR lower(content) LIKE '%wrong%' OR lower(content) LIKE '%instead%' OR lower(content) LIKE '%prefer%' OR lower(content) LIKE '%should%' ORDER BY timestamp DESC;
Use CTEs for repeated filters
WITH clean_messages AS (
SELECT * FROM human_messages
WHERE content NOT LIKE '%<local-command%'
AND content NOT LIKE '%<command-name%'
AND length(content) > 20
)
SELECT timestamp, content FROM clean_messages
WHERE lower(content) LIKE '%error%';
Never Re-Query
Once you have a result, remember it. Don't run the same query again to "verify" or "check".
Common Analysis Patterns
Cross-Project Overview (Single Call)
cat << 'EOF' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query
-- Project activity
SELECT project, count(*) as msgs, count(DISTINCT sessionId) as sessions,
max(timestamp) as last_activity FROM messages GROUP BY project ORDER BY last_activity DESC LIMIT 10;
-- Tool patterns across all projects
SELECT tool_name, count(*) as uses FROM tool_uses GROUP BY tool_name ORDER BY uses DESC LIMIT 15;
-- Error-prone commands
SELECT left(command, 80) as cmd, count(*) as errors
FROM bash_commands bc JOIN tool_results tr ON bc.tool_id = tr.tool_use_id
WHERE tr.is_error GROUP BY cmd ORDER BY errors DESC LIMIT 10;
EOF
Find user corrections/preferences (for CLAUDE.md insights)
SELECT timestamp, left(content, 400) as content
FROM human_messages
WHERE content NOT LIKE '%<local-command%'
AND content NOT LIKE '%<command-name%'
AND length(content) > 20
AND (
lower(content) LIKE '%instead%'
OR lower(content) LIKE '%wrong%'
OR lower(content) LIKE '%fix%'
OR lower(content) LIKE '%prefer%'
OR lower(content) LIKE '%should%'
OR lower(content) LIKE '%don''t%'
OR lower(content) LIKE '%please%'
)
ORDER BY timestamp DESC;
Weekly Summary (Single Call)
cat << 'EOF' | ${CLAUDE_PLUGIN_ROOT}/bin/cc-query
-- Sessions this week
SELECT date_trunc('day', timestamp) as day, count(DISTINCT sessionId) as sessions, count(*) as msgs
FROM messages WHERE timestamp > now() - INTERVAL '7 days' GROUP BY day ORDER BY day;
-- What we worked on
SELECT left(content, 200) as topic FROM human_messages
WHERE timestamp > now() - INTERVAL '7 days' AND length(content) > 30
AND content NOT LIKE '%<local-command%' ORDER BY timestamp DESC LIMIT 20;
-- Token consumption
SELECT sum(input_tokens) as input, sum(output_tokens) as output FROM token_usage WHERE timestamp > now() - INTERVAL '7 days';
EOF