AgentSkillsCN

duckdb-query

对事件数据库执行DuckDB SQL查询。在查询事件、运行数据分析、探索数据模式,或分析数据库内容时使用此功能。当用户请求查询、分析或探索事件数据时,该功能会自动调用。

SKILL.md
--- frontmatter
name: duckdb-query
description: Execute DuckDB SQL queries against the events database. Use when querying events, running analytics, exploring data patterns, or analyzing database contents. Auto-invokes when user asks to query, analyze, or explore event data.
allowed-tools: Bash, Read
argument-hint: [SQL query or analysis request]

DuckDB Query Skill

Execute SQL queries against the DuckDB events database using DuckDB's powerful analytical features.

Database Context

This project uses DuckDB to store events with the following schema:

sql
-- events table
CREATE TABLE events (
    id BIGINT PRIMARY KEY,
    event_type VARCHAR NOT NULL,
    source VARCHAR NOT NULL,
    received_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    payload JSON NOT NULL
);

-- Indexes: idx_events_type, idx_events_received_at

Database Location

The DuckDB database file location is configured in the config file (default: events.duckdb). Check the active config to find the current database file path.

Usage

When the user provides a query or analysis request, execute it against the DuckDB database:

bash
# Direct SQL query
duckdb events.duckdb "SELECT * FROM events LIMIT 10"

# With formatting
duckdb events.duckdb -markdown "SELECT event_type, COUNT(*) as count FROM events GROUP BY event_type"

# With JSON output
duckdb events.duckdb -json "SELECT * FROM events WHERE id = 1"

Query Execution Guidelines

  1. Find the database file: Check config.yaml or .events.yaml for db.file setting
  2. Use appropriate output format:
    • -markdown for tabular results (best for displaying to user)
    • -json for structured data
    • -csv for export
    • Default is ASCII table
  3. Always use LIMIT for exploratory queries to prevent overwhelming output
  4. Quote the SQL query to handle special characters

DuckDB-Specific Features

DuckDB provides powerful SQL extensions beyond standard SQL:

JSON Querying

sql
-- Extract JSON fields from payload
SELECT
    id,
    json_extract(payload, '$.userId') as user_id,
    json_extract(payload, '$.email') as email
FROM events
WHERE event_type = 'user.created';

-- Query nested JSON structures
SELECT json_extract_string(payload, '$.user.name') FROM events;

Time Series Analysis

sql
-- Events per hour with window functions
SELECT
    date_trunc('hour', received_at) as hour,
    COUNT(*) as events,
    LAG(COUNT(*)) OVER (ORDER BY date_trunc('hour', received_at)) as prev_hour
FROM events
GROUP BY hour
ORDER BY hour DESC;

Advanced Aggregations

sql
-- GROUPING SETS for multi-dimensional analysis
SELECT
    event_type,
    source,
    COUNT(*) as count
FROM events
GROUP BY GROUPING SETS (
    (event_type, source),
    (event_type),
    (source),
    ()
);

Data Exploration

sql
-- Quick table summary
SUMMARIZE events;

-- Column statistics
SELECT
    COUNT(DISTINCT event_type) as unique_types,
    COUNT(DISTINCT source) as unique_sources,
    MIN(received_at) as first_event,
    MAX(received_at) as last_event
FROM events;

PIVOT for Report Generation

sql
-- Pivot events by source and type
PIVOT events
ON event_type
USING COUNT(*)
GROUP BY source;

Window Functions with QUALIFY

sql
-- Latest event per type
SELECT event_type, source, received_at, payload
FROM events
QUALIFY ROW_NUMBER() OVER (PARTITION BY event_type ORDER BY received_at DESC) = 1;

Common Query Patterns

List all event types

bash
duckdb events.duckdb -markdown "SELECT DISTINCT event_type FROM events ORDER BY event_type"

Events in time range

bash
duckdb events.duckdb -markdown "SELECT * FROM events WHERE received_at >= '2024-01-01' AND received_at < '2024-02-01' LIMIT 100"

Count by source and type

bash
duckdb events.duckdb -markdown "SELECT source, event_type, COUNT(*) as count FROM events GROUP BY source, event_type ORDER BY count DESC"

Recent events

bash
duckdb events.duckdb -markdown "SELECT id, event_type, source, received_at FROM events ORDER BY received_at DESC LIMIT 20"

Search JSON payload

bash
duckdb events.duckdb -markdown "SELECT * FROM events WHERE json_extract_string(payload, '$.userId') = '123'"

Metadata Queries

sql
-- Show table information
SELECT * FROM information_schema.tables WHERE table_schema = 'main';

-- Show columns
SELECT * FROM information_schema.columns WHERE table_name = 'events';

-- Show indexes
PRAGMA show_tables;

Query Analysis

Use EXPLAIN to analyze query performance:

bash
duckdb events.duckdb "EXPLAIN SELECT * FROM events WHERE event_type = 'user.created'"

# With execution stats
duckdb events.duckdb "EXPLAIN ANALYZE SELECT COUNT(*) FROM events GROUP BY event_type"

Output to User

After executing queries:

  1. Present results in a readable format (markdown tables work well)
  2. Provide context about what the results show
  3. Suggest follow-up queries if patterns emerge
  4. Warn if result set is truncated

Error Handling

If a query fails:

  • Check database file path exists
  • Verify table/column names
  • Ensure JSON paths are valid
  • Check SQL syntax for DuckDB compatibility

Reference

For more DuckDB SQL features, see: https://duckdb.org/docs/stable/

Key documentation sections: