Dump Agent Data for rg
[Created by Claude: 7f04f921-3ad9-48b4-a951-f8227c466a3e]
Relationship to Other Skills
Used by: search-agent-conversation (this is the prerequisite)
This skill provides technical details about the dump tool. For search strategies and when to use this tool, see search-agent-conversation.
❌ CRITICAL WARNING
NEVER grep sse_lines.jsonl directly!
✅ Always use this dump tool first, then rg the output.
Quick Start
# Last hour (both agents, auto-generated unique dir)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --start-hour 1 --end-hour 0

# Last 24 hours, codex only
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --start-hour 24 --end-hour 0 --agent codex

# Precise time window (local naive time)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --since-time "2026-01-23 10:00:00" --to-time "2026-01-23 12:00:00"

# From specific time to NOW (note the printed --to-time!)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --since-time "2026-01-23 10:00:00"
Sample Output
Time window: 2026-01-23 12:00:00 → 2026-01-23 12:42:47 (42m 47s)
# --to-time defaulted to: 2026-01-23 12:42:47 (use this for next --since-time)
Output: /tmp/agent-dump-conversations/codex-and-claude-0.7h-019be960-66bc-7b2d-a0d3-3ed0f1d36221
Codex 0 sessions → /tmp/agent-dump-conversations/.../codex/
Claude 1 session → /tmp/agent-dump-conversations/.../claude/
codex-and-claude-0.7h-019be960-66bc-7b2d-a0d3-3ed0f1d36221/
├── codex/
└── claude/
    └── 7f04f921-3ad9-48b4-a951-f8227c466a3e-0/
        └── conversation.txt
# Search conversations:
rg "vscode extension" /tmp/agent-dump-conversations/codex-and-claude-0.7h-...
# Reminder: Do NOT search sse_lines.jsonl directly, use this dump instead!
Total size: 0.11 MB
Advanced Usage: Git Diff Workflow
⚠️ MUST use --since-time + --to-time for this workflow
WHY: Relative hours (--start-hour) are measured back from the moment each dump runs, so consecutive dumps never share an exact boundary, and data is lost between them. If a dump takes 10 seconds, you lose 10 seconds of conversations between runs.
Step-by-Step
# Step 1: Initial 24h dump (note the exact --to-time printed!)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --since-time "2026-01-22 12:00:00" \
  --to-time "2026-01-23 12:00:00"
# Output shows: Time window: 2026-01-22 12:00:00 → 2026-01-23 12:00:00

cd /tmp/agent-dump-conversations/codex-and-claude-24h-xxxxx
git init && git add . && git commit -m "24h baseline"

# Step 2: Later dump (use EXACT --to-time from previous as --since-time)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --since-time "2026-01-23 12:00:00"   # Omit --to-time = dumps up to NOW
# Output shows: # --to-time defaulted to: 2026-01-23 14:30:15 (use this for next --since-time)

# Copy new data into the git repo, keeping the codex/ and claude/ layout
NEW_DUMP=/tmp/agent-dump-conversations/codex-and-claude-...   # use the Output path printed by Step 2
cp -r "$NEW_DUMP"/codex/. codex/
cp -r "$NEW_DUMP"/claude/. claude/

# Step 3: the staged diff shows only NEW conversations (no loss!)
# Stage first: a plain git diff ignores untracked files, so new session directories would not appear
git add -A && git diff --cached
Key insight: The printed --to-time defaulted to: message tells you the exact cutoff. Use that value as --since-time for the next dump to ensure no gaps.
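The hand-off between dumps can also be scripted. The following is a minimal sketch, assuming the tool prints the `--to-time defaulted to:` line in the format shown under Sample Output. `parse_defaulted_to_time` is a hypothetical helper for illustration, not part of the dump tool.

```python
import re
from typing import Optional

# Matches the line shown in Sample Output, e.g.
#   # --to-time defaulted to: 2026-01-23 12:42:47 (use this for next --since-time)
_TO_TIME_RE = re.compile(
    r"--to-time defaulted to:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def parse_defaulted_to_time(output: str) -> Optional[str]:
    """Return the cutoff timestamp printed by the dump tool, or None if absent."""
    match = _TO_TIME_RE.search(output)
    return match.group(1) if match else None


# Sample text copied from the Sample Output section of this document.
sample = (
    "Time window: 2026-01-23 12:00:00 → 2026-01-23 12:42:47 (42m 47s)\n"
    "# --to-time defaulted to: 2026-01-23 12:42:47 (use this for next --since-time)\n"
)
next_since = parse_defaulted_to_time(sample)
print(f'--since-time "{next_since}"')
```

Returning `None` when the line is missing lets the caller stop instead of silently starting the next dump from a guessed boundary.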
Script Locations
Wrapper (recommended)
~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py
Individual scripts
~/swe/telemetry_projects/agent_dump/launchers/dump_codex_conversations.py
~/swe/telemetry_projects/agent_dump/launchers/dump_claude_conversations.py
Symlinks exist in both projects
~/swe/telemetry_projects/codex_sqlite/launchers/dump_conversations.py
~/swe/telemetry_projects/claude_sqlite/launchers/dump_conversations.py
Command Reference
Time Modes (mutually exclusive)
| Mode | Flags | Example |
|---|---|---|
| Relative | --start-hour + --end-hour | --start-hour 24 --end-hour 0 (last 24h) |
| Precise | --since-time [+ --to-time] | --since-time "2026-01-23 10:00:00" |
Time Formats
| Flag | Format | Default | Example |
|---|---|---|---|
| --start-hour | Float | Required | 24, 1.5 |
| --end-hour | Float | Required | 0, 0.5 |
| --since-time | YYYY-MM-DD HH:MM:SS | Required | "2026-01-23 10:00:00" |
| --to-time | YYYY-MM-DD HH:MM:SS | now | "2026-01-23 12:00:00" |
Note: Times are local naive (no timezone suffix).
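To turn a relative window into the precise flags, the two modes can be related as below. This is a sketch under an assumption inferred from the "last 24h" example: `--start-hour N` and `--end-hour M` each mean "that many hours before now". `relative_to_precise` is a hypothetical helper, not part of the dump tool.

```python
from datetime import datetime, timedelta
from typing import Tuple

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # local naive time, no timezone suffix


def relative_to_precise(
    start_hour: float, end_hour: float, now: datetime
) -> Tuple[str, str]:
    """Convert hours-before-now into (--since-time, --to-time) strings.

    Assumes both values count hours back from `now` (an inference from the
    "--start-hour 24 --end-hour 0 (last 24h)" example, not confirmed behavior).
    """
    if start_hour <= end_hour:
        raise ValueError("start_hour must be further in the past than end_hour")
    since = now - timedelta(hours=start_hour)
    to = now - timedelta(hours=end_hour)
    return since.strftime(TIME_FORMAT), to.strftime(TIME_FORMAT)


now = datetime(2026, 1, 23, 12, 0, 0)
since, to = relative_to_precise(24, 0, now)
print(f'--since-time "{since}" --to-time "{to}"')
```

Passing `now` explicitly keeps both boundaries tied to a single instant, which is the property the precise mode relies on.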
Other Flags
| Flag | Description | Example |
|---|---|---|
| --agent | Which agent(s): codex, claude, both (default: both) | --agent codex |
| --data-dir | Custom output directory (default: auto-generated) | --data-dir /tmp/my-dump |
| --sid | Filter by session ID suffix (can specify multiple) | --sid abc123 --sid def456 |
| --include-reasoning | Include thinking/reasoning blocks (disabled by default) | --include-reasoning |
| -q, --quiet | Suppress output | -q |
About Reasoning/Thinking Content
Disabled by default because reasoning is transient and exploratory:
- Agents exploring possibilities, not final decisions
- Often discarded or revised in final output
- Can add noise when searching for what agents actually did
When to include (--include-reasoning):
- Debugging agent decision-making process
- Understanding why agents chose specific approaches
- Analyzing agent behavior patterns
Note: Tool calls are included by default (they show what agents actually executed).
Output Structure
/tmp/agent-dump-conversations/codex-and-claude-24h-{uuid}/
├── codex/
│   ├── {sid}-{pid}/
│   │   └── conversation.txt
│   └── ...
└── claude/
    ├── {sid}-{pid}/
    │   └── conversation.txt
    └── ...
Each conversation.txt contains:
- Agent type (CODEX or CLAUDE)
- Session ID and PID
- Round-by-round conversations
- User prompts
- Assistant responses
- Timestamps
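When a script needs to enumerate the dump rather than search it, the layout above can be walked directly. This is a minimal sketch, assuming directory names follow the `{sid}-{pid}` pattern shown under Output Structure. `iter_conversations` is a hypothetical helper, and the demo builds a throwaway directory instead of reading a real dump.

```python
import tempfile
from pathlib import Path
from typing import Iterator, Tuple


def iter_conversations(dump_dir: Path) -> Iterator[Tuple[str, str, str, Path]]:
    """Yield (agent, sid, pid, path) for every conversation.txt in a dump."""
    for agent in ("codex", "claude"):
        agent_dir = dump_dir / agent
        if not agent_dir.is_dir():
            continue  # e.g. a dump made with --agent codex has no claude/ data
        for path in sorted(agent_dir.glob("*/conversation.txt")):
            # Session IDs contain hyphens, so split on the last hyphen only.
            sid, _, pid = path.parent.name.rpartition("-")
            yield agent, sid, pid, path


# Demo on a temporary directory that mimics the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    session = root / "claude" / "7f04f921-3ad9-48b4-a951-f8227c466a3e-0"
    session.mkdir(parents=True)
    (session / "conversation.txt").write_text("placeholder\n")
    for agent, sid, pid, path in iter_conversations(root):
        print(agent, sid, f"pid={pid}", path.name)
```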
⚠️ IMPORTANT: Direct SQLite Querying Protocol
Agents are strongly discouraged from querying the SQLite databases directly, especially for metadata-only queries. Use the dump tool instead.
If You Must Query SQLite Directly
When querying ~/centralized-logs/sqlite-dbs/codex-rounds.sqlite or ~/centralized-logs/sqlite-dbs/claude-rounds.sqlite:
Timeout Rules:
- Maximum timeout: 16 seconds
- On each timeout: Halve the timeout (16s → 8s → 4s → 2s → 1s)
- After multiple timeouts: Stop and report the issue
Background Execution (Recommended):
- Run queries in a background terminal if possible
- Background terminals must also comply with the timeout rule
- Use timeout 16s sqlite3 ... to enforce limits
Example:
# With timeout protection
timeout 16s sqlite3 ~/centralized-logs/sqlite-dbs/codex-rounds.sqlite "SELECT COUNT(*) FROM sessions"

# If timeout occurs, retry with 8s
timeout 8s sqlite3 ~/centralized-logs/sqlite-dbs/codex-rounds.sqlite "SELECT COUNT(*) FROM sessions"
Why this matters: SQLite can lock under concurrent writes. The dump tool is optimized for safe, read-only access to conversation data.
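The halving rule can be automated instead of retried by hand. The sketch below is one possible reading of the Timeout Rules, using Python's `subprocess` timeout in place of the `timeout` shell command. `run_with_halving_timeout` and its `runner` parameter are hypothetical names, and the demo runs a harmless Python one-liner, not a real database query.

```python
import subprocess
import sys
from typing import Callable, List, Optional


def run_with_halving_timeout(
    cmd: List[str],
    max_timeout: float = 16.0,
    min_timeout: float = 1.0,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Optional[subprocess.CompletedProcess]:
    """Run cmd, halving the timeout after each expiry (16s, 8s, 4s, 2s, 1s).

    Returns the completed process, or None once every attempt has timed out,
    at which point the caller should stop and report the issue.
    """
    timeout = max_timeout
    while timeout >= min_timeout:
        try:
            return runner(cmd, capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            timeout /= 2  # per the protocol: halve on each timeout
    return None


# Demo with a command that finishes immediately.
result = run_with_halving_timeout([sys.executable, "-c", "print('ok')"])
print(result.stdout.strip() if result else "all attempts timed out")
```

For a real query, `cmd` would be the `sqlite3` invocation from the example above, passed as a list of arguments.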
🚀 FAST QUERY: --prompt-only Mode
When you only need user prompts with timestamps, use --prompt-only instead of querying SQLite directly:
# Get all prompts from last 24 hours (JSON to stdout)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --start-hour 24 --end-hour 0 --prompt-only

# Get codex-only prompts from precise time window
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
  --since-time "2026-01-23 10:00:00" --agent codex --prompt-only
Example Output
[
  {
    "agent": "claude",
    "sid": "03d9e752-394b-477b-84d5-b7770322c09f",
    "pid": 0,
    "prompt_count": 3,
    "user_prompts": [
      {"t": "2026-01-23 12:00:01", "prompt": "Upgrade the repo to have more observability"},
      {"t": "2026-01-23 12:15:30", "prompt": "Add tests for the new feature"},
      {"t": "2026-01-23 12:45:00", "prompt": "Fix the failing CI"}
    ]
  },
  {
    "agent": "codex",
    "sid": "019be4a6-fa69-7d91-a963-ecca7d9185c6",
    "pid": 12351,
    "prompt_count": 2,
    "user_prompts": [
      {"t": "2026-01-23 11:30:00", "prompt": "Port the documentation"},
      {"t": "2026-01-23 12:00:00", "prompt": "Update the changelog"}
    ]
  }
]
Use Cases
| Task | Use --prompt-only |
|---|---|
| List all agents that worked on project X | ✅ Yes - Parse JSON, filter by prompt keywords |
| Get timeline of user requests | ✅ Yes - Sort by timestamp |
| Find sessions with specific prompts | ✅ Yes - Much faster than full dump |
| Search assistant responses or tool calls | ❌ No - Use full dump + rg |
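The first three rows of the table reduce to a few lines of JSON handling. Here is a minimal sketch, assuming the output has the shape shown under Example Output. `sessions_matching` and `timeline` are hypothetical helpers, and the inline `raw` string stands in for what `--prompt-only` would write to stdout.

```python
import json
from typing import Dict, List, Tuple

# Stand-in for the stdout of a --prompt-only run (trimmed from Example Output).
raw = """
[
  {"agent": "claude", "sid": "03d9e752-394b-477b-84d5-b7770322c09f", "pid": 0,
   "prompt_count": 2, "user_prompts": [
     {"t": "2026-01-23 12:00:01", "prompt": "Upgrade the repo to have more observability"},
     {"t": "2026-01-23 12:45:00", "prompt": "Fix the failing CI"}]},
  {"agent": "codex", "sid": "019be4a6-fa69-7d91-a963-ecca7d9185c6", "pid": 12351,
   "prompt_count": 2, "user_prompts": [
     {"t": "2026-01-23 11:30:00", "prompt": "Port the documentation"},
     {"t": "2026-01-23 12:00:00", "prompt": "Update the changelog"}]}
]
"""


def sessions_matching(sessions: List[Dict], keyword: str) -> List[Dict]:
    """Keep sessions where any user prompt contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        s for s in sessions
        if any(needle in p["prompt"].lower() for p in s["user_prompts"])
    ]


def timeline(sessions: List[Dict]) -> List[Tuple[str, str, str]]:
    """Flatten all prompts into (timestamp, agent, prompt), oldest first.

    "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain text.
    """
    rows = [
        (p["t"], s["agent"], p["prompt"])
        for s in sessions
        for p in s["user_prompts"]
    ]
    return sorted(rows)


sessions = json.loads(raw)
for s in sessions_matching(sessions, "changelog"):
    print(s["agent"], s["sid"])
for t, agent, prompt in timeline(sessions):
    print(t, agent, prompt)
```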
⚠️ IMPORTANT: Use This Instead of SQLite Queries!
When asked to "find agents that participated in X" or "get all user prompts", ALWAYS use --prompt-only rather than querying SQLite directly:
- ✅ python dump_conversations.py --start-hour 72 --end-hour 0 --prompt-only
- ❌ sqlite3 codex-rounds.sqlite "SELECT sid, prompt FROM rounds WHERE..."
The --prompt-only flag:
- No files written (JSON to stdout)
- Much faster than full dump
- No risk of SQLite lock issues
- Easy to parse with jq or Python
Basic rg Commands (Quick Reference)
# Basic search
rg "keyword" /tmp/agent-dump-conversations/codex-and-claude-*/

# Search only codex sessions
rg "keyword" /tmp/agent-dump-conversations/codex-and-claude-*/codex/

# List matching files only
rg "keyword" -l /tmp/agent-dump-conversations/codex-and-claude-*/

# Search with context
rg -C 5 "keyword" /tmp/agent-dump-conversations/codex-and-claude-*/

# Case-insensitive
rg -i "keyword" /tmp/agent-dump-conversations/codex-and-claude-*/
For search strategies and when to regenerate data, see the search-agent-conversation skill.