JSONL Item Reader
Quick Start
Extract specific items by ID:
bash
# Single ID
python .claude/skills/read-jsonl/scripts/reader.py data/processed/xsum_group_h.jsonl --ids xsum_0

# Multiple IDs
python .claude/skills/read-jsonl/scripts/reader.py data/processed/xsum_group_h.jsonl --ids xsum_0 xsum_5 xsum_10

# Comma-separated IDs
python .claude/skills/read-jsonl/scripts/reader.py data/processed/xsum_group_h.jsonl --ids "xsum_0,xsum_5,xsum_10"
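The `--ids` option accepts both space-separated and comma-separated values. How `reader.py` handles this internally is not shown here; the following is a minimal sketch of one way to normalize both forms into a single list. The function name `normalize_ids` and the de-duplication step are assumptions made for illustration.

```python
def normalize_ids(raw_ids):
    """Flatten a mix of space-separated and comma-separated ID arguments.

    `raw_ids` is the list of tokens as a command-line parser would deliver
    them, e.g. ["xsum_0", "xsum_5,xsum_10"]. Hypothetical helper, not the
    reader's actual code.
    """
    ids = []
    for token in raw_ids:
        for part in token.split(","):
            part = part.strip()
            # Skip empty pieces (e.g. a trailing comma) and repeated IDs.
            if part and part not in ids:
                ids.append(part)
    return ids


print(normalize_ids(["xsum_0", "xsum_5,xsum_10"]))
# ['xsum_0', 'xsum_5', 'xsum_10']
```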
Display Options
Choose what to display:
bash
# Show all fields (default)
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0

# Show only specific fields
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --fields id text_human

# Show only text fields (human, ai_base, humanized)
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --text-only

# Compact view (metadata only, no text content)
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --compact

# Pretty print with better formatting
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --pretty
Output Formats
bash
# Human-readable (default)
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0

# JSON output
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --format json

# JSON Lines (one item per line)
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --format jsonl

# Export to file
python .claude/skills/read-jsonl/scripts/reader.py file.jsonl --ids xsum_0 --output output.json
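To make the difference between the `json` and `jsonl` formats concrete, here is a small sketch using only the standard `json` module. The `serialize` function is hypothetical, and the exact layout the reader emits (indentation, key order) may differ.

```python
import json


def serialize(items, fmt="json"):
    """Render a list of dicts as one JSON array or as JSON Lines.

    Illustrative only; not the reader's actual implementation.
    """
    if fmt == "json":
        # A single JSON document containing every item.
        return json.dumps(items, indent=2, ensure_ascii=False)
    if fmt == "jsonl":
        # One compact JSON object per line, with no enclosing array.
        return "\n".join(json.dumps(item, ensure_ascii=False) for item in items)
    raise ValueError(f"unsupported format: {fmt}")


items = [{"id": "xsum_0", "text_human": "first"}, {"id": "xsum_5", "text_human": "second"}]
print(serialize(items, "json"))
print(serialize(items, "jsonl"))
```

The JSONL form is convenient when the output feeds another line-oriented tool, since each line can be parsed on its own.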
Common Use Cases
Compare text lengths across stages:
bash
python .claude/skills/read-jsonl/scripts/reader.py \
  data/processed/xsum_group_h_ai_gpt-4o_humanized_gpt-4o.jsonl \
  --ids xsum_0 \
  --fields id text_human text_ai_base text_ai_humanized \
  --stats
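The length comparison reports character and word counts for text fields. A rough sketch of that calculation follows, assuming words are split on whitespace; the reader's own tokenization and output layout may differ, and `length_stats` is a hypothetical name.

```python
def length_stats(item, fields):
    """Return character and word counts for each requested text field.

    Fields that are missing or not strings are skipped. Word counts use
    whitespace splitting, which is an assumption for this sketch.
    """
    stats = {}
    for field in fields:
        text = item.get(field)
        if isinstance(text, str):
            stats[field] = {"chars": len(text), "words": len(text.split())}
    return stats


item = {
    "id": "xsum_0",
    "text_human": "a short example",
    "text_ai_base": "a slightly longer example text",
}
for field, counts in length_stats(item, ["text_human", "text_ai_base", "text_ai_humanized"]).items():
    print(field, counts["chars"], counts["words"])
```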
Inspect error cases:
bash
python .claude/skills/read-jsonl/scripts/reader.py \
  data/processed/xsum_group_h_ai_gpt-4o.jsonl \
  --ids xsum_42 \
  --fields id generation_status error text_ai_base
Extract detection scores:
bash
python .claude/skills/read-jsonl/scripts/reader.py \
  data/processed/xsum_group_h_winston_text_human.jsonl \
  --ids xsum_0 \
  --fields id ai_probability score
Features
The reader provides:
- 🎯 Exact ID matching: extract specific entries by ID
- 📊 Length statistics: character and word counts for text fields
- 🔍 Flexible display: show all fields, specific fields, or text only
- 💾 Multiple formats: human-readable, JSON, or JSONL output
- 📝 Pretty printing: formatted text with line numbers and truncation
- ⚡ Fast lookup: efficient ID-based extraction
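As an illustration of exact ID matching with an efficient lookup, the sketch below streams a JSONL file once, compares each entry's `id` against a set, and stops as soon as every requested ID has been found. This is one plausible approach, not a description of how `reader.py` is implemented; `read_items` is a hypothetical name.

```python
import json


def read_items(path, ids):
    """Return the entries whose `id` exactly equals one of `ids`.

    Results follow the order of `ids`; IDs that are absent from the file
    are silently skipped. Illustrative sketch only.
    """
    wanted = set(ids)
    found = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            item = json.loads(line)
            item_id = item.get("id")
            # Set membership is an exact comparison: "xsum_1" never matches "xsum_10".
            if item_id in wanted and item_id not in found:
                found[item_id] = item
                if len(found) == len(wanted):
                    break  # every requested ID located; stop reading early
    return [found[i] for i in ids if i in found]
```

Calling `read_items("data/processed/xsum_group_h.jsonl", ["xsum_0", "xsum_5"])` would return at most two dictionaries, keeping the first occurrence of each ID.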
Field Groups
Common field selections:
- Metadata: id dataset chunk_type
- Text (Group H): text_human
- Text (Group A): text_human text_ai_base
- Text (Group B): text_human text_ai_base text_ai_humanized
- Generation: generation_status error was_truncated model
- Humanization: humanization_status humanizer_error humanizer_was_truncated humanizer_model
- Detection: ai_probability score status_code
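When post-processing exported items in a script, the field groups above can be kept as a small lookup table and used to project an entry down to the fields of interest. This is a sketch only: the `FIELD_GROUPS` dictionary, its keys, and `select_fields` are hypothetical names, and only the field names themselves come from the list above.

```python
# Hypothetical mapping built from the field groups listed above.
FIELD_GROUPS = {
    "metadata": ["id", "dataset", "chunk_type"],
    "text_group_b": ["text_human", "text_ai_base", "text_ai_humanized"],
    "generation": ["generation_status", "error", "was_truncated", "model"],
    "detection": ["ai_probability", "score", "status_code"],
}


def select_fields(item, fields):
    """Keep only the requested fields, skipping any the entry lacks."""
    return {name: item[name] for name in fields if name in item}


entry = {"id": "xsum_42", "dataset": "xsum", "generation_status": "error", "error": "timeout"}
print(select_fields(entry, FIELD_GROUPS["metadata"] + FIELD_GROUPS["generation"]))
```

Skipping absent fields rather than raising keeps the projection usable across pipeline stages, where later files carry fields that earlier ones do not.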
Integration
For programmatic usage and batch processing, see API.md.
For detailed examples and patterns, see GUIDE.md.
Typical Workflow
bash
# 1. Find problematic IDs with analyzer
python .claude/skills/analyze-jsonl/scripts/analyzer.py data/processed/xsum_ai.jsonl

# 2. Read specific entries to investigate
python .claude/skills/read-jsonl/scripts/reader.py data/processed/xsum_ai.jsonl --ids xsum_42 --pretty

# 3. Compare across pipeline stages
python .claude/skills/read-jsonl/scripts/reader.py data/processed/xsum_humanized.jsonl --ids xsum_42 --text-only