Overview
Telemetría local-first para análisis de uso del CLI Trifecta.
Ubicación: _ctx/telemetry/ en cada segmento
Archivos:
- •
events.jsonl- Log crudo de eventos (rotación 5MB) - •
metrics.json- Contadores acumulados - •
last_run.json- Resumen del último run
Métricas Clave
Latency Percentiles
| Métrica | Significado |
|---|---|
| P50 | Latencia típica del usuario |
| P95 | Early warning de degradación |
| P99 | Tail latencia crítica |
| max_ms | Peor caso observado |
Counters (metrics.json)
json
{
"ctx_build_count": N, // Construcciones de pack
"ctx_search_count": N, // Búsquedas totales
"ctx_search_hits_total": N, // Resultados encontrados
"ctx_search_zero_hits_count": N, // Búsquedas sin resultados
"ctx_get_count": N, // Recuperaciones de contexto
"ctx_get_chunks_total": N, // Chunks entregados
"ctx_get_mode_excerpt_count": N, // Modos excerpt vs raw
"prime_links_included_total": N, // Links en prime
"ast_cache_hit_count": N, // AST cache hits
"ast_cache_miss_count": N, // AST cache misses
"ast_parse_count": N // Total AST parses
}
Event Schema (events.jsonl)
json
{
"ts": "ISO-8601",
"run_id": "run_...",
"segment_id": "...",
"cmd": "ctx.search|ctx.get|ctx.sync|ast.symbols|telemetry.report",
"args": {"query": "...", "limit": N, "segment": "."},
"result": {"status": "ok|error", "hits": N, "error_code": "..."},
"timing_ms": N,
"warnings": [],
"x": {} // Extended metadata (cache_status, spanish_alias, etc.)
}
AST Cache Events
- •
ast.cache.hit- Cache hit with backend info - •
ast.cache.miss- Cache miss - •
ast.cache.write- New entry written - •
ast.cache.lock_wait- Waiting for file lock - •
ast.cache.lock_timeout- Lock acquisition timeout
Report Templates
Template 1: Executive Summary (1-2 min)
markdown
## CLI Usage Summary - [Period] **Commands**: N total | [Top commands by %] **Latency**: P50=Xms, P95=Yms **Errors**: N failures | Top: [error types] **Key Insight**: [Single most important finding]
Template 2: Performance Analysis
markdown
## Performance Report ### Latency Distribution - ctx.search: P50=Xms, P95=Yms, max=Zms - ctx.get: P50=Xms, P95=Yms - ctx.build: P50=Xms ### Search Effectiveness - Hit rate: hits/total = X% - Zero-hit searches: N (Y%) - Top query patterns: [...] ### Pack State - SHA: [hash] | Age: [time] | Stale: [bool]
Template 3: Trend Analysis (Multi-period)
markdown
## Usage Trends [Period 1] vs [Period 2] **Growth**: +X% commands | +Y% active runs **Performance**: P50 changed Xms | P95 changed Yms **Patterns**: [New behaviors, regression warnings]
CLI Commands
bash
# Generate reports trifecta telemetry report -s . --last 30 # Last 30 days trifecta telemetry health -s . # System health check trifecta telemetry export -s . --format json # Export raw data trifecta telemetry chart -s . --type hits # ASCII chart: hits|latency|commands
Analysis Commands (jq)
bash
# Extract metrics from events.jsonl
jq -r '.cmd' events.jsonl | sort | uniq -c | sort -rn
# Average timing by command
jq -s 'group_by(.cmd) | map({cmd: .[0].cmd, avg: (map(.timing_ms) | add / length)})' events.jsonl
# Check for errors
jq 'select(.result.status != "ok")' events.jsonl
# Zero-hit searches rate
jq '[select(.cmd=="ctx.search")] | map(.result.hits==0) | length / length * 100' events.jsonl
# AST cache hit rate
jq '[select(.cmd=="ast.cache.hit")] | length' events.jsonl
jq '[select(.cmd=="ast.cache.miss")] | length' events.jsonl
# Spanish alias recovery events
jq 'select(.cmd=="ctx.search.spanish_alias")' events.jsonl
Red Flags
| Pattern | Meaning | Action |
|---|---|---|
| P95 > 2× P50 | Tail latency degradation | Investigate outliers |
| zero-hit > 40% | Poor search queries | Check query patterns |
| warnings recurring | Systemic issue | Fix root cause |
| pack stale | Context outdated | Rebuild pack |
| ast.cache.lock_timeout > 0 | File lock contention | Review concurrent access |
| cache_hit_rate < 50% | Poor cache utilization | Check cache configuration |
Spanish Aliases Analysis
When analyzing search effectiveness, check for Spanish alias recovery:
bash
# Check alias recovery success rate
jq 'select(.cmd=="ctx.search.spanish_alias" and .result.recovered==true)' events.jsonl
# Compare pass1 vs pass2 hits
jq 'select(.cmd=="ctx.search.spanish_alias") | {query: .args.query_preview, pass1: .result.pass1_hits, pass2: .result.pass2_hits}' events.jsonl
Metrics to track:
- •Recovery rate: % of queries recovered via aliases
- •Hit improvement: Average hit increase from pass1 to pass2
- •Top failed queries: Spanish terms that still return zero hits
Best Practices
- •Start with Executive Summary → Si se necesita detalle, ir a Performance Analysis
- •Compare períodos → Trends > snapshots absolutos
- •Investigate outliers → Un evento malo puede sesgar P95
- •Correlate metrics → latency vs search effectiveness vs errors
- •Check AST cache → Verify cache_hit_count increases with repeated symbol extraction
- •Monitor Spanish aliases → Ensure recovery rate > 60% for Spanish queries
Related Skills
- •
trifecta_dope- Main Trifecta skill for context operations - •
telemetry_analysis/skills/analyze- Concise telemetry report generation
References
- •CLI Telemetry Best Practices
- •P50/P95/P99 Latency Guide
- •Agent Monitoring Patterns
- •Trifecta Documentation:
docs/telemetry/