Loki Logging Skill

What I Do

•Write and optimize LogQL queries (log queries and metric queries)
•Design effective log label strategies with low cardinality
•Configure structured JSON logging with correlation fields
•Create log-based metrics and alerting rules
•Correlate logs with distributed traces using traceID
•Troubleshoot common Loki query errors and performance issues

When to Use Me

Use this skill when you:

•Write, build, or debug LogQL queries for Loki
•Query logs during incident investigation
•Design or refactor log label strategies
•Create dashboards with log-based metrics
•Set up alerts based on log patterns or volumes
•Correlate logs with traces for distributed tracing
•Troubleshoot "maximum of series reached" or timeout errors

LogQL Patterns

Log Queries

logql

# Stream selector with filters
{service_name="api", env="prod"} |= "error" != "timeout"

# JSON parsing and filtering
{job="api"} | json | status >= 500 and method = "POST"

# Logfmt parsing
{job="api"} | logfmt | duration > 10s

# Pattern matching (fast, no regex overhead)
{job="nginx"} | pattern `<ip> - - <_> "<method> <uri> <_>" <status> <size>`

# Regex parsing (flexible)
{job="api"} | regexp `(?P<method>\w+) (?P<path>[\w/]+) \((?P<status>\d+)\)`

# Custom output format
{job="api"} | json | line_format "{{.method}} {{.path}} [{{.status}}]"

# Filter by structured metadata (trace_id); requires structured metadata to be attached at ingestion
{job="api"} | trace_id="abc123" | json | level="error"

# If traceId is a label (not recommended due to cardinality):
{job="api", traceId="abc123"} | json | level="error"

Metric Queries

logql

# Error rate per service
sum by (service) (rate({env="prod"} |= "error" [5m]))

# P95 latency from logs
quantile_over_time(0.95, {job="api"} | json | unwrap latency_ms [5m]) by (service)

# Top endpoints by traffic
topk(10, sum by (path) (rate({job="api"} | json [1h])))

# Detect missing logs
absent_over_time({service="critical-api"}[15m])

Label Design

Good Static Labels

•service_name, env, cluster, namespace, team
•Keep each stream to roughly 5-10 labels, and the total below 100k active streams

Avoid as Labels (Use Structured Metadata)

•Request IDs, trace IDs, user IDs, IP addresses
•Any unbounded/high-cardinality values
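
To see why unbounded values make poor labels, it can help to estimate how many streams a label set could produce. The sketch below assumes the worst case, where every combination of label values occurs. The function names and the budget constant are hypothetical, taken from the 100k guideline above.

```python
from math import prod

# Hypothetical budget, taken from the "< 100k active streams" guideline.
MAX_ACTIVE_STREAMS = 100_000


def estimate_streams(label_values: dict[str, set[str]]) -> int:
    """Worst-case stream count: one stream per combination of label values."""
    return prod(len(values) for values in label_values.values())


def labels_over_budget(label_values: dict[str, set[str]],
                       budget: int = MAX_ACTIVE_STREAMS) -> bool:
    """Return True when the worst-case stream count exceeds the budget."""
    return estimate_streams(label_values) > budget
```

With 20 services, 3 environments, and 4 clusters the estimate is 240 streams. Adding a `user_id` label with only 1,000 distinct values multiplies that to 240,000, which is already over the budget.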

Structured Logging

Required JSON Fields

json

{
  "timestamp": "2024-01-15T10:30:00.000Z",
  "level": "error",
  "service": "user-api",
  "message": "Request failed",
  "traceId": "abc123xyz",
  "spanId": "span456",
  "duration_ms": 1234
}
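
One way an application might emit these fields is a custom formatter for Python's standard `logging` module. This is a minimal sketch, not a prescribed implementation: the class name `JsonFormatter` and the level mapping are assumptions made for illustration, and the correlation fields are expected to arrive through the `extra=` argument of the logging call.

```python
import json
import logging
from datetime import datetime, timezone

# Map Python's level names onto the level set used in this document.
LEVEL_NAMES = {"WARNING": "warn", "CRITICAL": "fatal"}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": LEVEL_NAMES.get(record.levelname, record.levelname.lower()),
            "service": self.service,
            "message": record.getMessage(),
        }
        # Correlation fields are optional; include only those that were supplied.
        for field in ("traceId", "spanId", "duration_ms"):
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload)
```

To use it, attach the formatter to a handler with `handler.setFormatter(JsonFormatter("user-api"))` and log with `logger.error("Request failed", extra={"traceId": "abc123xyz", "duration_ms": 1234})`.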

Log Levels

Use consistent levels: debug, info, warn, error, fatal

Structured Metadata Config (Promtail)

yaml

pipeline_stages:
  - json:
      expressions:
        trace_id: traceId
  - structured_metadata:
      trace_id:
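
For applications that push to Loki directly instead of going through Promtail, the same separation between labels and structured metadata can be expressed in the push request body. The sketch below only builds the JSON body for `POST /loki/api/v1/push`. It assumes a Loki version with structured metadata enabled, and the helper name `build_push_payload` is hypothetical.

```python
import time


def build_push_payload(labels: dict, line: str, metadata: dict, ts_ns=None) -> dict:
    """Build a push body with one stream and one log entry.

    labels   -> indexed stream labels (keep these low-cardinality)
    metadata -> structured metadata attached to the entry (e.g. trace_id)
    """
    if ts_ns is None:
        ts_ns = time.time_ns()
    return {
        "streams": [
            {
                "stream": labels,
                # Each entry: [timestamp in ns as a string, log line, metadata object]
                "values": [[str(ts_ns), line, metadata]],
            }
        ]
    }
```

The result can be serialized with `json.dumps` and sent with any HTTP client. The trace ID travels with the entry without creating a new stream.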

Context7 Integration

For up-to-date Loki documentation:

code

context7_resolve-library-id: grafana loki
context7_query-docs: libraryId=/grafana/loki, query="LogQL metric queries"

Alerting Examples

yaml

# Error rate alert
- alert: HighErrorRate
  expr: |
    sum(rate({env="prod"} |= "error" [5m])) by (service)
      / sum(rate({env="prod"}[5m])) by (service) > 0.05
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: "{{ $labels.service }} error rate exceeds 5%"

# Missing logs alert
- alert: NoLogsReceived
  expr: absent_over_time({service="critical-api"}[15m])
  for: 15m
  labels:
    severity: warning

# Recording rule for metrics
- record: service:request_rate:1m
  expr: sum by (service) (rate({env="prod"} | json [1m]))

Common Errors

Error	Cause	Solution
"maximum of series reached"	Too many time series	Add label filters, aggregate with `sum by`, or trim labels with `| keep` / `| drop`
"context deadline exceeded"	Query timeout	Filter early, narrow time range
High cardinality warning	Label has too many values	Move to structured metadata
Parse errors	Format mismatch	Check format, filter `__error__=""`

Query Debugging

logql

# Find parsing errors
{job="api"} | json | __error__ != ""

# Show only successful parses
{job="api"} | json | __error__ = ""

# Debug which lines fail parsing
{job="api"} | json | __error__ != "" | line_format "error={{.__error__}} line={{__line__}}"
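
When a sample of raw lines is available locally, a quick pre-check can show which ones are likely to fail the `json` parser before any query is run. This is a rough approximation only: it uses Python's `json` module rather than Loki's parser, and it treats anything other than a JSON object as a failure, which is an assumption of this sketch.

```python
import json


def split_by_parse_result(lines):
    """Split sample lines into (parseable, failed) using Python's json module."""
    ok, failed = [], []
    for line in lines:
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            failed.append(line)
            continue
        # Assumption: only JSON objects count as structured log lines.
        if isinstance(parsed, dict):
            ok.append(line)
        else:
            failed.append(line)
    return ok, failed
```

A high share of failed lines usually points to mixed formats in one stream, such as plain-text startup banners next to JSON request logs.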

Labels vs Parsed Fields vs Structured Metadata

Type	Syntax	Cardinality	Use For
Labels	`{label="value"}`	Low (<100 values)	service, env, namespace
Parsed fields	`| json | field="value"`	Any (extracted at query time)	Fields inside the log line, such as status or method
Structured metadata	Configured in pipeline	Medium	traceId, requestId

Performance Tips

•Filter early: Line filters (|=) before parsers
•Specific selectors: Narrow streams with labels first
•Avoid regex: Prefer exact line filters (|=) and the pattern parser over regex matching (|~, regexp)
•Use structured metadata: For trace/request IDs
•Recording rules: Pre-compute expensive metrics
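
The ordering advice above (selector first, then line filters, then the parser, then field filters) can be captured in a small query builder. This is an illustrative sketch: `build_logql` is a hypothetical helper, not part of any Loki client library, and it does not escape quotes inside values.

```python
def build_logql(labels, contains=(), parser=None, field_filters=()):
    """Assemble a LogQL log query with the cheapest stages first.

    labels        -> stream selector, e.g. {"env": "prod"}
    contains      -> substrings for |= line filters, applied before parsing
    parser        -> parser stage name, e.g. "json" or "logfmt"
    field_filters -> label filter expressions applied after parsing
    """
    selector = ", ".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    parts = ["{" + selector + "}"]
    parts += [f'|= "{text}"' for text in contains]
    if parser:
        parts.append(f"| {parser}")
    parts += [f"| {expr}" for expr in field_filters]
    return " ".join(parts)
```

For example, `build_logql({"service_name": "api", "env": "prod"}, contains=["error"], parser="json", field_filters=["status >= 500"])` places the `|= "error"` filter ahead of `| json`, so only lines that contain the substring reach the parser.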

Related Skills

Skill	Relationship
grafana-dashboards	Visualize Loki data
prometheus-alerting	Alert on log-derived metrics
opentelemetry-tracing	Correlate logs with traces

References

Document	Use When
research.md	Deep dive into LogQL patterns and configuration