OpenTelemetry Demo Architecture
Quick reference for the OpenTelemetry Demo microservices system. Focus on service dependencies, critical paths, and common failure patterns.
For detailed observability queries: See the Observability Query Guides section below for comprehensive metrics, traces, and logs references.
Service Dependency Matrix
| Service | Language | Depends On | Protocol | Memory Limit |
|---|---|---|---|---|
| frontend | TypeScript | ad, cart, checkout, currency, product-catalog, recommendation, shipping, image-provider | gRPC | 250M |
| checkout | Go | cart, currency, email, payment, product-catalog, shipping, kafka | gRPC, HTTP | 20M |
| cart | .NET | valkey-cart, flagd | - | 160M |
| product-catalog | Go | flagd | - | 20M |
| recommendation | Python | product-catalog, flagd | gRPC | 50M |
| shipping | Rust | quote | HTTP | 20M |
| payment | JavaScript | flagd | - | 120M |
| ad | Java | flagd | gRPC | 300M |
| Ruby | - | - | 100M | |
| currency | C++ | - | - | 20M |
| quote | PHP | - | - | 40M |
| fraud-detection | Kotlin | kafka, flagd | TCP | - |
| product-reviews | Python | product-catalog, llm, postgresql, flagd | gRPC | - |
| accounting | .NET | kafka, postgresql | TCP | - |
| frontend-proxy | Envoy | frontend, flagd, flagd-ui, image-provider | HTTP | 65M |
| image-provider | nginx | - | - | 120M |
| load-generator | Python | frontend-proxy, flagd | HTTP | 120M |
Critical Service Paths
User Request Flow:
Internet → frontend-proxy:ENVOY_PORT → frontend → [cart, product-catalog, recommendation, checkout]
Checkout/Order Flow:
checkout → cart (gRPC)
→ currency (gRPC)
→ payment (gRPC)
→ product-catalog (gRPC)
→ shipping (HTTP) → quote (HTTP)
→ email (HTTP)
→ kafka → [accounting, fraud-detection]
Telemetry Flow:
All services → otel-collector:4317(gRPC)/4318(HTTP) → [jaeger, prometheus, tempo, opensearch]
→ grafana (visualization)
Observability Stack Details
Metrics Pipeline
Services → otel-collector (OTLP) → prometheus (scrape/remote-write) → grafana
Traces Pipeline
Services → otel-collector (OTLP) → [jaeger, tempo] → grafana
Logs Pipeline
Services → docker json-file → alloy → loki → grafana
Log Configuration
All services use json-file driver:
- •Max size: 5M
- •Max files: 2
- •Auto-rotation
Observability Query Guides
Detailed reference guides for querying observability data:
Metrics Guide
See metrics-guide.md for:
- •Complete Prometheus metrics catalog - All 260+ available metrics organized by category
- •Service-specific metrics - Application metrics for each demo service
- •HTTP/RPC metrics - Request rates, latencies, error rates
- •Runtime metrics - Go, .NET, JVM, Node.js, Python runtime instrumentation
- •Container metrics - CPU, memory, network, disk I/O
- •Feature flag metrics - Flag evaluation and impression tracking
- •OTEL Collector metrics - Pipeline health and throughput
- •Common label patterns - Service identification, filtering, aggregation
- •Query patterns - Request rates, error rates, percentiles, top-N
When to use: Building dashboards, investigating performance issues, analyzing resource utilization, monitoring service health.
Traces Guide
See traces-guide.md for:
- •TraceQL syntax and patterns - Complete query language reference
- •Resource attributes - Service, host, process, runtime identification (27 attributes)
- •Span attributes - HTTP, gRPC, database, application-specific (150+ attributes)
- •Event attributes - Exceptions, feature flags, business events
- •Intrinsic attributes - Duration, status, kind, instrumentation
- •Service-specific attributes - Cart, product, payment, shipping, ad service patterns
- •Common query patterns - Error investigation, performance analysis, feature flag impact
When to use: Debugging distributed transactions, investigating latency issues, understanding service dependencies, tracing business transactions.
Logs Guide
See logs-guide.md for:
- •LogQL syntax and patterns - Complete query language reference
- •Available labels - service_name, container, project (5 labels)
- •Service identification - All 16 application and infrastructure services
- •Common query patterns - Errors, HTTP requests, performance, business events
- •Log parsing - JSON parsing, pattern extraction, label filtering
- •Aggregations - Count, rate, percentiles, error percentages
- •Multi-service analysis - Cross-service errors, communication tracing
When to use: Investigating errors, debugging application logic, analyzing request patterns, root cause analysis.