Microservices Architecture

Overview

Microservices architecture structures an application as a collection of loosely coupled, independently deployable services, each organized around a business capability. Each service owns its data, runs in its own process, and communicates over the network.

The canonical reference is Sam Newman's Building Microservices (O'Reilly, 2nd edition 2021), supplemented by Monolith to Microservices (Newman, 2019) for migration strategies.

The First Rule of Microservices: Don't start with microservices. Start with a monolith, understand your domain, and decompose when you have evidence that the benefits outweigh the operational costs. (See dev/architecture/monoliths for the monolith-first approach.)

Service Decomposition

By Business Capability

Align services to what the business does (e.g., Order Management, Inventory, Payments). This creates stable boundaries because business capabilities change less frequently than technical layers.

By Subdomain (DDD-Aligned)

Use Domain-Driven Design bounded contexts as service boundaries (see dev/architecture/domain-driven-design):

•Core subdomains -- Your competitive advantage; build custom services.
•Supporting subdomains -- Necessary but not differentiating; simpler services or libraries.
•Generic subdomains -- Commodity; buy or use off-the-shelf (auth, email, payments).

Decomposition Heuristics

Heuristic	Description
Single Responsibility	Each service does one thing well
Data ownership	Each service owns its data; no shared databases
Independent deployability	Changing one service does not require deploying another
Team alignment	One team can own and operate the service end-to-end
Bounded context boundary	Service boundaries align with DDD bounded contexts

Inter-Service Communication

Synchronous Communication

Pattern	Protocol	When to Use
Request/Response (REST)	HTTP/JSON	Simple CRUD, external APIs, broad tooling support
Request/Response (gRPC)	HTTP/2 + Protobuf	Internal service-to-service; high throughput, strong typing, streaming
GraphQL	HTTP/JSON	Client-driven queries; aggregating multiple services for a frontend

Asynchronous Communication

Pattern	Mechanism	When to Use
Event Notification	Message broker (topic/pub-sub)	Decoupled notification; consumers decide what to do
Event-Carried State Transfer	Message broker with payload	Reduce synchronous callbacks; consumer has needed data
Command Message	Message broker (queue)	Tell a specific service to do something
Async Request/Response	Correlation ID + reply queue	Need a response but don't want to block

Rule of thumb: Prefer asynchronous communication for inter-service calls. Use synchronous only when a real-time response is required (e.g., user-facing request/response).

Communication Anti-Patterns

•Distributed monolith -- Services are "microservices" in name only; they deploy together, share databases, or cannot function independently.
•Chatty interfaces -- Excessive synchronous calls between services creating latency chains.
•Shared database -- Multiple services reading/writing the same tables destroys independent deployability.

API Gateway

An API gateway sits between external clients and internal services, providing:

•Request routing -- Routes client requests to the appropriate microservice
•Protocol translation -- External REST to internal gRPC, for example
•Authentication/Authorization -- Centralized security enforcement
•Rate limiting and throttling -- Protect services from traffic spikes
•Response aggregation -- Combine responses from multiple services for a single client call

Common implementations: Kong, AWS API Gateway, Azure API Management, Envoy, NGINX, Ocelot (.NET).

Service Mesh

A service mesh handles service-to-service networking concerns transparently via sidecar proxies:

code

┌──────────────────────┐    ┌──────────────────────┐
│  Service A           │    │  Service B           │
│  ┌────────────────┐  │    │  ┌────────────────┐  │
│  │ App Container  │  │    │  │ App Container  │  │
│  └───────┬────────┘  │    │  └───────▲────────┘  │
│          │           │    │          │           │
│  ┌───────▼────────┐  │    │  ┌───────┴────────┐  │
│  │ Sidecar Proxy  │──┼────┼─▶│ Sidecar Proxy  │  │
│  │ (Envoy)        │  │    │  │ (Envoy)        │  │
│  └────────────────┘  │    │  └────────────────┘  │
└──────────────────────┘    └──────────────────────┘
         Control Plane (Istio / Linkerd)

Capabilities: Mutual TLS, traffic management, retries, circuit breaking, observability (distributed tracing, metrics), canary deployments.

Implementations: Istio, Linkerd, Consul Connect, AWS App Mesh.

Saga Pattern -- Distributed Transactions

Since each microservice owns its data, distributed transactions (2PC) are impractical. The saga pattern manages data consistency across services through a sequence of local transactions with compensating actions.

Choreography (Event-Driven)

Each service publishes events that trigger the next step. No central coordinator.

code

Order Service ──(OrderCreated)──▶ Payment Service
Payment Service ──(PaymentProcessed)──▶ Inventory Service
Inventory Service ──(InventoryReserved)──▶ Shipping Service

On failure:
Inventory Service ──(ReservationFailed)──▶ Payment Service (refund)
Payment Service ──(RefundProcessed)──▶ Order Service (cancel)

Pros: Simple, decoupled, no single point of failure. Cons: Hard to understand the overall flow; debugging is difficult; risk of cyclic dependencies.

Orchestration (Central Coordinator)

A saga orchestrator (process manager) coordinates the steps explicitly.

code

┌─────────────────┐
│ Saga Orchestrator│
│ (Order Saga)     │
└────┬───┬───┬────┘
     │   │   │
     ▼   ▼   ▼
  Payment  Inventory  Shipping
  Service  Service    Service

Pros: Clear flow, easier to understand and debug, centralized compensation logic. Cons: Orchestrator is a coupling point; risk of becoming a "god service."

Guidance: Use choreography for simple sagas (2-3 steps). Use orchestration for complex flows (4+ steps or complex compensation).

Distributed Data Management

Pattern	Description
Database per Service	Each service has its own database; no shared access
API Composition	Query multiple services and aggregate results
CQRS	Separate read and write models for different optimization (see `dev/architecture/event-driven`)
Event Sourcing	Store state changes as events; derive current state (see `dev/architecture/event-driven`)
Saga	Manage distributed transactions through compensating actions
Outbox Pattern	Reliably publish events by writing to a local outbox table within the same transaction

Service Discovery

Services need to find each other in a dynamic environment where instances come and go.

Approach	Examples	Mechanism
Client-side discovery	Netflix Eureka, Consul	Client queries registry, picks instance
Server-side discovery	AWS ALB, Kubernetes Services	Load balancer/proxy routes to available instance
DNS-based	Consul DNS, Kubernetes CoreDNS	Resolve service name to IP(s) via DNS

In Kubernetes environments, server-side discovery via Services and DNS is the default and usually sufficient.

Resilience Patterns

Pattern	Purpose
Circuit Breaker	Stop calling a failing service; fail fast and allow recovery
Retry with Backoff	Retry transient failures with exponential backoff and jitter
Bulkhead	Isolate failures to prevent cascading (separate thread pools / connections)
Timeout	Set explicit timeouts on all remote calls; never wait forever
Fallback	Provide degraded but functional response when a service is unavailable
Health Check	Expose liveness and readiness endpoints for orchestrators

When NOT to Use Microservices

Microservices introduce significant operational complexity. Do not use them when:

•Your team is small (< 8-10 developers) -- The overhead exceeds the benefit.
•Your domain is not well understood -- You will draw the wrong boundaries and create a distributed monolith.
•You lack operational maturity -- You need CI/CD, monitoring, distributed tracing, container orchestration, and on-call practices before microservices are viable.
•Latency is critical -- Every network hop adds latency; monoliths have zero network overhead for internal calls.
•Strong consistency is required everywhere -- Microservices embrace eventual consistency; if your domain requires ACID transactions across multiple entities, a monolith may be simpler.
•You are building an MVP or prototype -- Speed of iteration matters more than scalability at this stage.

Tradeoffs Summary

Benefit	Cost
Independent deployability	Operational complexity (CI/CD per service, monitoring, tracing)
Technology heterogeneity	Polyglot overhead; harder to maintain standards
Team autonomy	Coordination overhead; contract management
Scalability per service	Network latency; serialization/deserialization cost
Fault isolation	Distributed failure modes (partial failures, network partitions)
Organizational alignment	Requires mature DevOps culture

Best Practices

•Design for failure from day one: circuit breakers, retries, timeouts, bulkheads.
•Own your data: one database per service, no shared database access.
•Make inter-service communication observable: distributed tracing (OpenTelemetry), centralized logging, metrics.
•Use consumer-driven contract testing (Pact, Spring Cloud Contract) to prevent breaking changes.
•Prefer asynchronous communication; use synchronous calls only when necessary.
•Keep services small enough to be owned by a single team, but large enough to justify the operational overhead.
•Deploy independently, test independently, fail independently.