Messaging & Async Patterns

Description

Design and implement asynchronous communication between services and components using message queues, event buses, and background job systems. This skill covers the universal async patterns — publish/subscribe, work queues, event-driven architecture, and background processing — that apply regardless of whether you use RabbitMQ, Kafka, Azure Service Bus, AWS SQS, or any other messaging system.

When To Use

•An operation takes too long for a synchronous HTTP response (email, PDF generation, video processing)
•Two or more services need to react to the same event without coupling
•You need reliable delivery — the work must happen even if a service is temporarily down
•Designing a microservices architecture and deciding how services communicate
•Implementing retry logic, dead-letter queues, or effectively-once processing (at-least-once delivery combined with idempotent handlers)
•Decoupling a monolith by replacing direct database reads between modules with events

Prerequisites

•Understanding of client-server architecture and HTTP-based APIs
•Familiarity with the concept of eventual consistency
•Basic understanding of distributed systems trade-offs (CAP theorem at a high level)

Instructions

1. Understand the Core Patterns

Pattern	How it works	Use when
Point-to-Point (Work Queue)	One producer → queue → one consumer (competing consumers for scale)	Background jobs: send email, resize image, process payment
Publish/Subscribe	One publisher → topic → many subscribers (each gets a copy)	Events: order placed, user registered, inventory changed
Request/Reply	Requester sends message + reply address → responder answers on the reply queue	Async RPC: long-running operations that need a result
Event Sourcing	State is derived from an append-only log of events	Audit trails, temporal queries, complex domain state

Start with Work Queue or Pub/Sub. Together they cover the large majority of async needs. Event sourcing is powerful but adds significant complexity, so adopt it only when you need its audit or replay properties.
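To make the difference between the first two patterns concrete, here is a minimal in-memory sketch. The `WorkQueue` and `Topic` classes are hypothetical stand-ins for a broker, not any real client API, and round-robin dispatch only approximates how competing consumers pull from a real queue.

```python
from collections import deque
from itertools import cycle
from typing import Callable

Handler = Callable[[dict], None]


class WorkQueue:
    """Point-to-point: each message is handled by exactly one consumer."""

    def __init__(self) -> None:
        self._messages: deque = deque()
        self._consumers: list = []

    def add_consumer(self, handler: Handler) -> None:
        self._consumers.append(handler)

    def send(self, message: dict) -> None:
        self._messages.append(message)

    def drain(self) -> None:
        # Round-robin stands in for competing consumers pulling from a broker.
        workers = cycle(self._consumers)
        while self._messages:
            next(workers)(self._messages.popleft())


class Topic:
    """Publish/subscribe: every subscriber receives its own copy."""

    def __init__(self) -> None:
        self._subscribers: list = []

    def subscribe(self, handler: Handler) -> None:
        self._subscribers.append(handler)

    def publish(self, event: dict) -> None:
        for handler in self._subscribers:
            handler(dict(event))  # shallow copy per subscriber
```

With two consumers on a `WorkQueue`, four jobs are split between them; with two subscribers on a `Topic`, both see every event.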

2. Design Events (Not Commands)

Events describe what happened. Commands describe what to do. Prefer events for inter-service communication:

json

// ✅ Event — describes a fact
{
  "type": "order.placed",
  "data": {
    "orderId": "ord_abc123",
    "customerId": "cust_xyz",
    "total": 149.99,
    "items": [...]
  },
  "metadata": {
    "eventId": "evt_001",
    "timestamp": "2026-02-14T10:30:00Z",
    "source": "order-service",
    "version": 1
  }
}

// ❌ Command — couples producer to consumer's implementation
{
  "action": "sendOrderConfirmationEmail",
  "to": "user@example.com",
  "template": "order-confirm-v2"
}

Why events:

•The producer doesn't need to know who's listening or what they do with it
•New subscribers can be added without changing the producer
•Events are facts — they can be replayed and reprocessed
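A small helper can keep every event in the envelope shape shown above. This is a sketch under assumptions: `make_event` is a hypothetical name, and generating the `eventId` from a UUID is one possible choice rather than something the format requires.

```python
import json
import uuid
from datetime import datetime, timezone


def make_event(event_type: str, data: dict, source: str, version: int = 1) -> dict:
    """Wrap a business fact in a type/data/metadata envelope."""
    return {
        "type": event_type,
        "data": data,
        "metadata": {
            # Unique ID (assumed UUID-based) lets consumers deduplicate.
            "eventId": f"evt_{uuid.uuid4().hex}",
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "source": source,
            "version": version,
        },
    }


event = make_event(
    "order.placed",
    {"orderId": "ord_abc123", "customerId": "cust_xyz", "total": 149.99},
    source="order-service",
)
payload = json.dumps(event)  # serialized form handed to a broker client
```

The producer only states what happened; whether a subscriber sends an email or updates a report is decided entirely on the consuming side.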

3. Guarantee Reliable Delivery

Messages can be lost at multiple points. Handle each:

At the producer:

•Use the Outbox Pattern for transactional consistency: write the event to an outbox table in the same database transaction as the business data, then a separate process publishes from the outbox.
•This prevents the "database committed but message not sent" problem.
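The Outbox Pattern could be sketched as follows, using SQLite from the standard library so the example runs anywhere. The table names, `place_order`, and `relay_outbox` are illustrative assumptions, and `publish` stands for whatever broker client call you actually use.

```python
import json
import sqlite3
from typing import Callable

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(order_id: str, total: float) -> None:
    event = {"type": "order.placed", "data": {"orderId": order_id, "total": total}}
    # One transaction: both rows commit, or neither does.
    with conn:
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)", (order_id, total))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_outbox(publish: Callable[[str], None]) -> int:
    """Publish pending rows; meant to be run periodically by a separate worker."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(payload)  # if this raises, the row stays pending and is retried
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the relay crashes after `publish` but before marking the row, the event is published again on the next run. The outbox therefore gives at-least-once delivery, which is why consumers need to be idempotent.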

At the broker:

•Ensure messages are persisted (durable queues/topics)
•Use acknowledgements: the broker should not remove a message from the queue until the consumer confirms it was processed

At the consumer:

•Process idempotently. Messages may be delivered more than once (at-least-once delivery). Design handlers so that processing the same message twice produces the same result.
•Record handled event IDs in a processed_events table, or rely on a natural idempotency key (such as the order ID) to detect duplicates.
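One way to use a processed_events table is sketched below. The handler name and the loyalty-points side effect are hypothetical; the point is that the deduplication record and the side effect commit in the same transaction, so a redelivered message changes nothing.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
    CREATE TABLE loyalty_points (customer_id TEXT PRIMARY KEY, points INTEGER NOT NULL);
""")


def handle_order_placed(event: dict) -> bool:
    """Apply the event once; return False if it was a duplicate delivery."""
    event_id = event["metadata"]["eventId"]
    customer_id = event["data"]["customerId"]
    try:
        with db:  # dedup record and side effect commit together
            db.execute("INSERT INTO processed_events (event_id) VALUES (?)", (event_id,))
            db.execute(
                "INSERT OR IGNORE INTO loyalty_points (customer_id, points) VALUES (?, 0)",
                (customer_id,),
            )
            db.execute(
                "UPDATE loyalty_points SET points = points + 10 WHERE customer_id = ?",
                (customer_id,),
            )
    except sqlite3.IntegrityError:
        return False  # primary key hit: this event was already processed
    return True
```

This only works as written when the side effect lives in the same database as the table. For external side effects such as sending an email, an idempotency key passed to the downstream system plays the same role.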

4. Handle Failures with Dead-Letter Queues

When processing fails after retries, don't lose the message:

code

Main Queue → Consumer (fails) → Retry Queue (with backoff) → Consumer (fails again)
                                                              ↓
                                                        Dead-Letter Queue
                                                              ↓
                                                    Alert + Manual review

Retry strategy:

•Backoff with growing delays (roughly exponential), for example 1s → 5s → 30s → 5min
•Maximum retries: 3–5 attempts before dead-lettering
•Separate retry queues with a TTL, so that a failed message waits out its delay before being reprocessed
•Dead-letter queue monitoring with alerting — these represent unprocessed work
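The retry strategy above might look like this in a single consumer process. It is a simplified sketch: `process_with_retries` is a hypothetical helper, the dead-letter queue is a plain list, and `sleep` is injected so the delay mechanism can be swapped (for example `time.sleep` locally, or a broker-side retry queue in production, which avoids blocking the consumer).

```python
from typing import Callable

BACKOFF_SECONDS = [1, 5, 30, 300]  # delays taken from the list above


def process_with_retries(
    message: dict,
    handler: Callable[[dict], None],
    dead_letters: list,
    sleep: Callable[[float], None],
) -> bool:
    """Try once, then once more after each delay; dead-letter on final failure."""
    last_error = ""
    attempts = 0
    for attempt in range(len(BACKOFF_SECONDS) + 1):
        attempts = attempt + 1
        try:
            handler(message)
            return True
        except Exception as exc:  # a real handler would separate permanent errors
            last_error = repr(exc)
            if attempt < len(BACKOFF_SECONDS):
                sleep(BACKOFF_SECONDS[attempt])
    # Keep the original message plus context for alerting and manual review.
    dead_letters.append({"message": message, "error": last_error, "attempts": attempts})
    return False
```

Four delays give five attempts in total, which sits at the upper end of the 3–5 range. Storing the last error alongside the message makes the manual review step much faster.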

5. Design for Ordering (When It Matters)

Not all messages need ordering. When they do:

•Partition by entity ID. All events for order_123 go to the same partition/queue, while events for different orders can be processed in parallel.
•Use sequence numbers per entity to detect and handle out-of-order delivery.
•Accept eventual consistency where possible — it dramatically simplifies the system.

When ordering doesn't matter: Stateless operations like sending notifications, generating reports, or processing images independently.
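For the cases where ordering does matter, partitioning by entity ID and per-entity sequence numbers could be sketched like this. `NUM_PARTITIONS`, `partition_for`, and `SequenceGuard` are illustrative names, and sequence numbers are assumed to start at 1 for each entity. A stable hash from `hashlib` is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib

NUM_PARTITIONS = 8  # hypothetical partition count


def partition_for(entity_id: str) -> int:
    """Stable hash so every event for one entity lands on the same partition."""
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class SequenceGuard:
    """Tracks the last applied sequence number per entity."""

    def __init__(self) -> None:
        self._last_seen: dict = {}

    def check(self, entity_id: str, sequence: int) -> str:
        expected = self._last_seen.get(entity_id, 0) + 1
        if sequence < expected:
            return "duplicate"  # already applied, safe to drop
        if sequence > expected:
            return "gap"  # an earlier event is missing; buffer or requeue
        self._last_seen[entity_id] = sequence
        return "apply"
```

A consumer would call `check` before applying each event and only mutate state on `"apply"`. What to do on `"gap"` (buffer, requeue, or fetch the missing event) depends on the system.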

6. Implement Background Jobs

Not every async operation needs a message broker. For internal task scheduling:

•Transactional outbox for critical work (payments, emails)
•Job queue (Sidekiq, BullMQ, Hangfire, Celery) for internal background processing
•Scheduled/cron jobs for periodic tasks (cleanup, reports, sync)

Design job handlers as idempotent, stateless functions with clear inputs: