Microservices and API Gateway Patterns
How microservices decompose monoliths into independently deployable services, how API gateways and service meshes manage cross-cutting concerns, and when the Backend for Frontend pattern simplifies client integration.
Terminology
- Monolith: a single deployable unit containing all application logic, where components share the same process, memory, and database
- Microservice: a small, independently deployable service that owns a specific business capability and communicates with other services over the network
- API gateway: a single entry point that sits between external clients and internal microservices, handling routing, authentication, rate limiting, and protocol translation
- Service mesh: an infrastructure layer of lightweight proxies (sidecars) deployed alongside each service instance, handling service-to-service communication concerns like load balancing, retries, and observability
- Sidecar proxy: a proxy process that runs alongside a service instance (in the same pod or host), intercepting all inbound and outbound network traffic transparently
- BFF (Backend for Frontend): a pattern where each client type (web, mobile, IoT) gets its own dedicated backend service that aggregates and transforms data from downstream microservices
- API composition: a pattern where a service (often the API gateway or a BFF) calls multiple downstream services and combines their responses into a single response for the client
- Service discovery: the mechanism by which services find the network addresses of other services, either through a registry (Consul, etcd) or DNS-based resolution
- Circuit breaker: a pattern that prevents a service from repeatedly calling a failing downstream service; after a threshold of failures, the circuit "opens" and requests fail fast without attempting the call
- Bounded context: a domain-driven design concept that defines the boundary within which a particular domain model applies, often used to determine microservice boundaries
- Saga: a pattern for managing distributed transactions across multiple services using a sequence of local transactions with compensating actions for rollback
- Strangler fig: a migration pattern where a monolith is incrementally replaced by microservices, routing traffic to new services as they are built while the monolith shrinks over time
- Idempotency key: a unique identifier attached to a request that allows the receiver to detect and safely ignore duplicate requests
- Fan-out: a pattern where a single request triggers calls to multiple downstream services in parallel
What & Why
A monolithic application starts simple. All code lives in one repository, deploys as one artifact, and shares one database. For small teams and early-stage products, this is the right choice. But as the application grows, the monolith becomes painful. A change to the checkout flow requires redeploying the entire application. A memory leak in the recommendation engine takes down the payment system. The team cannot scale the search service independently from the user profile service.
Microservices address these problems by decomposing the application into small, independently deployable services. Each service owns a specific business capability (orders, payments, inventory, notifications), has its own database, and communicates with other services over the network. Teams can develop, test, deploy, and scale each service independently.
The trade-off is complexity. A monolith has one deployment, one database, and function calls between components. Microservices have dozens or hundreds of deployments, distributed data, and network calls that can fail, time out, or arrive out of order. You need infrastructure to handle routing, authentication, retries, circuit breaking, observability, and service discovery. This is where API gateways and service meshes come in.
An API gateway is the front door. External clients (browsers, mobile apps) talk to the gateway, which routes requests to the appropriate internal services. The gateway handles cross-cutting concerns that every request needs: authentication, rate limiting, request logging, and protocol translation (REST to gRPC, for example).
A service mesh handles the same concerns for internal service-to-service traffic. Instead of embedding retry logic, circuit breakers, and mutual TLS into every service, a sidecar proxy (like Envoy) is deployed alongside each service instance. The proxy intercepts all traffic and applies policies consistently, without the service code knowing about it.
How It Works
Monolith vs Microservices
In a monolith, components communicate through in-process function calls and share one database; in a microservice architecture, services communicate over the network and each owns its own data store. The monolith gains call-site simplicity; the microservices gain independent deployment, fault isolation, and per-service scaling.
API Gateway Pattern
The API gateway sits at the edge of the system. External clients send all requests to the gateway, which handles:
- Routing: maps external URLs to internal service endpoints (e.g., /api/orders routes to the Orders service).
- Authentication: validates tokens or API keys before forwarding requests, so individual services do not need to implement auth.
- Rate limiting: enforces request quotas per client or API key.
- API composition: for requests that need data from multiple services, the gateway calls them in parallel and merges the responses.
- Protocol translation: accepts REST from external clients and translates to gRPC or other internal protocols.
Backend for Frontend (BFF)
A single API gateway serving both web and mobile clients often leads to bloated responses. The web app needs detailed product data with reviews, while the mobile app needs a compact summary. The BFF pattern creates a dedicated backend for each client type. The web BFF aggregates and formats data for the web app. The mobile BFF does the same for mobile, returning smaller payloads optimized for bandwidth constraints.
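The per-client shaping a BFF performs can be sketched with a pair of view functions over the same downstream payload. This is a minimal illustration, not a full BFF service; the product fields and payload shape are hypothetical.

```python
# Hypothetical payload returned by a downstream Product service.
PRODUCT = {
    "id": "p-42",
    "name": "Trail Running Shoe",
    "price_cents": 12999,
    "description": "A long marketing description...",
    "reviews": [{"rating": 5, "text": "Great grip"}, {"rating": 4, "text": "Runs small"}],
    "images": ["full-res-1.jpg", "full-res-2.jpg"],
}

def web_bff_view(product: dict) -> dict:
    """Web BFF: rich payload with full description, reviews, and all images."""
    return {
        "id": product["id"],
        "name": product["name"],
        "price": product["price_cents"] / 100,
        "description": product["description"],
        "reviews": product["reviews"],
        "images": product["images"],
    }

def mobile_bff_view(product: dict) -> dict:
    """Mobile BFF: compact summary optimized for bandwidth constraints."""
    ratings = [r["rating"] for r in product["reviews"]]
    return {
        "id": product["id"],
        "name": product["name"],
        "price": product["price_cents"] / 100,
        "avg_rating": round(sum(ratings) / len(ratings), 1) if ratings else None,
        "thumbnail": product["images"][0] if product["images"] else None,
    }
```

The mobile view collapses the review list into a single average rating and sends one thumbnail instead of full-resolution images, which is exactly the kind of transformation a shared gateway cannot do well for every client at once.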
Service Mesh
A service mesh deploys a sidecar proxy alongside every service instance. All traffic flows through the proxy, which provides:
- Mutual TLS: encrypts all service-to-service traffic without application code changes.
- Load balancing: distributes requests across instances of the target service.
- Retries and timeouts: automatically retries failed requests with configurable backoff.
- Circuit breaking: stops sending traffic to unhealthy instances.
- Observability: collects metrics, traces, and logs for every request.
The control plane (Istio, Linkerd) configures all the sidecar proxies centrally. Services communicate as if they are making normal network calls, unaware that the mesh is handling reliability and security.
API Composition
When a client needs data from multiple services (user profile + order history + recommendations), the gateway or BFF makes parallel calls to each service and combines the results:
- Receive client request.
- Fan out: call User Service, Order Service, and Recommendation Service in parallel.
- Wait for all responses (with timeouts).
- Merge responses into a single payload.
- Return to client.
If one downstream service is slow or fails, the composition layer can return a partial response with degraded functionality rather than failing entirely.
Complexity Analysis
Let $s$ be the number of downstream services called during API composition, $t_i$ be the response time of service $i$, and $n$ be the total number of microservices in the system.
| Operation | Latency | Notes |
|---|---|---|
| Sequential composition | $\sum_{i=1}^{s} t_i$ | Each call waits for the previous |
| Parallel composition | $\max(t_1, t_2, \ldots, t_s)$ | All calls execute concurrently |
| Gateway routing | $O(1)$ or $O(\log n)$ | Hash map or prefix-tree lookup |
| Service discovery lookup | $O(1)$ | Cached registry or DNS |
| Mesh sidecar overhead | $O(1)$ per hop | Typically 1-3ms added latency |
The overall availability of a composed request depends on the availability of every downstream service. For $s$ independent services each with availability $a$, the composed availability is $a^s$.
With 5 services each at 99.9% availability, the composed availability drops to $0.999^5 \approx 99.5\%$. This is why retries, circuit breakers, and graceful degradation are essential in microservice architectures.
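The arithmetic is easy to verify, and it also shows why a single retry helps so much. The retry model below assumes failures are independent across attempts, which real systems only approximate.

```python
def composed_availability(per_service: float, s: int) -> float:
    """Availability of a request that must succeed against all s independent services."""
    return per_service ** s

def with_one_retry(per_service: float, s: int) -> float:
    """Each call gets one retry: a service-level failure now requires two
    consecutive failures, raising per-call availability to 1 - (1 - a)^2."""
    return (1 - (1 - per_service) ** 2) ** s

print(round(composed_availability(0.999, 5), 5))  # ~0.99501: the 99.5% from the text
print(round(with_one_retry(0.999, 5), 6))         # a single retry recovers most of it
```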
Implementation
ALGORITHM APIGatewayRoute(request, routeTable)
INPUT: incoming HTTP request, route table mapping paths to services
OUTPUT: response from the target service
BEGIN
// Authenticate
token <- EXTRACT_AUTH_TOKEN(request)
IF NOT VALIDATE_TOKEN(token) THEN
RETURN 401 Unauthorized
END IF
// Rate limit
clientId <- EXTRACT_CLIENT_ID(token)
IF RATE_LIMIT_EXCEEDED(clientId) THEN
RETURN 429 Too Many Requests
END IF
// Route to service
targetService <- LOOKUP routeTable FOR request.path
IF targetService IS NIL THEN
RETURN 404 Not Found
END IF
// Forward request
response <- FORWARD(request, targetService)
RETURN response
END
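The APIGatewayRoute pseudocode above can be sketched in Python. This is an illustrative skeleton, not a production gateway: token validation, rate limiting, and forwarding are stubbed, and the rate limiter uses a fixed quota with no time window for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class Gateway:
    """Minimal sketch of APIGatewayRoute: authenticate, rate-limit, route, forward."""
    route_table: dict                              # path prefix -> service name
    valid_tokens: set = field(default_factory=set)
    quota: int = 100                               # max requests per token (no window)
    counts: dict = field(default_factory=dict)

    def handle(self, path: str, token: str) -> tuple:
        # Authenticate: reject before touching any internal service
        if token not in self.valid_tokens:
            return (401, "Unauthorized")
        # Rate limit per client token
        self.counts[token] = self.counts.get(token, 0) + 1
        if self.counts[token] > self.quota:
            return (429, "Too Many Requests")
        # Route: longest matching path prefix wins
        matches = [p for p in self.route_table if path.startswith(p)]
        if not matches:
            return (404, "Not Found")
        service = self.route_table[max(matches, key=len)]
        # Forward (stub): a real gateway would proxy the request here
        return (200, f"forwarded to {service}")
```

Usage: `Gateway(route_table={"/api/orders": "orders-svc"}, valid_tokens={"t1"}).handle("/api/orders/42", "t1")` returns `(200, "forwarded to orders-svc")`, while a bad token short-circuits to 401 without any routing work.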
ALGORITHM APIComposition(request, serviceCalls)
INPUT: client request, list of (serviceName, serviceRequest) pairs
OUTPUT: merged response
BEGIN
responses <- empty map
errors <- empty list
// Fan out: call all services in parallel
futures <- empty list
FOR EACH (name, serviceReq) IN serviceCalls DO
future <- ASYNC_CALL(name, serviceReq, timeout: 2 seconds)
APPEND (name, future) TO futures
END FOR
// Collect results
FOR EACH (name, future) IN futures DO
result <- AWAIT future
IF result IS success THEN
responses[name] <- result.body
ELSE
APPEND name TO errors
responses[name] <- default fallback for name
END IF
END FOR
merged <- MERGE_RESPONSES(responses)
IF LENGTH(errors) > 0 THEN
merged.degraded <- true
merged.failedServices <- errors
END IF
RETURN merged
END
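The APIComposition pseudocode maps naturally onto asyncio. The sketch below fans out to all services concurrently, bounds each call with a timeout, and substitutes a per-service fallback on failure so the client gets a degraded rather than failed response; the callable-per-service interface is an assumption for illustration.

```python
import asyncio

async def compose(service_calls: dict, timeout: float = 2.0) -> dict:
    """Fan out to all services in parallel; degrade gracefully on failure.
    service_calls maps a service name to an async callable returning its payload."""
    async def guarded(name, call):
        try:
            return name, await asyncio.wait_for(call(), timeout)
        except Exception:
            return name, None  # timeout or downstream error -> use fallback

    results = await asyncio.gather(*(guarded(n, c) for n, c in service_calls.items()))
    merged = {name: (body if body is not None else {"fallback": True})
              for name, body in results}
    failed = [name for name, body in results if body is None]
    if failed:
        merged["degraded"] = True
        merged["failedServices"] = failed
    return merged
```

Because `guarded` catches each failure individually, one slow or broken service marks only its own slot as a fallback; the other responses still arrive in full.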
ALGORITHM CircuitBreaker(service, request)
INPUT: target service, outgoing request
STATE: failCount, state (CLOSED/OPEN/HALF_OPEN), lastFailTime
CONSTANTS: THRESHOLD, TIMEOUT_DURATION
OUTPUT: response or fast failure
BEGIN
IF state = OPEN THEN
IF NOW() - lastFailTime > TIMEOUT_DURATION THEN
state <- HALF_OPEN
ELSE
RETURN error "circuit open, failing fast"
END IF
END IF
response <- CALL(service, request)
IF response IS failure THEN
failCount <- failCount + 1
lastFailTime <- NOW()
// A failure while HALF_OPEN re-opens the circuit immediately;
// failCount is still at threshold from before, so one check covers both cases
IF state = HALF_OPEN OR failCount >= THRESHOLD THEN
state <- OPEN
END IF
RETURN error "service call failed"
ELSE
IF state = HALF_OPEN THEN
state <- CLOSED
END IF
failCount <- 0
RETURN response
END IF
END
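The state machine above translates to a small class. This is a single-threaded sketch (no locking) with an injectable clock so the OPEN-to-HALF_OPEN transition can be tested without sleeping; a production breaker would also need thread safety and per-endpoint state.

```python
import time

class CircuitBreaker:
    """CLOSED -> OPEN after `threshold` consecutive failures; OPEN -> HALF_OPEN
    after `timeout_duration`; HALF_OPEN closes on success, re-opens on failure."""
    def __init__(self, threshold=3, timeout_duration=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.timeout_duration = timeout_duration
        self.clock = clock          # injectable for testing
        self.state = "CLOSED"
        self.fail_count = 0
        self.last_fail_time = 0.0

    def call(self, fn):
        if self.state == "OPEN":
            if self.clock() - self.last_fail_time > self.timeout_duration:
                self.state = "HALF_OPEN"     # allow one trial request through
            else:
                raise RuntimeError("circuit open, failing fast")
        try:
            result = fn()
        except Exception:
            self.fail_count += 1
            self.last_fail_time = self.clock()
            if self.state == "HALF_OPEN" or self.fail_count >= self.threshold:
                self.state = "OPEN"
            raise
        if self.state == "HALF_OPEN":
            self.state = "CLOSED"            # trial succeeded: resume normal traffic
        self.fail_count = 0
        return result
```

Wrapping every downstream call in `breaker.call(...)` means that once the target is known to be down, callers fail in microseconds instead of waiting out a full connection timeout per request.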
ALGORITHM ServiceDiscovery(registry, serviceName)
INPUT: service registry, name of the target service
OUTPUT: list of healthy instance addresses
BEGIN
instances <- registry.LOOKUP(serviceName)
healthyInstances <- empty list
FOR EACH instance IN instances DO
IF instance.status = HEALTHY THEN
APPEND instance.address TO healthyInstances
END IF
END FOR
IF LENGTH(healthyInstances) = 0 THEN
RETURN error "no healthy instances for " + serviceName
END IF
RETURN healthyInstances
END
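The ServiceDiscovery lookup plus a simple client-side load balancer fit in a few lines. The registry shape (service name mapped to address/status pairs) is an assumption for illustration; real registries like Consul expose this through an HTTP or DNS API.

```python
import itertools

def healthy_addresses(registry: dict, service_name: str) -> list:
    """Mirror of the ServiceDiscovery pseudocode: filter the registry entry
    for service_name down to addresses of HEALTHY instances."""
    instances = registry.get(service_name, [])
    healthy = [addr for addr, status in instances if status == "HEALTHY"]
    if not healthy:
        raise LookupError(f"no healthy instances for {service_name}")
    return healthy

def round_robin(addresses: list):
    """Simple client-side load balancer over the discovered addresses."""
    return itertools.cycle(addresses)
```

A caller would refresh `healthy_addresses` periodically (or subscribe to registry changes) and pull the next target from `round_robin` for each outgoing request.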
Real-World Applications
- E-commerce platforms: companies decompose monolithic stores into services for catalog, cart, checkout, payments, and shipping, each scaling independently during traffic spikes like sales events
- Streaming services: video platforms use microservices for user profiles, content metadata, recommendation engines, encoding pipelines, and playback, with an API gateway handling millions of concurrent client connections
- Banking and fintech: financial institutions use microservices with strict bounded contexts for accounts, transactions, fraud detection, and compliance, with service meshes providing mutual TLS and audit logging
- Mobile backends: apps with web and mobile clients use the BFF pattern to serve optimized payloads to each platform, reducing bandwidth usage on mobile while providing rich data to web dashboards
- SaaS platforms: multi-tenant software uses API gateways for tenant isolation, authentication, and usage-based rate limiting, routing requests to shared or dedicated service instances based on the tenant's plan
Key Takeaways
- Microservices trade the simplicity of a monolith for independent deployability, technology diversity, and granular scaling, at the cost of distributed system complexity
- API gateways centralize cross-cutting concerns (auth, rate limiting, routing) at the edge, preventing duplication across services and simplifying client integration
- The BFF pattern gives each client type a dedicated backend that aggregates and shapes data from downstream services, avoiding one-size-fits-all API responses
- Service meshes (Envoy, Linkerd, Istio) handle service-to-service reliability (retries, circuit breaking, mTLS) transparently via sidecar proxies, keeping business logic clean
- Composed availability drops multiplicatively: with $s$ services at 99.9% each, overall availability is $0.999^s$, making retries, timeouts, and graceful degradation essential