Building Resilient Microservices: Circuit Breakers and Rate Limiting Patterns

By Jin Larsen
Architecture & Patterns · microservices · circuit-breaker · rate-limiting · system-design · resilience

Distributed systems fail in spectacular ways. A single slow database query can cascade into a full-blown outage across ten services. This post covers two defensive patterns that keep microservices upright when dependencies misbehave: circuit breakers (which stop calling failing services) and rate limiting (which prevents overload). You'll learn how each works, when to apply them, and what tools exist for Node.js, Python, and Go environments.

What Is a Circuit Breaker in Microservices?

A circuit breaker is a proxy that sits between your service and a dependency it calls. It monitors for failures. When errors exceed a threshold, the breaker trips. Requests stop flowing to the broken dependency. Your service returns a fallback response instead.

The pattern comes from electrical engineering. Too much current? The breaker cuts power before the house burns down. In software, too many 500 errors or timeouts trigger the same protective shutdown.

There are three states. Closed means everything's normal—requests pass through. Open means failure threshold hit—requests fail fast. Half-open lets a trickle of requests test the waters. If they succeed, the breaker closes again. If not? Back to open.
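That state machine fits in a few lines of Python. This is a minimal sketch for illustration, not any particular library's API; the class name, thresholds, and exception choices are all assumptions:

```python
import time

class CircuitBreaker:
    """Minimal three-state breaker: closed -> open -> half-open (sketch)."""

    def __init__(self, fail_max=5, reset_timeout=30.0):
        self.fail_max = fail_max              # consecutive failures before tripping
        self.reset_timeout = reset_timeout    # seconds to stay open before a trial
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.state = "half-open"          # let one trial request through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.fail_max:
                self.state = "open"           # trial failed or threshold hit
                self.opened_at = time.monotonic()
            raise
        else:
            self.failures = 0
            self.state = "closed"             # success closes the breaker
            return result
```

Production libraries add rolling failure-rate windows, per-breaker metrics, and thread safety on top of this core loop.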

Here's the thing: without this pattern, your service keeps hammering a dying dependency. Thread pools fill up. Memory spikes. Eventually, your healthy service collapses—what engineers call a cascading failure. The circuit breaker contains the blast radius.

Netflix popularized this approach with Hystrix (now in maintenance mode). Today, Resilience4j serves Java shops well. Node.js developers often reach for opossum. Python has pybreaker. Each implements the same core concept with language-specific ergonomics.

How Does Rate Limiting Protect Your APIs?

Rate limiting caps how many requests a client can make in a time window. Exceed the limit? The API returns 429 Too Many Requests. This prevents any single user (or rogue service) from monopolizing resources.

The math matters. A service handling 1000 req/s might crumble at 10,000. Rate limits act as pressure valves. They buy time for autoscaling to kick in—or reject traffic gracefully rather than falling over completely.

Common algorithms include:

  • Token bucket: Credits accumulate over time. Each request spends a credit. Simple, allows small bursts.
  • Fixed window: Count requests per calendar minute/hour. Easy to implement but vulnerable to traffic spikes at window boundaries.
  • Sliding window log: Tracks each request timestamp. Precise but memory-heavy.
  • Sliding window counter: Hybrid approach—approximates the sliding window without storing every timestamp.

Worth noting: rate limiting and circuit breaking solve different problems. Rate limiting prevents overload. Circuit breakers handle failures. You'll want both.

Implementation varies by infrastructure. AWS API Gateway has built-in throttling. NGINX handles it at the edge. Application-level limits (using Redis or in-memory stores) offer more flexibility but add latency.

When Should You Implement Both Patterns Together?

Always. That's the short answer. Any service calling external dependencies needs circuit breakers. Any API facing external clients needs rate limiting. The combination creates defense in depth.

Consider this scenario. Your e-commerce service calls Stripe for payments and SendGrid for emails. Without circuit breakers, a Stripe outage means your checkout threads hang indefinitely. Timeouts help—but slow responses still exhaust connection pools.

Now add rate limiting. A marketing campaign drives 10x traffic to your signup endpoint. Rate limiting caps signups at a sustainable pace. Circuit breakers protect the downstream email service if SendGrid starts erroring under the load.

At a glance:

  • Circuit Breaker: protects against downstream failures and cascading outages; typical placement is client-side (the service calling the dependency); tools to consider include Resilience4j (Java), opossum (Node.js), pybreaker (Python), and gobreaker (Go).
  • Rate Limiting: protects against traffic spikes and resource exhaustion; typical placement is the edge/API gateway or server-side; tools to consider include Redis + Lua, AWS API Gateway, NGINX limit_req, and Envoy proxy.

The catch? Both patterns add complexity. Circuit breakers need tuning—too sensitive and you trip unnecessarily, too lenient and you don't protect anything. Rate limits need baselining. Set them too low and you throttle legitimate users. Too high and they're useless.

How Do You Configure These Patterns Effectively?

Start with failure thresholds that match reality. If your dependency normally has 99.9% uptime, a circuit breaker set to trip at 3 failures in 10 seconds will trigger constantly. That's not resilience—it's self-sabotage.

Better approach: set thresholds based on your SLOs. If the upstream promises 99.5% success, allow for that 0.5% noise. A common starting point is 50% failure rate over 30 seconds. Adjust based on observed behavior.

Fallback strategies matter too. When the breaker opens, what happens? Options include:

  1. Fail fast: Return an error immediately. Clean, but user-facing.
  2. Cache response: Serve stale data. Good for read-heavy operations.
  3. Degraded mode: Disable non-critical features. The checkout works, but recommendations disappear.
  4. Queue for later: Accept the request, process asynchronously when dependency recovers.
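Option 2 is often the least invasive. A hedged sketch, where the cache dict and `fetch_recommendations` are illustrative placeholders (the raised `TimeoutError` stands in for a tripped breaker):

```python
cache = {}

def fetch_recommendations(user_id):
    # Placeholder for a real upstream call guarded by a circuit breaker.
    raise TimeoutError("upstream down")

def recommendations_with_fallback(user_id):
    try:
        fresh = fetch_recommendations(user_id)
        cache[user_id] = fresh        # refresh the cache on every success
        return fresh
    except Exception:
        # Breaker open or call failed: serve the last known good value,
        # or degrade to an empty list (option 3) if nothing is cached.
        return cache.get(user_id, [])
```

Note how the fallback chain degrades twice: stale data first, then an empty (but valid) response. The user never sees a raw 500.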

For rate limits, use sliding windows in production. Fixed windows create thundering herd problems—everyone waits for the reset, then floods the API simultaneously. The AWS API Gateway uses token bucket by default. It's a solid choice.
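The sliding window counter from the algorithm list approximates a true sliding window by weighting the previous fixed window's count. A minimal sketch, which for brevity assumes traffic gaps shorter than one window:

```python
import time

class SlidingWindowCounter:
    """Approximate sliding window: weight the previous window's count
    by how much of it still overlaps the current sliding window."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.current_start = 0.0
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if now - self.current_start >= self.window:
            # Roll the window forward (assumes gaps < one window).
            self.previous_count = self.current_count
            self.current_count = 0
            self.current_start = now
        overlap = 1.0 - (now - self.current_start) / self.window
        estimated = self.previous_count * overlap + self.current_count
        if estimated < self.limit:
            self.current_count += 1
            return True
        return False
```

Memory cost is two counters per key, versus one timestamp per request for the sliding window log.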

Observability: You Can't Tune What You Can't See

Both patterns generate events worth tracking. Circuit breaker state transitions (closed → open → half-open) should log and emit metrics. Rate limit hits need visibility—are you throttling real users or blocking abuse?

"The goal isn't to prevent all failures. It's to fail gracefully, with predictable behavior, and recover automatically."

Dashboards help. Prometheus + Grafana can track breaker states. Rate limit metrics expose which clients hit limits most. Don't implement these patterns blindly—observe, tune, repeat.

Language-Specific Implementation Notes

In Go, the sony/gobreaker package offers a clean API. It handles the state machine internally. You provide a function; it wraps it. Configurable thresholds, custom fallbacks.

Node.js developers using Express find opossum integrates cleanly with async/await. It supports events—you can listen for 'open', 'halfOpen', and 'close' to trigger alerts or logging.

Python's ecosystem is more fragmented. pybreaker works for basic needs. For async code (asyncio), you'll need wrappers. Some teams implement custom solutions using Redis for distributed state—necessary when multiple instances need coordinated circuit breaking.

That said, don't over-engineer early. Start with in-memory breakers. Add distributed coordination only when you have multiple instances and need consistent behavior across them.

Common Mistakes That Undermine Resilience

Treating circuit breakers as pure failure detectors misses the point. They're about graceful degradation. If opening the breaker just throws 500s, you haven't solved the user experience problem—you've just failed faster.

Another trap: setting identical thresholds for all dependencies. Your payment provider deserves stricter monitoring than your analytics pipeline. One blocks revenue. The other... doesn't.

Rate limiting mistakes usually involve poor key selection. Limiting by IP address blocks legitimate users behind NAT (corporate networks, mobile carriers). Limiting by user ID misses unauthenticated attacks. Consider tiered approaches—stricter for anonymous, looser for authenticated, different tiers for paid vs. free.
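A tiered key selector can be as simple as the sketch below. The request shape and the per-minute limits are illustrative assumptions, not from any framework:

```python
# Requests per minute for each tier; numbers are placeholders.
LIMITS = {"anonymous": 10, "free": 60, "paid": 600}

def rate_limit_key(request):
    """Return (bucket key, per-minute limit) for a request dict."""
    user = request.get("user_id")
    if user is None:
        # Unauthenticated: fall back to IP, with the strictest tier.
        return f"ip:{request['ip']}", LIMITS["anonymous"]
    tier = request.get("tier", "free")
    return f"user:{user}", LIMITS.get(tier, LIMITS["free"])
```

The returned key feeds whatever limiter you use (a token bucket, a Redis counter); the point is that key and limit are decided together, per request.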

Also—don't forget client-side rate limiting. Your service calling Stripe shouldn't blindly retry on 429s. Exponential backoff with jitter prevents thundering herds. The Stripe SDK handles this automatically. If you're building clients to internal services, implement similar courtesy.
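Backoff with "full jitter" looks like this in Python. A sketch with made-up defaults; tune `max_attempts`, `base`, and `cap` to your dependency's SLOs:

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base=0.5, cap=30.0):
    """Retry fn on failure, sleeping a random 'full jitter' delay:
    uniform(0, min(cap, base * 2**attempt)) between attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise                 # out of retries: surface the error
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            time.sleep(delay)
```

The jitter is what prevents the thundering herd: without it, every client that saw the same 429 retries at the same instant.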

Microservices don't fail occasionally. They fail constantly, in ways that compound. Circuit breakers and rate limiting don't eliminate these failures—they contain them, buying time and preserving user experience while systems heal. Start with one service, prove the patterns, then expand. The alternative—learning about cascading failures at 3 AM—is a lesson nobody wants.