Concept

Five Rate Limiting Algorithms

Why Rate Limiting is Essential

  • Resource protection: Prevents any single client from monopolizing compute resources.
  • Abuse and security: The primary defense against brute-force login attacks, credential stuffing, and content scraping.
  • Cost control: For services that call expensive downstream APIs, rate limiting prevents runaway spend.
  • Fair usage: In multi-tenant systems, ensures one large customer cannot degrade service for everyone else.

The Five Algorithms (state almost always stored in Redis)

1. Fixed Window Counter — Define a time window (e.g., one minute). Increment a counter per user per window. Reject requests over the limit. Reset at the window boundary.
Problem: Allows edge bursts — a user can make the full limit at 11:59:59 and again at 12:00:01, sending double the allowed volume in two seconds.

2. Sliding Window Log — Store a timestamped log of every request per user. On each request, remove entries older than the window and check the log size against the limit.
Pros: Perfectly accurate. Cons: Stores one entry per request — high memory usage at scale.

3. Sliding Window Counter — A practical compromise. Estimate the current rate using a weighted blend of the previous window's count and the current window's count. Memory-efficient; good enough accuracy for most systems.

4. Token Bucket — A bucket of fixed capacity fills with tokens at a steady rate. Each request consumes one token. If the bucket is empty, the request is rejected. Allows controlled bursts (up to bucket capacity) while enforcing a long-term average rate.
Pros: Memory-efficient (one counter and timestamp per user); handles bursts naturally. The most widely used algorithm for public APIs.

5. Leaky Bucket — Requests enter a queue (the bucket). The queue drains at a fixed, constant rate. If the queue is full, new requests are rejected.
Key difference from Token Bucket: Leaky Bucket enforces a perfectly steady outflow. Use Leaky Bucket when you need to protect a downstream system from any burst, not just sustained overload.