Leaky Bucket and Token Bucket are two rate-limiting algorithms used in network systems, APIs, and distributed applications. While both control request rates, they differ in how they handle traffic bursts and regulate flow.

Leaky Bucket

Leaky Bucket models a bucket that leaks at a constant rate. Incoming requests (or packets) fill the bucket, but they are processed (leaked) at a fixed rate. If the bucket is full, excess requests are discarded, preventing system overload.

  • Strict output rate: Requests are processed at a constant speed.
  • No accumulation: If the bucket is empty, new requests must wait.
  • Excess requests are dropped: If too many arrive too quickly, they are lost.

Python Implementation of Leaky Bucket

import time
from collections import deque
 
class LeakyBucket:
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity  # Maximum bucket size
        self.leak_rate = leak_rate  # Requests processed per second
        self.queue = deque()
        self.last_checked = time.time()
 
    def _leak(self):
        now = time.time()
        elapsed_time = now - self.last_checked
        leak_amount = int(elapsed_time * self.leak_rate)
        self.last_checked = now
 
        for _ in range(leak_amount):
            if self.queue:
                self.queue.popleft()
 
    def add_request(self, request_id):
        self._leak()
        if len(self.queue) < self.capacity:
            self.queue.append(request_id)
            print(f"Request {request_id} accepted")
        else:
            print(f"Request {request_id} rejected - bucket full")
 
# Example usage
bucket = LeakyBucket(capacity=5, leak_rate=1)
for i in range(7):
    bucket.add_request(i)
    time.sleep(0.5)

Token Bucket

Token Bucket refills a bucket with tokens at a fixed rate. Each request consumes a token, and requests are only processed if tokens are available.

  • Allows bursts: If the bucket has accumulated tokens, multiple requests can be processed instantly.
  • Requests must wait: If the bucket is empty, new requests must wait for tokens to refill.
  • Unused tokens accumulate: Unlike Leaky Bucket, which strictly enforces output rate, Token Bucket allows temporary bursts.

Python Implementation of Token Bucket

import time
 
class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity  # Maximum bucket size
        self.refill_rate = refill_rate  # Tokens added per second
        self.tokens = capacity  # Start full
        self.last_checked = time.time()
 
    def _refill(self):
        now = time.time()
        elapsed_time = now - self.last_checked
        self.tokens = min(self.capacity, self.tokens + elapsed_time * self.refill_rate)
        self.last_checked = now
 
    def allow_request(self):
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            print("Request allowed")
            return True
        else:
            print("Request denied - not enough tokens")
            return False
 
# Example usage
bucket = TokenBucket(capacity=5, refill_rate=2)
for _ in range(7):
    bucket.allow_request()
    time.sleep(0.5)

Key Differences Between Leaky Bucket and Token Bucket

FeatureLeaky BucketToken Bucket
Rate EnforcementStrict, fixed output rateAllows bursts up to bucket size
Request HandlingExcess requests are droppedRequests must wait for tokens
Behavior Under Low TrafficNo accumulation of capacityAccumulates unused tokens
Use CaseNetwork traffic shaping (QoS)API rate limiting with bursts

Real-World Applications

Both algorithms are widely used in network and distributed systems:

  • Leaky Bucket is preferred for traffic shaping, ensuring a smooth and predictable flow (e.g., regulating network packet transmission).
  • Token Bucket is commonly used for API rate limiting, where occasional bursts of traffic should be allowed while maintaining a long-term average rate.

By ensuring a controlled flow, these algorithms maintain system stability, preventing overload while optimizing request processing.