Kind: note
Maturity: budding
Confidence: medium
Origin: ai-assisted
Created:
Tags: networking, rate-limiting, algorithms
Related: Markdown/note/aimd-rate-limiting.md

# AIMD Rate Limiting for API Clients

TCP congestion control applied to API rate limiting — additive increase on success, multiplicative decrease on errors. A control loop that finds the speed limit without being told.

## The Problem

You’re hitting an API with unknown rate limits. Too fast and you get 429s or bans. Too slow and you waste time. The rate limit isn’t documented, varies by endpoint, and changes under load.

Fixed delays are wrong in both directions — too conservative when the API is healthy, too aggressive when it’s stressed.

## The Idea: Borrow from TCP

TCP solved this in 1988. The network doesn’t tell you its capacity — you probe for it. AIMD (Additive Increase / Multiplicative Decrease) is the control loop:

- Additive increase: when things are going well, speed up by a small fixed step
- Multiplicative decrease: when you hit an error, slow down by a multiplier

The asymmetry is the key insight. You probe upward cautiously (linear) but retreat quickly (exponential). This converges to the maximum safe rate without oscillating wildly.

## The Control Loop

```
every check_period (e.g., 5 seconds):
    err_pct = recent_errors / recent_total
    if err_pct > target_err_pct:               # too fast
        interval = interval * backoff_mul      # multiplicative decrease
    elif err_pct < recover_pct:                # headroom available
        interval = interval - speedup_step     # additive increase (shorter = faster)
    interval = clamp(interval, min_interval, max_interval)
```

That’s it. The interval between requests goes up (slower) when errors spike, and ticks down (faster) when the path is clear.

## Parameters

| Parameter | What it does | Example |
|---|---|---|
| `check_period` | How often the controller evaluates | 5s |
| `target_err_pct` | Error rate that triggers slowdown | 10% |
| `recover_pct` | Error rate below which speedup begins | 5% |
| `backoff_mul` | How much to slow down (multiplicative) | 1.5x |
| `speedup_step` | How much to speed up (additive) | 200ms |
| `seed_interval` | Starting point | 3000ms |
| `min_interval` | Speed-up floor | 1500ms |
| `max_interval` | Slow-down ceiling | 6000ms |

The dead zone between recover_pct and target_err_pct is intentional — if error rate is between 5% and 10%, hold steady. This prevents jitter.

## In Practice

A real implementation from an API client managing a pool of upstream connections:

```go
// Per-connection health tracking
const (
	speedUpAfter   = 20   // consecutive successes to try faster
	slowDownFactor = 0.80 // interval *= 1/0.80 = 1.25x slower
	speedUpFactor  = 1.05 // interval *= 1/1.05 = ~5% faster
	minInterval    = 1500 * time.Millisecond
	maxInterval    = 6 * time.Second
)

func updateHealth(node *connNode, success bool, err string) {
	if success {
		node.consecSuccesses++
		if node.consecSuccesses >= speedUpAfter {
			cur := node.limiter.Interval()
			next := time.Duration(float64(cur) / speedUpFactor)
			next = max(next, minInterval)
			node.limiter.SetInterval(next)
			node.consecSuccesses = 0
		}
	} else if err == "rate_limited" {
		cur := node.limiter.Interval()
		next := time.Duration(float64(cur) / slowDownFactor)
		next = min(next, maxInterval)
		node.limiter.SetInterval(next)
	}
}
```

Each connection in the pool has its own rate limiter and health state. Twenty consecutive successes trigger a ~5% speedup; a rate-limit error triggers a 25% slowdown. The interval is clamped between 1.5s and 6s.

The result: the system finds the maximum safe rate per connection and holds there, automatically adapting if conditions change.

## Why Not Just Use a Fixed Rate?

- Different endpoints have different throughput limits
- The server’s rate limit may change under load
- After an outage, you want to ramp back up gradually — not hammer at full speed
- New connections need to discover their limits

AIMD handles all of these automatically. The physicist’s version: it’s a first-order feedback loop with asymmetric gain. Fast negative feedback prevents damage; slow positive feedback finds the equilibrium.

## Persistence Across Runs

The interval converges over the first few hundred queries. If you restart, you lose that convergence and burst through rate limits while re-learning.

Solution: persist the current interval to disk between runs.

```go
// Save at end of run (ms = current interval in milliseconds)
os.WriteFile("rate_interval_state", []byte(strconv.Itoa(ms)), 0644)

// Restore at start; fall back to seed_interval on a missing or corrupt file
if data, err := os.ReadFile("rate_interval_state"); err == nil {
	if ms, err := strconv.Atoi(strings.TrimSpace(string(data))); err == nil && ms > 0 {
		interval = time.Duration(ms) * time.Millisecond
	}
}
```

This is the warm-start trick — the next run begins where the last one left off instead of probing from scratch.