# AIMD Rate Limiting for API Clients
TCP congestion control applied to API rate limiting — additive increase on success, multiplicative decrease on errors. A control loop that finds the speed limit without being told.
## The Problem
You’re hitting an API with unknown rate limits. Too fast and you get 429s or bans. Too slow and you waste time. The rate limit isn’t documented, varies by endpoint, and changes under load.
Fixed delays are wrong in both directions — too conservative when the API is healthy, too aggressive when it’s stressed.
## The Idea: Borrow from TCP
TCP solved this in 1988. The network doesn’t tell you its capacity — you probe for it. AIMD (Additive Increase / Multiplicative Decrease) is the control loop:
- Additive increase: when things are going well, speed up by a small fixed step
- Multiplicative decrease: when you hit an error, slow down by a multiplier
The asymmetry is the key insight. You probe upward cautiously (linear) but retreat quickly (exponential). This converges to the maximum safe rate without oscillating wildly.
## The Control Loop
```
every check_period (e.g., 5 seconds):
    err_pct = recent_errors / recent_total

    if err_pct > target_err_pct:            # too fast
        interval = interval * backoff_mul   # multiplicative decrease
    elif err_pct < recover_pct:             # headroom available
        interval = interval - speedup_step  # additive increase (shorter = faster)

    interval = clamp(interval, min_interval, max_interval)
```

That's it. The interval between requests goes up (slower) when errors spike, and ticks down (faster) when the path is clear.
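The pseudocode above can be sketched as a small Go type; the field names mirror the parameters, and the surrounding machinery (error counting, the request loop) is assumed:

```go
package main

import "fmt"

// aimdController adjusts the inter-request interval (in ms) based on the
// error rate observed over the last check period.
type aimdController struct {
	interval     float64 // current gap between requests, ms
	targetErrPct float64 // above this error rate: slow down
	recoverPct   float64 // below this error rate: speed up
	backoffMul   float64 // multiplicative decrease factor
	speedupStep  float64 // additive increase step, ms
	minInterval  float64 // speed-up floor, ms
	maxInterval  float64 // slow-down ceiling, ms
}

// tick runs once per check_period with the observed error fraction.
func (c *aimdController) tick(errPct float64) {
	switch {
	case errPct > c.targetErrPct:
		// Too fast: multiplicative decrease.
		c.interval *= c.backoffMul
	case errPct < c.recoverPct:
		// Headroom: additive increase (shorter interval = faster).
		c.interval -= c.speedupStep
	default:
		// Dead zone between recover_pct and target_err_pct: hold steady.
	}
	if c.interval < c.minInterval {
		c.interval = c.minInterval
	}
	if c.interval > c.maxInterval {
		c.interval = c.maxInterval
	}
}

func main() {
	c := &aimdController{interval: 3000, targetErrPct: 0.10, recoverPct: 0.05,
		backoffMul: 1.5, speedupStep: 200, minInterval: 1500, maxInterval: 6000}
	c.tick(0.20) // errors spiked: 3000 -> 4500
	c.tick(0.01) // clear path:    4500 -> 4300
	c.tick(0.07) // dead zone:     hold at 4300
	fmt.Println(c.interval) // → 4300
}
```

Note the asymmetry in the two branches: the decrease multiplies, the increase subtracts a fixed step.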
## Parameters
| Parameter | What it does | Example |
|---|---|---|
| `check_period` | How often the controller evaluates | 5s |
| `target_err_pct` | Error rate that triggers slowdown | 10% |
| `recover_pct` | Error rate below which speedup begins | 5% |
| `backoff_mul` | How much to slow down (multiplicative) | 1.5x |
| `speedup_step` | How much to speed up (additive) | 200ms |
| `seed_interval` | Starting point | 3000ms |
| `min_interval` | Speed-up floor | 1500ms |
| `max_interval` | Slow-down ceiling | 6000ms |
The dead zone between recover_pct and target_err_pct is intentional — if error rate is between 5% and 10%, hold steady. This prevents jitter.
## In Practice
A real implementation from an API client managing a pool of upstream connections:
```go
// Per-connection health tracking
const (
	speedUpAfter   = 20   // consecutive successes to try faster
	slowDownFactor = 0.80 // interval *= 1/0.80 = 1.25x slower
	speedUpFactor  = 1.05 // interval *= 1/1.05 = ~5% faster
	minInterval    = 1500 * time.Millisecond
	maxInterval    = 6 * time.Second
)

func updateHealth(node *connNode, success bool, err string) {
	if success {
		node.consecSuccesses++
		if node.consecSuccesses >= speedUpAfter {
			cur := node.limiter.Interval()
			next := time.Duration(float64(cur) / speedUpFactor)
			next = max(next, minInterval)
			node.limiter.SetInterval(next)
			node.consecSuccesses = 0
		}
	} else if err == "rate_limited" {
		cur := node.limiter.Interval()
		next := time.Duration(float64(cur) / slowDownFactor)
		next = min(next, maxInterval)
		node.limiter.SetInterval(next)
	}
}
```

Each connection in the pool has its own rate limiter and health state. Twenty consecutive successes trigger a ~5% speedup; a rate-limit error triggers a 25% slowdown. The interval is clamped between 1.5s and 6s. (Strictly speaking, this variant speeds up multiplicatively rather than additively, but it keeps the asymmetry that matters: small cautious steps up, a large step down.)
The result: the system finds the maximum safe rate per connection and holds there, automatically adapting if conditions change.
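To see that convergence concretely, here is a toy simulation. The 2000 ms "true limit" and the deterministic failure rule (any request sent faster than the limit gets rate-limited) are assumptions for illustration; real servers are noisier:

```go
package main

import "fmt"

// simulate runs the per-connection loop against a toy server whose true
// minimum safe spacing is trueLimit ms. A request is rate-limited whenever
// we send faster than that; otherwise it succeeds.
func simulate(trueLimit float64, steps int) float64 {
	interval := 6000.0 // start at the slow ceiling
	for i := 0; i < steps; i++ {
		if interval >= trueLimit {
			interval /= 1.05 // success: ~5% faster
			if interval < 1500 {
				interval = 1500
			}
		} else {
			interval /= 0.80 // rate limited: 25% slower
			if interval > 6000 {
				interval = 6000
			}
		}
	}
	return interval
}

func main() {
	final := simulate(2000, 200)
	fmt.Printf("settled near %.0f ms (true limit 2000 ms)\n", final)
}
```

With these gains the interval descends from 6000 ms, overshoots the limit slightly, gets knocked back 25%, and then oscillates in a narrow band just above 1900 ms: it hugs the true limit without knowing it.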
## Why Not Just Use a Fixed Rate?
- Different endpoints have different throughput limits
- The server’s rate limit may change under load
- After an outage, you want to ramp back up gradually — not hammer at full speed
- New connections need to discover their limits
AIMD handles all of these automatically. The physicist’s version: it’s a first-order feedback loop with asymmetric gain. Fast negative feedback prevents damage; slow positive feedback finds the equilibrium.
## Persistence Across Runs
The interval converges over the first few hundred queries. If you restart, you lose that convergence and burst through rate limits while re-learning.
Solution: persist the current interval to disk between runs.
```go
// Save at end of run
os.WriteFile("rate_interval_state", []byte(strconv.Itoa(ms)), 0644)
```

```go
// Restore at start
data, _ := os.ReadFile("rate_interval_state")
ms, _ := strconv.Atoi(string(data))
```

This is the warm-start trick — the next run begins where the last one left off instead of probing from scratch.