---
title: "AIMD Rate Limiting for API Clients"
description: "TCP congestion control applied to API rate limiting — additive increase on success, multiplicative decrease on errors. Finds the limit automatically."
kind: note
maturity: budding
confidence: medium
origin: ai-assisted
author: "Agent"
directedBy: "krow"
tags: [networking, rate-limiting, algorithms]
published: 2026-03-28
modified: 2026-04-21
wordCount: 700
readingTime: 4
related: [go-dns-scanner-4000qps]
url: https://krowdev.com/note/aimd-rate-limiting/
---
## Agent Context
- Canonical: https://krowdev.com/note/aimd-rate-limiting/
- Markdown: https://krowdev.com/note/aimd-rate-limiting.md
- Full corpus: https://krowdev.com/llms-full.txt
- Kind: note
- Maturity: budding
- Confidence: medium
- Origin: ai-assisted
- Author: Agent
- Directed by: krow
- Published: 2026-03-28
- Modified: 2026-04-21
- Words: 700 (4 min read)
- Tags: networking, rate-limiting, algorithms
- Related: go-dns-scanner-4000qps
- Content map:
  - h2: The Problem
  - h2: The Idea: Borrow from TCP
  - h2: The Control Loop
  - h2: Parameters
  - h2: In Practice
  - h2: Why Not Just Use a Fixed Rate?
  - h2: Persistence Across Runs
  - h2: Sources
- Crawl policy: same canonical content is exposed through HTML, Markdown, and llms-full; no crawler-specific content gate.
## The Problem
You're hitting an API with unknown rate limits. Too fast and you get [429s](/snippet/http-status-codes/) or bans. Too slow and you waste time. The rate limit isn't documented, varies by endpoint, and changes under load.
Fixed delays are wrong in both directions — too conservative when the API is healthy, too aggressive when it's stressed.
## The Idea: Borrow from TCP
TCP solved this in 1988. The network doesn't tell you its capacity — you probe for it. AIMD (Additive Increase / Multiplicative Decrease) is the control loop:
- **Additive increase**: when things are going well, speed up by a small fixed step
- **Multiplicative decrease**: when you hit an error, slow down by a multiplier
The asymmetry is the key insight. You probe upward cautiously (linear) but retreat quickly (exponential). This converges to the maximum safe rate without oscillating wildly.
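Concretely, with the example values from the parameters table below: a 3000 ms interval ticks down through 2800, 2600, 2400 ms as clean windows pass, then a single bad window multiplies it back up to 3600 ms. One retreat undoes several steps of probing, so the loop spends most of its time just under the limit rather than camped above it.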
## The Control Loop
```
every check_period (e.g., 5 seconds):
    err_pct = recent_errors / recent_total
    if err_pct > target_err_pct:            # too fast
        interval = interval * backoff_mul   # multiplicative decrease
    elif err_pct < recover_pct:             # headroom available
        interval = interval - speedup_step  # additive increase (shorter = faster)
    interval = clamp(interval, min_interval, max_interval)
```
That's it. The interval between requests goes up (slower) when errors spike, and ticks down (faster) when the path is clear.
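In Go, the same loop might look like this. A minimal sketch using the example values from the parameters table below; the type and field names are illustrative, not from a particular library:

```go
import "time"

// aimdController holds the state the loop above needs.
type aimdController struct {
	interval    time.Duration // current gap between requests
	errs, total int           // outcomes observed in the current window
}

const (
	targetErrPct = 0.10                    // error rate that triggers slowdown
	recoverPct   = 0.05                    // error rate below which speedup begins
	backoffMul   = 1.5                     // multiplicative decrease
	speedupStep  = 200 * time.Millisecond  // additive increase
	minInterval  = 1500 * time.Millisecond // speed-up floor
	maxInterval  = 6 * time.Second         // slow-down ceiling
)

// tick runs once per check_period (e.g. driven by a 5s time.Ticker).
func (c *aimdController) tick() {
	if c.total == 0 {
		return // no traffic this window, nothing to learn
	}
	errPct := float64(c.errs) / float64(c.total)
	switch {
	case errPct > targetErrPct: // too fast: retreat exponentially
		c.interval = time.Duration(float64(c.interval) * backoffMul)
	case errPct < recoverPct: // headroom: probe linearly
		c.interval -= speedupStep
	}
	// Between the two thresholds is the dead zone: hold steady.
	c.interval = min(max(c.interval, minInterval), maxInterval) // Go 1.21 builtins
	c.errs, c.total = 0, 0 // start a fresh window
}
```

Request code increments errs and total as responses come back, and a time.Ticker fires tick every check_period.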
## Parameters
| Parameter | What it does | Example |
|-----------|-------------|---------|
| `check_period` | How often the controller evaluates | 5s |
| `target_err_pct` | Error rate that triggers slowdown | 10% |
| `recover_pct` | Error rate below which speedup begins | 5% |
| `backoff_mul` | How much to slow down (multiplicative) | 1.5x |
| `speedup_step` | How much to speed up (additive) | 200ms |
| `seed_interval` | Starting point | 3000ms |
| `min_interval` | Speed-up floor | 1500ms |
| `max_interval` | Slow-down ceiling | 6000ms |
The dead zone between `recover_pct` and `target_err_pct` is intentional — if error rate is between 5% and 10%, hold steady. This prevents jitter.
## In Practice
A real implementation from the [Go DNS scanner](/article/go-dns-scanner-4000qps/) managing a pool of upstream connections:
```go
// Per-connection health tracking
const (
	speedUpAfter   = 20   // consecutive successes needed to try a faster rate
	slowDownFactor = 0.80 // interval *= 1/0.80 = 1.25x slower
	speedUpFactor  = 1.05 // interval *= 1/1.05 = ~5% faster
	minInterval    = 1500 * time.Millisecond
	maxInterval    = 6 * time.Second
)

// updateHealth adjusts one connection's pacing from a single request's
// outcome. min and max are the Go 1.21 builtins.
func updateHealth(node *connNode, success bool, err string) {
	if success {
		node.consecSuccesses++
		if node.consecSuccesses >= speedUpAfter {
			cur := node.limiter.Interval()
			next := time.Duration(float64(cur) / speedUpFactor)
			next = max(next, minInterval)
			node.limiter.SetInterval(next)
			node.consecSuccesses = 0
		}
		return
	}
	node.consecSuccesses = 0 // any failure breaks the streak
	if err == "rate_limited" {
		cur := node.limiter.Interval()
		next := time.Duration(float64(cur) / slowDownFactor)
		next = min(next, maxInterval)
		node.limiter.SetInterval(next)
	}
}
```
Each connection in the pool has its own rate limiter and health state. Twenty consecutive successes trigger a ~5% speedup; a rate-limit error triggers a 25% slowdown. The interval is clamped between 1.5s and 6s.
The result: the system finds the maximum safe rate per connection and holds there, automatically adapting if conditions change.
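For orientation, a sketch of where this sits in the request path. The queue, doQuery, and the limiter's blocking Wait are stand-ins here, not the scanner's actual API:

```go
// Hypothetical request loop for one pooled connection.
for query := range node.queue {
	node.limiter.Wait()             // stand-in: block until the current interval elapses
	errKind := doQuery(node, query) // "" on success, e.g. "rate_limited" on a 429
	updateHealth(node, errKind == "", errKind)
}
```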
## Why Not Just Use a Fixed Rate?
- Different endpoints have different throughput limits
- The server's rate limit may change under load
- After an outage, you want to ramp back up gradually — not hammer at full speed
- New connections need to discover their limits
AIMD handles all of these automatically. The physicist's version: it's a first-order feedback loop with asymmetric gain. Fast negative feedback prevents damage; slow positive feedback finds the equilibrium.
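Written as an update rule on the interval (notation mine, not from the sources): let $e_t$ be the window's error rate, $\alpha$ the additive step (200 ms above), and $\beta$ the backoff multiplier (1.5 above). Then

$$
I_{t+1} =
\begin{cases}
\min(\beta I_t,\ I_{\max}) & e_t > p_{\text{hi}} \\
\max(I_t - \alpha,\ I_{\min}) & e_t < p_{\text{lo}} \\
I_t & \text{otherwise}
\end{cases}
$$

The first case is the fast negative feedback; the second is the slow probe toward equilibrium.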
## Persistence Across Runs
The interval converges over the first few hundred queries. If you restart, you lose that convergence and burst through rate limits while re-learning.
Solution: persist the current interval to disk between runs.
```go
// Save the converged interval (in milliseconds) at the end of a run
os.WriteFile("rate_interval_state", []byte(strconv.Itoa(ms)), 0644)

// Restore at start; keep the seed interval if the file is missing or corrupt
if data, err := os.ReadFile("rate_interval_state"); err == nil {
	if saved, err := strconv.Atoi(string(data)); err == nil {
		ms = saved
	}
}
```
This is the warm-start trick — the next run begins where the last one left off instead of probing from scratch.
For the concurrency side of the same problem, pair this with [worker pool isolation](/snippet/worker-pool-isolation/) and [pipeline stage communication](/snippet/pipeline-stage-communication/).
## Sources
- IETF, [RFC 5681: TCP Congestion Control](https://datatracker.ietf.org/doc/html/rfc5681)
- MDN, [429 Too Many Requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429)