Kind: note
Maturity: budding
Confidence: medium
Origin: ai-assisted
Created:
Tags: networking, rate-limiting, algorithms
Related: Markdown/note/aimd-rate-limiting.md

# AIMD Rate Limiting for API Clients

TCP congestion control applied to API rate limiting — additive increase on success, multiplicative decrease on errors. A control loop that finds the speed limit without being told.

## The Problem

You’re hitting an API with unknown rate limits. Too fast and you get 429s or bans. Too slow and you waste time. The rate limit isn’t documented, varies by endpoint, and changes under load.

Fixed delays are wrong in both directions — too conservative when the API is healthy, too aggressive when it’s stressed.

## The Idea: Borrow from TCP

TCP solved this in 1988. The network doesn’t tell you its capacity — you probe for it. AIMD (Additive Increase / Multiplicative Decrease) is the control loop:

- Additive increase: when things are going well, speed up by a small fixed step
- Multiplicative decrease: when you hit an error, slow down by a multiplier

The asymmetry is the key insight. You probe upward cautiously (linear) but retreat quickly (exponential). This converges to the maximum safe rate without oscillating wildly.

## The Control Loop

```
every check_period (e.g., 5 seconds):
    err_pct = recent_errors / recent_total
    if err_pct > target_err_pct:               # too fast
        interval = interval * backoff_mul      # multiplicative decrease
    elif err_pct < recover_pct:                # headroom available
        interval = interval - speedup_step     # additive increase (shorter = faster)
    interval = clamp(interval, min_interval, max_interval)
```

That’s it. The interval between requests goes up (slower) when errors spike, and ticks down (faster) when the path is clear.

## Parameters

| Parameter | What it does | Example |
|---|---|---|
| `check_period` | How often the controller evaluates | 5s |
| `target_err_pct` | Error rate that triggers slowdown | 10% |
| `recover_pct` | Error rate below which speedup begins | 5% |
| `backoff_mul` | How much to slow down (multiplicative) | 1.5x |
| `speedup_step` | How much to speed up (additive) | 200ms |
| `seed_interval` | Starting point | 3000ms |
| `min_interval` | Speed-up floor | 1500ms |
| `max_interval` | Slow-down ceiling | 6000ms |

The dead zone between recover_pct and target_err_pct is intentional — if error rate is between 5% and 10%, hold steady. This prevents jitter.

## In Practice

A real implementation from an API client managing a pool of upstream connections:

```go
// Per-connection health tracking
const (
	speedUpAfter   = 20   // consecutive successes to try faster
	slowDownFactor = 0.80 // interval *= 1/0.80 = 1.25x slower
	speedUpFactor  = 1.05 // interval *= 1/1.05 = ~5% faster
	minInterval    = 1500 * time.Millisecond
	maxInterval    = 6 * time.Second
)

func updateHealth(node *connNode, success bool, err string) {
	if success {
		node.consecSuccesses++
		if node.consecSuccesses >= speedUpAfter {
			cur := node.limiter.Interval()
			next := time.Duration(float64(cur) / speedUpFactor)
			next = max(next, minInterval)
			node.limiter.SetInterval(next)
			node.consecSuccesses = 0
		}
	} else if err == "rate_limited" {
		cur := node.limiter.Interval()
		next := time.Duration(float64(cur) / slowDownFactor)
		next = min(next, maxInterval)
		node.limiter.SetInterval(next)
	}
}
```

Each connection in the pool has its own rate limiter and health state. Twenty consecutive successes trigger a ~5% speedup; a rate-limit error triggers a 25% slowdown. The interval is clamped between 1.5s and 6s.

The result: the system finds the maximum safe rate per connection and holds there, automatically adapting if conditions change.

## Why Not Just Use a Fixed Rate?

- Different endpoints have different throughput limits
- The server’s rate limit may change under load
- After an outage, you want to ramp back up gradually — not hammer at full speed
- New connections need to discover their limits

AIMD handles all of these automatically. The physicist’s version: it’s a first-order feedback loop with asymmetric gain. Fast negative feedback prevents damage; slow positive feedback finds the equilibrium.

## Persistence Across Runs

The interval converges over the first few hundred queries. If you restart, you lose that convergence and burst through rate limits while re-learning.

Solution: persist the current interval to disk between runs.

```go
// Save at end of run (ms = current interval in milliseconds)
os.WriteFile("rate_interval_state", []byte(strconv.Itoa(ms)), 0644)

// Restore at start; fall back to seed_interval on a missing or corrupt file
if data, err := os.ReadFile("rate_interval_state"); err == nil {
	if ms, err := strconv.Atoi(strings.TrimSpace(string(data))); err == nil && ms > 0 {
		interval = time.Duration(ms) * time.Millisecond
	}
}
```

This is the warm-start trick — the next run begins where the last one left off instead of probing from scratch.