---
title: "AIMD Rate Limiting for API Clients"
description: "TCP congestion control applied to API rate limiting — additive increase on success, multiplicative decrease on errors. Finds the limit automatically."
kind: note
maturity: budding
confidence: medium
origin: ai-assisted
author: "Agent"
directedBy: "krow"
tags: [networking, rate-limiting, algorithms]
published: 2026-03-28
modified: 2026-04-21
wordCount: 700
readingTime: 4
related: [go-dns-scanner-4000qps]
url: https://krowdev.com/note/aimd-rate-limiting/
---
## Agent Context

- Canonical: https://krowdev.com/note/aimd-rate-limiting/
- Markdown: https://krowdev.com/note/aimd-rate-limiting.md
- Full corpus: https://krowdev.com/llms-full.txt
- Kind: note
- Maturity: budding
- Confidence: medium
- Origin: ai-assisted
- Author: Agent
- Directed by: krow
- Published: 2026-03-28
- Modified: 2026-04-21
- Words: 700 (4 min read)
- Tags: networking, rate-limiting, algorithms
- Related: go-dns-scanner-4000qps
- Content map:
  - h2: The Problem
  - h2: The Idea: Borrow from TCP
  - h2: The Control Loop
  - h2: Parameters
  - h2: In Practice
  - h2: Why Not Just Use a Fixed Rate?
  - h2: Persistence Across Runs
  - h2: Sources
- Crawl policy: same canonical content is exposed through HTML, Markdown, and llms-full; no crawler-specific content gate.

## The Problem

You're hitting an API with unknown rate limits. Too fast and you get [429s](/snippet/http-status-codes/) or bans. Too slow and you waste time. The rate limit isn't documented, varies by endpoint, and changes under load.

Fixed delays are wrong in both directions — too conservative when the API is healthy, too aggressive when it's stressed.

## The Idea: Borrow from TCP

TCP solved this in 1988. The network doesn't tell you its capacity — you probe for it. AIMD (Additive Increase / Multiplicative Decrease) is the control loop:

- **Additive increase**: when things are going well, speed up by a small fixed step
- **Multiplicative decrease**: when you hit an error, slow down by a multiplier

The asymmetry is the key insight. You probe upward cautiously (linear) but retreat quickly (exponential). This converges to the maximum safe rate without oscillating wildly.

## The Control Loop

```
every check_period (e.g., 5 seconds):
    err_pct = recent_errors / recent_total

    if err_pct > target_err_pct:       # too fast
        interval = interval * backoff_mul    # multiplicative decrease
    elif err_pct < recover_pct:        # headroom available
        interval = interval - speedup_step   # additive increase (shorter = faster)

    interval = clamp(interval, min_interval, max_interval)
```

That's it. The interval between requests goes up (slower) when errors spike, and ticks down (faster) when the path is clear.

## Parameters

| Parameter | What it does | Example |
|-----------|-------------|---------|
| `check_period` | How often the controller evaluates | 5s |
| `target_err_pct` | Error rate that triggers slowdown | 10% |
| `recover_pct` | Error rate below which speedup begins | 5% |
| `backoff_mul` | How much to slow down (multiplicative) | 1.5x |
| `speedup_step` | How much to speed up (additive) | 200ms |
| `seed_interval` | Starting point | 3000ms |
| `min_interval` | Speed-up floor | 1500ms |
| `max_interval` | Slow-down ceiling | 6000ms |

The dead zone between `recover_pct` and `target_err_pct` is intentional — if error rate is between 5% and 10%, hold steady. This prevents jitter.

## In Practice

A real implementation from the [Go DNS scanner](/article/go-dns-scanner-4000qps/) managing a pool of upstream connections:

```go
// Per-connection health tracking
const (
    speedUpAfter   = 20   // consecutive successes to try faster
    slowDownFactor = 0.80 // interval *= 1/0.80 = 1.25x slower
    speedUpFactor  = 1.05 // interval *= 1/1.05 = ~5% faster
    minInterval    = 1500 * time.Millisecond
    maxInterval    = 6 * time.Second
)

func updateHealth(node *connNode, success bool, err string) {
    if success {
        node.consecSuccesses++
        if node.consecSuccesses >= speedUpAfter {
            cur := node.limiter.Interval()
            next := time.Duration(float64(cur) / speedUpFactor)
            next = max(next, minInterval)
            node.limiter.SetInterval(next)
            node.consecSuccesses = 0
        }
    } else if err == "rate_limited" {
        node.consecSuccesses = 0 // an error breaks the success streak
        cur := node.limiter.Interval()
        next := time.Duration(float64(cur) / slowDownFactor)
        next = min(next, maxInterval)
        node.limiter.SetInterval(next)
    }
}
```

Each connection in the pool has its own rate limiter and health state. Twenty consecutive successes trigger a ~5% speedup; a single rate-limit error triggers a 25% slowdown. The interval is clamped between 1.5s and 6s.

The result: the system finds the maximum safe rate per connection and holds there, automatically adapting if conditions change.

## Why Not Just Use a Fixed Rate?

- Different endpoints have different throughput limits
- The server's rate limit may change under load
- After an outage, you want to ramp back up gradually — not hammer at full speed
- New connections need to discover their limits

AIMD handles all of these automatically. The physicist's version: it's a first-order feedback loop with asymmetric gain. Fast negative feedback prevents damage; slow positive feedback finds the equilibrium.

## Persistence Across Runs

The interval converges over the first few hundred queries. If you restart, you lose that convergence and burst through rate limits while re-learning.

Solution: persist the current interval to disk between runs.

```go
// Save at end of run
ms := int(interval / time.Millisecond)
if err := os.WriteFile("rate_interval_state", []byte(strconv.Itoa(ms)), 0o644); err != nil {
    log.Printf("persist interval: %v", err)
}

// Restore at start; fall back to the seed on a first run or unreadable file
interval := seedInterval
if data, err := os.ReadFile("rate_interval_state"); err == nil {
    if ms, err := strconv.Atoi(strings.TrimSpace(string(data))); err == nil {
        interval = time.Duration(ms) * time.Millisecond
    }
}
```

This is the warm-start trick — the next run begins where the last one left off instead of probing from scratch.

For the concurrency side of the same problem, pair this with [worker pool isolation](/snippet/worker-pool-isolation/) and [pipeline stage communication](/snippet/pipeline-stage-communication/).

## Sources

- IETF, [RFC 5681: TCP Congestion Control](https://datatracker.ietf.org/doc/html/rfc5681)
- MDN, [429 Too Many Requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429)