Module 10 — Production & scaling

Rate limiting & guardrails

Token buckets, per-user limits, 429 responses, abuse prevention, and protecting upstream provider quotas.

~60 min read + exercises

Before we begin

One abusive client or runaway agent loop can burn through your LLM budget in minutes. Rate limiting is non-negotiable in production.

Figure: Rate limit gate

Clients (users / bots) → Rate limit (429 if over) → LLM API ($ per token)
Every request passes through limits before hitting the paid LLM API.

What you will learn

  • Limit by user, IP, or API key.
  • Return proper 429 responses with Retry-After.
  • Combine limits with agent iteration caps from Module 8.


What to limit

Dimension | Example
Requests / minute | 20 chat messages per user
Tokens / day | 100k input tokens per API key
Agent steps | Max 10 tool calls per session
Concurrent streams | 3 open SSE connections

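Derive each limit from two numbers: what a legitimate user needs for a smooth experience, and what a client pinned at that limit would cost you in the worst case.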
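To make the worst-case side of that trade-off concrete, a back-of-the-envelope calculation along these lines can help. This is only a sketch: the `LimitPlan` shape, the function name, and especially the prices are hypothetical placeholders, not any provider's real pricing.

```typescript
// Hypothetical inputs for a rough worst-case estimate.
interface LimitPlan {
  requestsPerMinute: number;  // per-user request limit
  maxInputTokens: number;     // input length cap per request
  maxOutputTokens: number;    // max_tokens per completion
  inputPricePerMTok: number;  // USD per 1M input tokens (placeholder value)
  outputPricePerMTok: number; // USD per 1M output tokens (placeholder value)
}

// Cost if one user sends the maximum allowed traffic for a full day.
function worstCaseDailyCostPerUser(plan: LimitPlan): number {
  const requestsPerDay = plan.requestsPerMinute * 60 * 24;
  const costPerRequest =
    (plan.maxInputTokens * plan.inputPricePerMTok +
      plan.maxOutputTokens * plan.outputPricePerMTok) /
    1_000_000;
  return requestsPerDay * costPerRequest;
}

const examplePlan: LimitPlan = {
  requestsPerMinute: 20,
  maxInputTokens: 4_000,
  maxOutputTokens: 1_000,
  inputPricePerMTok: 1,
  outputPricePerMTok: 2,
};
const exampleCost = worstCaseDailyCostPerUser(examplePlan);
```

With these made-up numbers, one user pinned at 20 requests per minute would cost about $172.80 per day, which suggests that a per-minute limit alone is too loose and that a daily token cap is needed as well.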


Token bucket (concept)

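A token bucket holds up to a fixed number of tokens (the burst capacity) and refills at a steady rate. Each request consumes one token; when the bucket is empty, the request is rejected.

This suits chat traffic: a user can send a short burst of messages, but the sustained rate can never exceed the refill rate.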
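The bucket described above can be sketched in a few lines. This is a minimal in-memory version for a single process, not a production limiter; the class name `TokenBucket` and its parameters are illustrative, and the `nowMs` argument exists only so the clock can be controlled in tests.

```typescript
// In-memory token bucket; state is lost on restart and not shared across processes.
class TokenBucket {
  private readonly capacity: number;     // maximum burst size
  private readonly refillPerSec: number; // steady-state refill rate
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so a new client can burst immediately
    this.lastRefillMs = nowMs;
  }

  // Consumes one token and returns true if one is available, otherwise returns false.
  tryConsume(nowMs: number = Date.now()): boolean {
    // Refill lazily, based on the time elapsed since the previous call.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;

    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

For example, `new TokenBucket(5, 0.5)` would allow a burst of five messages and then roughly one message every two seconds. Sharing the bucket across several server instances would require moving this state into a store such as Redis.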


Simple Redis counter

typescript
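import Redis from "ioredis"; // assumption: the ioredis client; any client with incr/expire works

const redis = new Redis(); // connects to localhost:6379 by default

// Fixed-window counter: at most `limit` requests per user in each window.
async function rateLimit(userId: string, limit = 30, windowSec = 60) {
  const nowSec = Math.floor(Date.now() / 1000);
  const key = `rl:${userId}:${Math.floor(nowSec / windowSec)}`;
  const count = await redis.incr(key);
  // Set the TTL on the first hit so old window keys clean themselves up.
  if (count === 1) await redis.expire(key, windowSec);
  if (count > limit) {
    // Seconds until the current window rolls over, not the full window length.
    return { ok: false, retryAfterSec: windowSec - (nowSec % windowSec) };
  }
  return { ok: true };
}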

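When the limiter rejects a request, respond with HTTP 429 (Too Many Requests), a JSON error body, and a Retry-After header set to the number of seconds the client should wait.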
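One way to shape that response is sketched below. It is framework-agnostic: the `HttpReply` type, the `tooManyRequests` helper, and the field names in the JSON body are all hypothetical, and the result would still need to be handed to whatever HTTP framework you use.

```typescript
// Plain description of an HTTP response, independent of any framework.
interface HttpReply {
  status: number;
  headers: Record<string, string>;
  body: string;
}

function tooManyRequests(retryAfterSec: number): HttpReply {
  // Retry-After carries whole seconds here; round up so clients never retry early.
  const seconds = Math.max(1, Math.ceil(retryAfterSec));
  return {
    status: 429,
    headers: {
      "Content-Type": "application/json",
      "Retry-After": String(seconds),
    },
    body: JSON.stringify({
      error: "rate_limited",
      message: `Too many requests. Retry in ${seconds} seconds.`,
      retryAfterSec: seconds,
    }),
  };
}
```

Repeating the wait time in the JSON body is a convenience for clients that find parsing the body easier than reading headers; the Retry-After header remains the standard signal.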


Guardrails beyond rate limits

  • Input length cap — reject prompts over N tokens before LLM call.
  • Output max_tokens — bound completion cost.
  • Tool allowlist — agents only call approved functions.
  • Auth — anonymous tiers get lower limits than signed-in users.
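The four guardrails in this list can be combined into a single pre-flight check that runs before the LLM call. The sketch below is illustrative only: the `ChatRequest` shape, the limit values, the tool names, and the 4-characters-per-token estimate are all assumptions, and real token counts should come from your provider's tokenizer.

```typescript
// Hypothetical request shape and limits; adjust to your own API.
interface ChatRequest {
  prompt: string;
  maxTokens?: number;
  tools?: string[];
  userId?: string; // undefined means an anonymous caller
}

const MAX_INPUT_TOKENS = 4_000;
const MAX_OUTPUT_TOKENS = 1_000;
const ALLOWED_TOOLS = new Set(["search_docs", "get_weather"]);

// Rough estimate (about 4 characters per token for English text); an assumption.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

type GuardResult =
  | { ok: true; maxTokens: number; requestsPerMinute: number }
  | { ok: false; reason: string };

function applyGuardrails(req: ChatRequest): GuardResult {
  // Input length cap: reject before spending anything on the LLM call.
  if (estimateTokens(req.prompt) > MAX_INPUT_TOKENS) {
    return { ok: false, reason: "prompt_too_long" };
  }
  // Tool allowlist: refuse any tool that is not explicitly approved.
  const blocked = (req.tools ?? []).filter((tool) => !ALLOWED_TOOLS.has(tool));
  if (blocked.length > 0) {
    return { ok: false, reason: `tool_not_allowed:${blocked[0]}` };
  }
  return {
    ok: true,
    // Clamp the requested completion size to the server-side cap.
    maxTokens: Math.min(req.maxTokens ?? MAX_OUTPUT_TOKENS, MAX_OUTPUT_TOKENS),
    // Anonymous callers get a lower rate limit than signed-in users.
    requestsPerMinute: req.userId ? 20 : 5,
  };
}
```

Running these checks before the rate limiter's counter is incremented is a design choice: requests rejected here never reach the paid API, so they may not need to count against the user's quota.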

Checkpoint

You should be able to articulate why rate limiting protects both your costs and your upstream provider quotas, rather than being mere “security theater.”


What's next

Lesson 4 — Cost optimization & token budgets