Rate Limiting for AI Agents: Preventing Runaway Costs

We've all heard the horror stories: an AI agent loops infinitely, making thousands of API calls, racking up a $10,000 bill overnight. This is one of the most common AI agent mistakes — and one of the most preventable.

Rate limiting is your safety net.

Types of Rate Limits

Type	Example	Use Case
Per-minute	100 calls/min	Prevent burst abuse
Per-hour	1,000 calls/hr	Sustained load control
Per-day	10,000 calls/day	Budget management
Per-action	10 emails/hr	Specific action limits
Cost-based	$50/day	Direct cost control
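As a rough illustration of how several of these windows can be enforced together, here is a minimal sliding-window sketch. The class name `MultiWindowLimiter` and its `(max_calls, window_seconds)` tuples are hypothetical and are not part of any library named in this article; the injectable `clock` is only there to make the behavior easy to test.

```python
import time
from collections import deque


class MultiWindowLimiter:
    """Allow a call only if every (max_calls, window_seconds) limit has room."""

    def __init__(self, limits, clock=time.monotonic):
        self.limits = list(limits)
        self.longest = max(window for _, window in self.limits)
        self.clock = clock      # injectable time source (assumption, for testing)
        self.calls = deque()    # timestamps of accepted calls, oldest first

    def allow(self):
        now = self.clock()
        # Forget calls older than the longest window; they no longer count.
        while self.calls and now - self.calls[0] >= self.longest:
            self.calls.popleft()
        # Check every window before recording anything, so a rejected call
        # is never counted against any limit.
        for max_calls, window in self.limits:
            recent = sum(1 for t in self.calls if now - t < window)
            if recent >= max_calls:
                return False
        self.calls.append(now)
        return True


# Mirrors the per-minute and per-day rows in the table above.
api_limiter = MultiWindowLimiter([(100, 60), (10_000, 86_400)])
```

Counting timestamps on every call is linear in the number of stored calls, which is fine for a sketch; a production limiter would more likely keep per-window counters or use a shared store.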

Implementing Rate Limits

from agentshield import AgentShield

shield = AgentShield(api_key="...")

# Configure rate limits
shield.configure_limits({
    "email.send": {"per_hour": 10, "per_day": 50},
    "api.call": {"per_minute": 100, "per_day": 5000},
    "payments.send": {"per_day": 5, "max_amount": 1000}
})

@shield.protect(scope="email.send")
def send_email(to, subject, body):
    # Automatically rate limited
    pass
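To make the idea of a protecting decorator more concrete, the sketch below shows one way such a wrapper could work using a simple fixed-window counter. It is not AgentShield's implementation: the names `protect`, `RateLimitExceeded`, and the `clock` parameter are assumptions made for illustration.

```python
import functools
import time


class RateLimitExceeded(Exception):
    """Raised when a protected function is called past its limit."""


def protect(max_calls, window_seconds, clock=time.monotonic):
    """Hypothetical decorator: at most max_calls per fixed window."""
    def decorator(func):
        state = {"window_start": None, "count": 0}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = clock()
            start = state["window_start"]
            # Open a new window on the first call or once the old one expires.
            if start is None or now - start >= window_seconds:
                state["window_start"], state["count"] = now, 0
            if state["count"] >= max_calls:
                raise RateLimitExceeded(
                    f"{func.__name__}: limit of {max_calls} calls "
                    f"per {window_seconds}s exceeded"
                )
            state["count"] += 1
            return func(*args, **kwargs)

        return wrapper
    return decorator


@protect(max_calls=10, window_seconds=3600)
def send_email(to, subject, body):
    # Placeholder body; a real function would send the message here.
    return f"sent to {to}"
```

A fixed window is the simplest scheme, but it permits up to twice the limit in a short burst that straddles a window boundary; a sliding window avoids that at the cost of storing timestamps.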

What Happens When a Limit Is Hit?

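The action is denied with a response like the following, where rate_limit_reset is the number of seconds until the limit resets:

{
  "allowed": false,
  "reason": "rate_limit_exceeded",
  "rate_limit_remaining": 0,
  "rate_limit_reset": 3600,
  "message": "Limit of 10 emails/hour exceeded"
}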
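An agent needs to do something sensible with a denial rather than retry immediately. The sketch below assumes the response fields shown above and turns them into a retry decision; the helper name `retry_delay` is hypothetical.

```python
import json


def retry_delay(decision):
    """Return seconds to wait before retrying, 0 if allowed, None if retrying won't help."""
    if decision["allowed"]:
        return 0
    if decision.get("reason") == "rate_limit_exceeded":
        # rate_limit_reset is the number of seconds until the limit resets.
        return int(decision["rate_limit_reset"])
    # Denied for some other reason; waiting will not change the outcome.
    return None


denied = json.loads("""
{
  "allowed": false,
  "reason": "rate_limit_exceeded",
  "rate_limit_remaining": 0,
  "rate_limit_reset": 3600,
  "message": "Limit of 10 emails/hour exceeded"
}
""")

wait_seconds = retry_delay(denied)
```

With a reset an hour away, it is usually better to queue the action or report the denial to an operator than to have the agent sleep or poll in a loop.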

Cost-Based Limits

Instead of counting actions, limit by cost. This approach fits well into an enterprise governance framework where budget control is essential:

shield.configure_limits({
    "openai.completion": {
        "cost_per_day": 50.00,   # $50/day max
        "cost_per_call": 0.10    # $0.10 per call estimate
    }
})
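The bookkeeping behind a cost-based limit can be sketched as a small daily budget tracker. This is an illustration under assumptions, not the library's internals: `DailyCostBudget`, `try_spend`, and the injectable `today` function are hypothetical names. `Decimal` is used so that repeated additions of small amounts such as 0.10 do not drift the way floats can.

```python
from datetime import date
from decimal import Decimal


class DailyCostBudget:
    """Track estimated spend per calendar day and refuse calls over budget."""

    def __init__(self, cost_per_day, cost_per_call, today=date.today):
        self.cost_per_day = Decimal(str(cost_per_day))
        self.cost_per_call = Decimal(str(cost_per_call))
        self.today = today          # injectable date source (assumption, for testing)
        self.day = today()
        self.spent = Decimal("0")

    def try_spend(self, cost=None):
        """Record a call's cost if it fits in today's budget; return True if allowed."""
        current = self.today()
        if current != self.day:
            # A new day has started, so the budget resets.
            self.day, self.spent = current, Decimal("0")
        amount = self.cost_per_call if cost is None else Decimal(str(cost))
        if self.spent + amount > self.cost_per_day:
            return False
        self.spent += amount
        return True


# Same numbers as the configuration above: $50/day at an estimated $0.10 per call.
budget = DailyCostBudget(cost_per_day=50.00, cost_per_call=0.10)
```

At these numbers the budget admits 500 estimated calls per day. Passing the actual cost to `try_spend` once it is known, instead of relying on the per-call estimate, keeps the tracker closer to the real bill.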

Gradual Backoff

Instead of hard blocks, implement gradual slowdown:

0-50% of limit: Normal speed
50-80% of limit: Add 1s delay
80-95% of limit: Add 5s delay
95-100%: Add 30s delay + warning
>100%: Block + alert
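The tiers above can be expressed as a small lookup function. The sketch below makes two assumptions the list leaves open: usage counts the call being attempted, and each shared boundary (50%, 80%, 95%) belongs to the slower tier. The function name `backoff_decision` and the action labels are hypothetical.

```python
def backoff_decision(used, limit):
    """Map usage to (delay_seconds, action); delay is None when the call is blocked."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    usage = used / limit  # fraction of the limit consumed, counting this call
    if usage > 1.0:
        return (None, "block_and_alert")
    if usage >= 0.95:
        return (30, "warn")
    if usage >= 0.80:
        return (5, "allow")
    if usage >= 0.50:
        return (1, "allow")
    return (0, "allow")
```

A caller would sleep for the returned delay before performing the action, emit a warning on `"warn"`, and refuse the action outright when the delay is `None`.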

Alerting

Get notified before you hit limits. For fully autonomous systems like AutoGPT, alerts are critical since there's no human in the loop:

shield.configure_alerts({
    "email.send": {
        "warn_at": 0.8,  # 80% of limit
        "notify": ["slack", "email"]
    }
})
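One detail worth getting right in threshold alerts is firing once when usage crosses the threshold, not on every call afterwards. The sketch below illustrates that idea; `ThresholdAlert` and its `notify` callback are hypothetical stand-ins for whatever actually delivers the Slack or email message.

```python
class ThresholdAlert:
    """Fire a notification once when usage first reaches warn_at * limit."""

    def __init__(self, limit, warn_at, notify):
        self.limit = limit
        self.warn_at = warn_at
        self.notify = notify    # callable taking a message string (assumption)
        self.fired = False

    def record(self, used):
        """Check current usage; return True only on the call that triggers the alert."""
        if not self.fired and used >= self.warn_at * self.limit:
            self.fired = True
            self.notify(f"Usage at {used}/{self.limit} ({used / self.limit:.0%} of limit)")
            return True
        return False

    def reset(self):
        """Re-arm the alert, for example when the rate-limit window resets."""
        self.fired = False


messages = []
email_alert = ThresholdAlert(limit=10, warn_at=0.8, notify=messages.append)
```

Calling `reset()` when the underlying window rolls over re-arms the alert for the next period.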
