
FastAPI Rate Limiting

Introduction

Rate limiting is a critical security feature that restricts the number of requests a client can make to your API within a specified time period. Implementing rate limiting in your FastAPI application helps protect against abuse and denial-of-service attacks, and ensures fair usage of your resources.

In this tutorial, we'll explore how to implement rate limiting in FastAPI applications, understand the underlying concepts, and see practical examples that you can apply to your own projects.

Why Rate Limiting Matters

Before diving into implementation, let's understand why rate limiting is essential:

  • Prevent Abuse: Stops malicious users from overwhelming your server with requests
  • Resource Management: Ensures fair distribution of server resources among all users
  • Cost Control: Helps manage API usage costs, especially for third-party services
  • Performance Stability: Maintains consistent API performance during traffic spikes

Rate Limiting Concepts

Rate limiting involves several key concepts:

  1. Rate Limit: Maximum number of requests allowed within a time window
  2. Time Window: Period in which requests are counted (e.g., per minute, hour)
  3. Client Identification: How to identify clients (IP address, API key, user ID)
  4. Rate Limit Headers: HTTP headers that inform clients about limits and remaining requests
  5. Response Strategy: How your API responds when limits are exceeded
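These concepts can be seen together in a minimal fixed-window counter, sketched here in plain Python with no framework (the `FixedWindowLimiter` name and the numbers are illustrative):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Minimal fixed-window rate limiter: `limit` requests per `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit              # rate limit: max requests per window
        self.window = window            # time window in seconds
        self.counts = defaultdict(int)  # per-(client, window) request counts

    def allow(self, client_id: str, now=None) -> bool:
        now = time.time() if now is None else now
        # Bucket requests by (client, window index): client identification + time window
        key = (client_id, int(now // self.window))
        if self.counts[key] >= self.limit:
            return False                # response strategy: caller should send 429
        self.counts[key] += 1
        return True

limiter = FixedWindowLimiter(limit=3, window=60)
results = [limiter.allow("10.0.0.1", now=100.0) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

The first three requests in the window are allowed; the rest are rejected until the next window begins.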

Implementing Rate Limiting in FastAPI

FastAPI doesn't include built-in rate limiting, but we can add it using middleware or dependencies. We'll explore two popular approaches:

Approach 1: Using the slowapi Package

The slowapi package is adapted from the popular Flask extension flask-limiter and provides an elegant way to implement rate limiting in FastAPI.

Step 1: Install required packages

bash
pip install slowapi

Step 2: Set up the rate limiter

python
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

# Create a limiter instance with a default client identifier
limiter = Limiter(key_func=get_remote_address)

app = FastAPI()

# Add rate limiter to the application
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

Step 3: Apply rate limits to endpoints

python
@app.get("/basic-endpoint")
@limiter.limit("5/minute")  # Allow 5 requests per minute per client
async def basic_rate_limited_endpoint(request: Request):  # slowapi requires a `request` argument
    return {"message": "This endpoint is rate limited to 5 requests per minute"}

@app.get("/custom-endpoint")
@limiter.limit("2/5seconds")  # Allow 2 requests per 5 seconds
async def custom_rate_limited_endpoint(request: Request):
    return {"message": "This endpoint is rate limited to 2 requests per 5 seconds"}

Step 4: Try the endpoint

When you make more than 5 requests within a minute to the /basic-endpoint, you'll receive a 429 Too Many Requests response with a message indicating when you can try again.

Output example after exceeding rate limit:

json
{
"detail": "Rate limit exceeded: 5 per 1 minute"
}

Approach 2: Custom Rate Limiting with Redis

For production applications, you might want a more scalable solution using Redis to track request counts across multiple application instances.

Step 1: Install required packages

bash
pip install redis fastapi-limiter

Step 2: Set up Redis and configure the limiter

python
import redis.asyncio as redis
from fastapi import FastAPI, Depends
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter

app = FastAPI()

@app.on_event("startup")
async def startup():
    # fastapi-limiter requires an *async* Redis client
    redis_instance = redis.from_url("redis://localhost:6379", encoding="utf-8", decode_responses=True)
    # Initialize the limiter
    await FastAPILimiter.init(redis_instance)

# Apply rate limiting to specific endpoints via a dependency
@app.get("/redis-limited", dependencies=[Depends(RateLimiter(times=5, seconds=60))])
async def redis_rate_limited_endpoint():
    return {"message": "This endpoint is rate limited with Redis"}
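Under the hood, Redis-backed limiters of this kind typically rely on an atomic INCR plus EXPIRE on a per-client key, which yields a counter that resets itself after the window. The pattern can be sketched with a plain dict standing in for Redis (names and numbers are illustrative; real code would call `redis.incr` and `redis.expire`):

```python
import time

store = {}  # key -> (count, expiry); stand-in for a Redis instance

def hit(key: str, limit: int, window: float, now=None) -> bool:
    """Roughly what INCR + EXPIRE gives you: a counter that resets after `window`."""
    now = time.time() if now is None else now
    count, expiry = store.get(key, (0, now + window))
    if now >= expiry:                # key "expired" -- start a fresh window
        count, expiry = 0, now + window
    count += 1
    store[key] = (count, expiry)
    return count <= limit            # allowed while within the limit

print(hit("203.0.113.7", limit=2, window=60, now=0.0))   # True
print(hit("203.0.113.7", limit=2, window=60, now=1.0))   # True
print(hit("203.0.113.7", limit=2, window=60, now=2.0))   # False
```

Because the counter lives in shared storage rather than process memory, every application instance sees the same counts.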

Approach 3: Custom Middleware Rate Limiting

For simple applications or when you need complete control, you can implement your own rate limiting middleware.

python
from fastapi import FastAPI, Request, status
from fastapi.responses import JSONResponse
import time
from collections import defaultdict

app = FastAPI()

# In-memory storage for request counts (not suitable for production)
request_counts = defaultdict(list)

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # Identify client by IP address
    client_ip = request.client.host
    current_time = time.time()

    # Remove requests older than 1 minute
    request_counts[client_ip] = [
        timestamp for timestamp in request_counts[client_ip]
        if current_time - timestamp < 60
    ]

    # Check if client has exceeded rate limit (10 requests per minute)
    if len(request_counts[client_ip]) >= 10:
        return JSONResponse(
            status_code=status.HTTP_429_TOO_MANY_REQUESTS,
            content={"detail": "Rate limit exceeded: 10 requests per minute allowed"},
            headers={"Retry-After": "60"},
        )

    # Record the current request, then pass it along
    request_counts[client_ip].append(current_time)
    return await call_next(request)

@app.get("/custom-middleware-limited")
async def custom_middleware_rate_limited():
    return {"message": "This endpoint is protected by custom rate limiting middleware"}
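One reason the middleware keeps a list of timestamps (a sliding log) rather than a single counter: a plain fixed window lets a client burst up to twice the limit around a window boundary. The middleware's prune-then-count step, isolated as a stdlib sketch (names illustrative):

```python
from collections import defaultdict

LIMIT, WINDOW = 10, 60.0
log = defaultdict(list)  # client id -> timestamps of recent requests

def allow(client: str, now: float) -> bool:
    # Same pruning as the middleware: keep only timestamps inside the window
    log[client] = [t for t in log[client] if now - t < WINDOW]
    if len(log[client]) >= LIMIT:
        return False
    log[client].append(now)
    return True

# 10 requests just before t=60 fill the window...
for _ in range(10):
    assert allow("c", now=59.0)
# ...and an 11th at t=61 is still rejected: the sliding log remembers them,
# whereas a fixed 0-60 / 60-120 window would have reset the count to zero.
print(allow("c", now=61.0))  # False
```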

Advanced Rate Limiting Strategies

Different Limits for Different Endpoints

With slowapi, you can apply different limits to different endpoints based on their requirements:

python
@app.get("/public-endpoint")
@limiter.limit("5/minute")
async def public_endpoint(request: Request):
    return {"message": "This endpoint has a lower rate limit"}

@app.get("/premium-endpoint")
@limiter.limit("100/minute")
async def premium_endpoint(request: Request):
    return {"message": "This endpoint has a higher rate limit for premium users"}

Dynamic Rate Limiting Based on User Type

You can also apply dynamic rate limits based on user authentication or subscription level:

python
def tier_aware_key(request: Request) -> str:
    # Encode the tier into the rate-limit key; in a real app this would come
    # from authentication rather than a client-supplied header
    tier = request.headers.get("X-User-Tier", "free")
    return f"{tier}:{get_remote_address(request)}"

def tier_limit(key: str) -> str:
    # slowapi passes the key produced by key_func (not the Request) to a
    # callable limit value, so the tier has to be encoded in the key
    return "100/minute" if key.startswith("premium:") else "5/minute"

@app.get("/dynamic-limit")
@limiter.limit(tier_limit, key_func=tier_aware_key)
async def dynamic_rate_limited_endpoint(request: Request):
    user_tier = request.headers.get("X-User-Tier", "free")
    return {
        "message": f"You're on the {user_tier} tier",
        "limit": "100/minute" if user_tier == "premium" else "5/minute",
    }

Adding Rate Limit Headers

To make your API more user-friendly, include rate limit headers in responses:

python
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_rate_limit_headers(request: Request, call_next):
    response = await call_next(request)

    # Add custom rate limit headers
    response.headers["X-RateLimit-Limit"] = "100"
    response.headers["X-RateLimit-Remaining"] = "95"  # In a real app, this would be calculated
    response.headers["X-RateLimit-Reset"] = "60"

    return response
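To replace the hard-coded values, the header numbers can be derived from the limiter's state; a small sketch for a fixed window (the function name and arguments are illustrative):

```python
import math

def rate_limit_headers(used: int, limit: int, window: float, window_start: float, now: float) -> dict:
    """Derive the three common X-RateLimit-* headers from limiter state."""
    remaining = max(limit - used, 0)
    # Seconds until the current window resets, never negative
    reset_in = max(math.ceil(window_start + window - now), 0)
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_in),
    }

print(rate_limit_headers(used=5, limit=100, window=60, window_start=0.0, now=12.3))
# {'X-RateLimit-Limit': '100', 'X-RateLimit-Remaining': '95', 'X-RateLimit-Reset': '48'}
```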

Real-World Example: Protecting a User Authentication API

Here's a practical example that combines rate limiting with a user login endpoint:

python
from fastapi import FastAPI, Depends, HTTPException, Request, status
from fastapi.security import OAuth2PasswordRequestForm
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

app = FastAPI()
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# Mock user database
fake_users_db = {
    "johndoe": {
        "username": "johndoe",
        "password": "secret",
    }
}

@app.post("/login")
@limiter.limit("5/minute")  # Strict limit to prevent brute-force attacks
async def login(request: Request, form_data: OAuth2PasswordRequestForm = Depends()):
    user = fake_users_db.get(form_data.username)
    if not user or form_data.password != user["password"]:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Incorrect username or password",
        )

    return {"access_token": "example_token", "token_type": "bearer"}

This example shows how rate limiting can protect a login endpoint from brute force attacks by limiting the number of login attempts from a single IP address.
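A quick back-of-the-envelope check of why the strict limit matters: at 5 attempts per minute, even a modest password list takes days to exhaust from a single address (the numbers are illustrative):

```python
wordlist_size = 10_000      # candidate passwords in an attacker's list (illustrative)
attempts_per_minute = 5     # the limit applied to /login above

minutes = wordlist_size / attempts_per_minute
print(f"{minutes:.0f} minutes ≈ {minutes / 60 / 24:.1f} days")  # 2000 minutes ≈ 1.4 days
```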

Best Practices for Rate Limiting

To effectively implement rate limiting in your FastAPI applications:

  1. Use Appropriate Client Identifiers: Consider using API keys or authentication tokens instead of just IP addresses
  2. Set Reasonable Limits: Balance security with user experience
  3. Communicate Limits Clearly: Use response headers to inform clients about limits and remaining requests
  4. Implement Graceful Degradation: When limits are reached, provide helpful error messages
  5. Monitor and Adjust: Regularly review rate limit effectiveness and adjust as needed
  6. Use Distributed Storage: For production systems, use Redis or other distributed storage for tracking limits across multiple instances

Common Mistakes to Avoid

  • Too Strict Limits: Setting limits too low can frustrate legitimate users
  • Relying Solely on IP Address: IP-based limiting can affect multiple users behind NATs or proxies
  • Not Handling Rate Limit Headers: Failing to inform clients about limits through headers
  • In-Memory Tracking in Production: Using in-memory solutions that won't work across multiple application instances

Summary

Rate limiting is an essential security feature for any API. In this tutorial, we've explored:

  • Why rate limiting is important for API security and stability
  • How to implement rate limiting in FastAPI using different approaches
  • Advanced strategies for dynamic rate limiting
  • Best practices and common pitfalls

By implementing rate limiting in your FastAPI applications, you protect your services from abuse while ensuring fair access for all users.

Exercises

  1. Basic Rate Limiting: Implement a simple rate limiter using slowapi and test it with different limits
  2. Redis Integration: Set up rate limiting with Redis backend and test its behavior across multiple application instances
  3. Custom Headers: Extend the middleware example to calculate and include accurate rate limit headers
  4. Tiered Access: Create an API with different endpoints having different rate limits based on authentication status
  5. Rate Limit Dashboard: Build a simple admin dashboard that shows current rate limit status for different clients

Remember that rate limiting is just one part of a comprehensive API security strategy, and should be combined with other measures like authentication, input validation, and proper error handling.


