
FastAPI Production Server

When moving your FastAPI application from development to production, you need a proper ASGI server configuration. The single-process Uvicorn setup typically used during development isn't designed to handle production workloads efficiently. In this guide, we'll explore how to set up a production-grade server for your FastAPI applications.

Introduction to Production Servers

During development, running FastAPI with a simple uvicorn main:app --reload command works fine. However, this setup lacks features necessary for production environments:

  • Limited concurrency
  • No automatic restarts after crashes
  • Inefficient use of system resources
  • No load balancing

Production servers solve these issues by providing:

  • Worker management
  • Process monitoring
  • Better resource utilization
  • Enhanced security
  • Load balancing

Let's explore the main options for deploying FastAPI in production.

Uvicorn in Production Mode

While Uvicorn is often used during development, it can also serve as a production server with the right configuration.

Basic Production Configuration

bash
# Without the --reload flag
uvicorn main:app --host 0.0.0.0 --port 8000

Using Multiple Workers

bash
# Run with 4 worker processes
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

The number of workers should be tuned to the number of CPU cores available. A common starting formula is:

workers = (2 * num_cores) + 1
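
If you want to compute that value on the deployment host itself, a short Python snippet (a sketch, mirroring the formula above and the gunicorn.conf.py shown later in this guide) can do it:

python
# suggest_workers.py -- compute the suggested worker count for this machine
import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1
print(workers)  # pass this value to the --workers flag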

Advanced Uvicorn Configuration

bash
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4 --log-level warning --no-access-log --limit-concurrency 1000

This command:

  • Runs 4 worker processes
  • Sets the log level to warning
  • Disables access logs (reduces I/O)
  • Limits concurrent connections to 1000
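
The same settings can also be applied programmatically, which is convenient when the server is started from a Python entry point. The sketch below assumes the app lives in main.py; the keyword arguments map one-to-one onto the CLI flags above.

python
# run_server.py -- programmatic equivalent of the advanced CLI invocation (a sketch)
import uvicorn

if __name__ == "__main__":
    uvicorn.run(
        "main:app",              # import string is required when workers > 1
        host="0.0.0.0",
        port=8000,
        workers=4,
        log_level="warning",
        access_log=False,        # equivalent of --no-access-log
        limit_concurrency=1000,  # equivalent of --limit-concurrency 1000
    )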

Gunicorn with Uvicorn Workers

Gunicorn (Green Unicorn) is a mature WSGI server that can act as a process manager for Uvicorn worker processes, giving you more robust worker supervision than running Uvicorn on its own.

Installation

bash
pip install gunicorn uvicorn

Basic Configuration

bash
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000

Where:

  • -w 4: Run 4 worker processes
  • -k uvicorn.workers.UvicornWorker: Use Uvicorn's worker class
  • -b 0.0.0.0:8000: Bind to all interfaces on port 8000

Using a Configuration File

For more complex configurations, create a gunicorn.conf.py file:

python
# gunicorn.conf.py
import multiprocessing

# Server socket
bind = "0.0.0.0:8000"

# Worker processes
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"

# Server mechanics
daemon = False
pidfile = "/tmp/gunicorn.pid"

# Logging
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"

# Process naming
proc_name = "fastapi_app"

# Maximum number of simultaneous clients
worker_connections = 1000

# Timeout
timeout = 30

# Maximum requests
max_requests = 10000
max_requests_jitter = 1000

Run Gunicorn with this configuration file:

bash
gunicorn -c gunicorn.conf.py main:app
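
If you prefer to launch Gunicorn from Python code instead of the CLI (for example from a custom launcher script), Gunicorn's documented custom-application pattern can be used. The sketch below assumes the same main:app layout and is only one way to wire it up:

python
# launcher.py -- embed Gunicorn using its custom-application pattern (a sketch)
from gunicorn.app.base import BaseApplication

from main import app  # the FastAPI instance


class StandaloneApplication(BaseApplication):
    def __init__(self, application, options=None):
        self.options = options or {}
        self.application = application
        super().__init__()

    def load_config(self):
        # Copy recognised options into Gunicorn's configuration object
        for key, value in self.options.items():
            if key in self.cfg.settings and value is not None:
                self.cfg.set(key.lower(), value)

    def load(self):
        return self.application


if __name__ == "__main__":
    options = {
        "bind": "0.0.0.0:8000",
        "workers": 4,
        "worker_class": "uvicorn.workers.UvicornWorker",
    }
    StandaloneApplication(app, options).run()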

Hypercorn for ASGI and HTTP/2 Support

Hypercorn is another ASGI server that supports HTTP/2 and WebSockets.

Installation

bash
pip install hypercorn

Basic Usage

bash
hypercorn main:app --bind 0.0.0.0:8000 --workers 4

Configuration File

Create a hypercorn_config.py file:

python
# hypercorn_config.py
bind = ["0.0.0.0:8000"]
workers = 4
worker_class = "asyncio"
keep_alive_timeout = 65
graceful_timeout = 30
accesslog = "-" # Log to stdout
errorlog = "-" # Log to stderr
loglevel = "INFO"

Run with:

bash
hypercorn --config file:hypercorn_config.py main:app
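
Hypercorn can also be started from Python code via its asyncio API, which is handy for embedding the server in an existing entry point. A minimal sketch, assuming the app is importable from main.py (note that the programmatic API runs a single worker process):

python
# serve.py -- run the app with Hypercorn's asyncio API (a sketch)
import asyncio

from hypercorn.asyncio import serve
from hypercorn.config import Config

from main import app  # the FastAPI instance

config = Config()
config.bind = ["0.0.0.0:8000"]
config.keep_alive_timeout = 65
config.graceful_timeout = 30

if __name__ == "__main__":
    asyncio.run(serve(app, config))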

Docker Deployment

For containerized deployments, you can use Docker to package your FastAPI application with a production server.

Sample Dockerfile

dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000

# Run using Gunicorn with Uvicorn workers
CMD ["gunicorn", "main:app", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8000"]

Build and run the Docker container:

bash
docker build -t fastapi-app .
docker run -p 8000:8000 fastapi-app

Real-World Example: Complete FastAPI Application with Production Setup

Let's create a simple FastAPI application and prepare it for production deployment:

Project Structure

my_fastapi_app/
├── app/
│   ├── __init__.py
│   ├── main.py
│   ├── models.py
│   ├── routers/
│   │   ├── __init__.py
│   │   ├── items.py
│   │   └── users.py
│   └── dependencies.py
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── gunicorn.conf.py

Application Code (app/main.py)

python
from fastapi import FastAPI
from app.routers import items, users
import uvicorn

app = FastAPI(
    title="MyAPI",
    description="A production-ready FastAPI application",
    version="1.0.0",
)

# Include routers
app.include_router(items.router)
app.include_router(users.router)

@app.get("/")
async def root():
    return {"message": "Welcome to the API"}

@app.get("/health")
async def health_check():
    return {"status": "healthy"}

if __name__ == "__main__":
    # For development only
    uvicorn.run("app.main:app", host="0.0.0.0", port=8000, reload=True)

Sample Router (app/routers/items.py)

python
from typing import Optional

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

router = APIRouter(
    prefix="/items",
    tags=["items"],
)

class Item(BaseModel):
    id: int
    name: str
    description: Optional[str] = None
    price: float

items_db = {}

@router.get("/")
async def read_items():
    return items_db

@router.get("/{item_id}")
async def read_item(item_id: int):
    if item_id not in items_db:
        raise HTTPException(status_code=404, detail="Item not found")
    return items_db[item_id]

@router.post("/")
async def create_item(item: Item):
    items_db[item.id] = item
    return item

Docker Compose Setup (docker-compose.yml)

yaml
version: '3'

services:
  api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./logs:/var/log/gunicorn
    restart: always
    environment:
      - ENV=production

Production Start Script (start.sh)

bash
#!/bin/bash

# Create log directories
mkdir -p /var/log/gunicorn

# Run with Gunicorn
exec gunicorn -c gunicorn.conf.py app.main:app

Make the script executable:

bash
chmod +x start.sh

Monitoring and Health Checks

It's important to implement health checks to monitor your FastAPI application in production:

Health Check Endpoint

python
from datetime import datetime

@app.get("/health")
async def health_check():
    # You can add database connection checks or other service checks here
    return {
        "status": "healthy",
        "version": "1.0.0",
        "timestamp": datetime.now().isoformat(),
    }
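
As a concrete starting point, the sketch below extends the endpoint (using the same app object as above) with a database check. check_database is a hypothetical helper; replace its body with a trivial query (for example SELECT 1) against whatever connection pool your application actually uses.

python
from datetime import datetime

async def check_database() -> bool:
    """Hypothetical helper: return True if a trivial query succeeds."""
    try:
        # e.g. run "SELECT 1" against your connection pool here
        return True
    except Exception:
        return False

@app.get("/health")
async def health_check():
    db_ok = await check_database()
    return {
        "status": "healthy" if db_ok else "degraded",
        "database": "up" if db_ok else "down",
        "version": "1.0.0",
        "timestamp": datetime.now().isoformat(),
    }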

External Monitoring

You can use services like:

  • Prometheus for metrics (see the sketch after this list)
  • Grafana for visualization
  • Sentry for error tracking
  • Datadog or New Relic for application performance monitoring
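
For the Prometheus option, one minimal approach (a sketch, assuming the prometheus_client package is installed) is to mount its ASGI metrics app under /metrics and count requests in a middleware:

python
# metrics.py -- expose Prometheus metrics from FastAPI (a sketch)
from fastapi import FastAPI, Request
from prometheus_client import Counter, make_asgi_app

app = FastAPI()

# Count every HTTP request handled by the application, by method and path
REQUEST_COUNT = Counter("app_requests_total", "Total HTTP requests", ["method", "path"])

@app.middleware("http")
async def count_requests(request: Request, call_next):
    REQUEST_COUNT.labels(method=request.method, path=request.url.path).inc()
    return await call_next(request)

# Serve metrics in Prometheus text format at /metrics
app.mount("/metrics", make_asgi_app())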

Performance Optimization Tips

  1. Use async efficiently: Make sure I/O-bound operations are properly awaited
  2. Connection pooling: Use connection pools for databases
  3. Caching: Implement Redis or other caching mechanisms
  4. Database optimization: Use proper indexes and optimize queries
  5. Use background tasks: For processing heavy operations (see the sketch after this list)
  6. Limit request body size: Prevent excessive memory usage
  7. Implement rate limiting: Protect against DoS attacks
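
For tip 5, FastAPI ships a BackgroundTasks helper that defers work until after the response is sent. A small sketch (send_welcome_email is a hypothetical placeholder for any slow operation):

python
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def send_welcome_email(address: str) -> None:
    # Placeholder for a slow operation (email, report generation, etc.)
    print(f"Sending welcome email to {address}")

@app.post("/signup")
async def signup(email: str, background_tasks: BackgroundTasks):
    # The task runs after the response has been sent to the client
    background_tasks.add_task(send_welcome_email, email)
    return {"message": "Signup received"}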

Scaling Strategies

Horizontal Scaling

Deploy multiple instances behind a load balancer:

[Client] → [Load Balancer] → [FastAPI Instance 1]
                           → [FastAPI Instance 2]
                           → [FastAPI Instance 3]

Vertical Scaling

Increase resources (CPU, memory) for your server.

Summary

Setting up a production server for FastAPI involves:

  1. Choosing the right ASGI server (Uvicorn, Gunicorn with Uvicorn workers, or Hypercorn)
  2. Configuring appropriate worker counts and connection limits
  3. Setting up logging and monitoring
  4. Using containers for deployment consistency
  5. Implementing health checks and performance optimizations

By following these best practices, your FastAPI application will be well-equipped to handle production traffic with improved reliability, performance, and scalability.


Exercises

  1. Set up a FastAPI application with Gunicorn and Uvicorn workers using a configuration file.
  2. Create a Docker container for your FastAPI application with a production server.
  3. Implement a comprehensive health check endpoint that checks database connectivity.
  4. Configure logging to rotate log files daily and archive them after a week.
  5. Implement a load test to determine the optimal number of workers for your application.

