Skip to main content

Django Multi-database Setup

Introduction

In most Django applications, using a single database is sufficient. However, as your application grows or faces specific requirements, you might need to work with multiple databases. Django provides robust support for connecting to and managing multiple databases within the same project.

This guide will walk you through the process of setting up multiple databases in Django, managing database routing, performing cross-database queries, and implementing practical multi-database patterns.

Why Use Multiple Databases?

Before diving into implementation, it's important to understand when and why you might need multiple databases:

  • Data segregation: Keeping different types of data separate (e.g., user data vs. analytics data)
  • Read/write splitting: Using read replicas for queries and a primary database for writes
  • Legacy integration: Connecting to existing databases while maintaining a separate database for new features
  • Sharding: Distributing data across multiple databases for scalability
  • Multi-tenancy: Separate databases for different clients or tenants

Basic Multi-database Configuration

Step 1: Define Database Connections

Django configures databases in the settings.py file. For multiple databases, you'll expand the DATABASES dictionary to include additional connections:

python
# settings.py
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql',
'NAME': 'primary_db',
'USER': 'postgres',
'PASSWORD': 'password',
'HOST': 'localhost',
'PORT': '5432',
},
'analytics': {
'ENGINE': 'django.db.backends.postgresql',
'NAME': 'analytics_db',
'USER': 'postgres',
'PASSWORD': 'password',
'HOST': 'localhost',
'PORT': '5432',
},
'archive': {
'ENGINE': 'django.db.backends.mysql',
'NAME': 'archive_db',
'USER': 'mysql_user',
'PASSWORD': 'mysql_password',
'HOST': 'localhost',
'PORT': '3306',
}
}

In this example, we've defined three database connections:

  • default: The primary database (required)
  • analytics: A PostgreSQL database for analytics data
  • archive: A MySQL database for archiving old data

Step 2: Run Migrations for All Databases

By default, Django migrations run only on the default database. To apply migrations to additional databases, use the --database option with the migrate command:

bash
# Apply migrations to the default database
python manage.py migrate

# Apply migrations to the analytics database
python manage.py migrate --database=analytics

# Apply migrations to the archive database
python manage.py migrate --database=archive

Database Routers

Database routers determine which database should be used for read and write operations. They help Django decide:

  • Which database to use for reading data (e.g., SELECT statements)
  • Which database to use for writing data (e.g., INSERT, UPDATE, DELETE)
  • Which database to use for migrations
  • Whether relationships should be allowed between models across different databases

Creating a Custom Database Router

Create a new file called routers.py in your project directory:

python
# myproject/routers.py
class AnalyticsRouter:
"""
Router to send analytics-related models to the analytics database
"""
analytics_apps = {'analytics', 'reporting'}

def db_for_read(self, model, **hints):
"""
Attempts to read analytics models go to analytics_db.
"""
if model._meta.app_label in self.analytics_apps:
return 'analytics'
return None

def db_for_write(self, model, **hints):
"""
Attempts to write analytics models go to analytics_db.
"""
if model._meta.app_label in self.analytics_apps:
return 'analytics'
return None

def allow_relation(self, obj1, obj2, **hints):
"""
Relations between objects are allowed if both objects are
in the analytics app, or neither is.
"""
if (
obj1._meta.app_label in self.analytics_apps and
obj2._meta.app_label in self.analytics_apps
):
return True
elif (
obj1._meta.app_label not in self.analytics_apps and
obj2._meta.app_label not in self.analytics_apps
):
return True
return False

def allow_migrate(self, db, app_label, model_name=None, **hints):
"""
Make sure the analytics app only appears in the 'analytics_db'
database.
"""
if app_label in self.analytics_apps:
return db == 'analytics'
return None

Register the Router

After creating a router, register it in your settings.py file:

python
# settings.py
DATABASE_ROUTERS = ['myproject.routers.AnalyticsRouter']

You can register multiple routers, and Django will check them in order until one returns a database or None.

Working with Multiple Databases in Code

Specifying the Database for Queries

You can specify which database to use for queries using the using method:

python
# Get all User objects from the default database
users = User.objects.all()

# Get all User objects from the analytics database
users = User.objects.using('analytics').all()

# Get a specific user from the archive database
user = User.objects.using('archive').get(username='johndoe')

Saving Models to a Specific Database

You can save model instances to a specific database:

python
# Create a new user in the default database
user = User(username='newuser', email='[email protected]')
user.save()

# Create a new user in the analytics database
user = User(username='analytics_user', email='[email protected]')
user.save(using='analytics')

# Update a user in the archive database
user = User.objects.using('archive').get(username='johndoe')
user.email = '[email protected]'
user.save(using='archive')

Practical Multi-database Patterns

Pattern 1: Read Replicas for Scaling

A common use case is to distribute read queries across multiple database replicas while sending all write operations to a primary database:

python
# myproject/routers.py
class ReadReplicaRouter:
def db_for_read(self, model, **hints):
"""
Reads go to a randomly-chosen read replica.
"""
import random
replicas = ['replica1', 'replica2', 'replica3']
return random.choice(replicas)

def db_for_write(self, model, **hints):
"""
Writes always go to primary.
"""
return 'default'

def allow_relation(self, obj1, obj2, **hints):
"""
Relations between objects are allowed if they use the same database.
"""
db_set = {'default', 'replica1', 'replica2', 'replica3'}
return True

def allow_migrate(self, db, app_label, model_name=None, **hints):
"""
All non-replica databases should run migrations.
"""
return db == 'default'

Pattern 2: Multi-tenancy (One Database per Client)

For applications serving multiple clients or tenants, you might want to keep each client's data isolated in separate databases:

python
# myproject/routers.py
class TenantRouter:
def db_for_read(self, model, **hints):
# Get the current tenant from the request or thread local storage
from myproject.tenant import get_current_tenant
tenant = get_current_tenant()
if tenant:
return f'tenant_{tenant.id}'
return 'default'

def db_for_write(self, model, **hints):
# Same as read - route to tenant database
from myproject.tenant import get_current_tenant
tenant = get_current_tenant()
if tenant:
return f'tenant_{tenant.id}'
return 'default'

def allow_migrate(self, db, app_label, model_name=None, **hints):
# Only run migrations on the default database
# Tenant databases should be created as clones of a template
return db == 'default'

Pattern 3: Data Archiving

For applications that need to archive old data to a separate database:

python
# myproject/routers.py
class ArchiveRouter:
def db_for_read(self, model, **hints):
if model._meta.app_label == 'archive':
return 'archive'
return 'default'

def db_for_write(self, model, **hints):
if model._meta.app_label == 'archive':
return 'archive'
return 'default'

def allow_migrate(self, db, app_label, model_name=None, **hints):
if app_label == 'archive':
return db == 'archive'
return db == 'default'

Advanced Usage: Cross-Database Relationships

Django generally discourages relationships (ForeignKey, OneToOneField, ManyToManyField) between models in different databases since JOIN operations across databases aren't supported by most database backends.

If you need to reference data across databases, consider these approaches:

Option 1: Duplicate Key Values

Store the same primary key values in both databases:

python
# models.py
class User(models.Model):
username = models.CharField(max_length=100)
# Other fields...

class Meta:
app_label = 'auth'

class UserAnalytics(models.Model):
# Same ID as the user in the default database
user_id = models.IntegerField(primary_key=True)
visit_count = models.IntegerField(default=0)
last_visit = models.DateTimeField(null=True)

class Meta:
app_label = 'analytics'

Option 2: Use a Custom Field

Create a custom field that handles cross-database lookups:

python
# fields.py
class CrossDatabaseForeignKey(models.IntegerField):
def __init__(self, to, db, *args, **kwargs):
self.to = to
self.related_db = db
super().__init__(*args, **kwargs)

def get_related_object(self):
return self.to.objects.using(self.related_db).get(pk=self.value)

Performance Considerations

Working with multiple databases introduces additional complexity and potential performance issues:

  1. Connection overhead: Each database connection consumes resources
  2. Transaction management: Ensuring data consistency across multiple databases
  3. N+1 query problems: Accessing related objects across databases can lead to excessive queries

Tips for Optimizing Multi-database Setups

  1. Connection pooling: Use a connection pool like PgBouncer for PostgreSQL
  2. Batch operations: When querying multiple databases, use batch operations to reduce connection switching
  3. Caching: Implement caching for frequently accessed cross-database data
  4. Query optimization: Use select_related() and prefetch_related() where possible
  5. Consider data duplication: Sometimes it's more efficient to duplicate some data across databases than to make cross-database queries

Summary

Django's multi-database support provides powerful capabilities for scaling and organizing your data. In this guide, we've covered:

  • Basic multi-database configuration
  • Creating and registering database routers
  • Specifying which database to use for queries and writes
  • Practical patterns for multi-database setups
  • Handling relationships between models in different databases
  • Performance considerations and optimizations

By leveraging these features, you can design Django applications that scale efficiently, integrate with legacy systems, or meet complex data segregation requirements.

Additional Resources

Exercises

  1. Set up a Django project with two databases: one for user authentication and another for blog posts.
  2. Create a database router that routes user models to one database and blog models to another.
  3. Write a view that fetches data from both databases and displays it on a single page.
  4. Implement a data archiving system that moves old records from the primary database to an archive database.
  5. Create a multi-tenant application where each tenant's data is stored in a separate database.


If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)