PostgreSQL Streaming Replication
Introduction
PostgreSQL streaming replication is a powerful feature that allows you to create and maintain an up-to-date copy (or multiple copies) of your database on separate servers. This technology forms the backbone of many high-availability and disaster recovery solutions in PostgreSQL environments.
In this guide, we'll explore how PostgreSQL streaming replication works, why you might want to use it, and how to set it up step by step. By the end, you'll understand how to implement a robust replication system that can help protect your data and improve your application's reliability.
What is Streaming Replication?
Streaming replication is a built-in PostgreSQL feature that creates an exact copy of a database from a primary (formerly called "master") server to one or more replica (formerly called "slave") servers. It works by continuously sending a stream of Write-Ahead Log (WAL) records from the primary to the replicas.
Key Benefits
- High Availability: If your primary server fails, a replica can be promoted to become the new primary, minimizing downtime.
- Load Balancing: Read queries can be distributed across replicas, reducing the load on the primary server.
- Data Protection: Having multiple copies of your database protects against data loss due to hardware failure.
- Geographic Distribution: Replicas can be placed in different geographic locations for disaster recovery.
Prerequisites
Before setting up streaming replication, ensure you have:
- At least two PostgreSQL servers (same version) installed
- Network connectivity between the servers
- Sufficient disk space on the replica servers
- Administrative access to both servers
Setting Up Streaming Replication
Let's walk through the process of setting up streaming replication between two PostgreSQL servers:
Step 1: Configure the Primary Server
First, we need to modify the PostgreSQL configuration on the primary server to enable replication. Edit the postgresql.conf
file:
# In postgresql.conf on the primary server
listen_addresses = '*' # Listen on all available IP addresses
max_wal_senders = 10 # Maximum number of concurrent replication connections
wal_level = replica # Minimum WAL information needed for replication
max_replication_slots = 10 # Maximum number of replication slots
Next, edit the pg_hba.conf
file to allow replication connections from the replica:
# In pg_hba.conf on the primary server
# Allow replication connections from replica server (replace with your replica's IP)
host replication replicator 192.168.1.2/32 md5
Now restart PostgreSQL to apply the changes:
sudo systemctl restart postgresql
Step 2: Create a Replication User
On the primary server, create a dedicated user for replication:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';