SQL Version Control

Introduction

Have you ever made a change to your database schema, only to realize it broke something else? Or worked on a database with multiple team members, unsure who made what changes and when? SQL version control solves these problems by providing a systematic approach to tracking and managing changes to your database structures and code.

In this guide, we'll explore how to implement version control for your SQL databases. You'll learn why it's essential, how it works, and practical approaches to integrate it into your workflow. Whether you're working alone or as part of a team, these practices will help you maintain a more stable, reliable database environment.

Why Version Control for SQL?

Traditional software development has long embraced version control systems like Git for tracking code changes. However, databases often remain outside this workflow, leading to several problems:

Untracked changes: Without version control, it's difficult to know who changed what and when
Deployment challenges: Moving database changes between environments becomes error-prone
Collaboration difficulties: Multiple developers can inadvertently overwrite each other's changes
Recovery problems: Restoring to a previous state becomes nearly impossible

By implementing SQL version control, you can:

Track all changes to your database schema and stored procedures
Roll back to previous versions when needed
Collaborate effectively with other developers
Deploy changes consistently across different environments
Maintain a comprehensive history of your database evolution

Getting Started with SQL Version Control

Approach 1: Migration Scripts

The most common approach to SQL version control is using migration scripts (also called change scripts). These are SQL files that represent incremental changes to your database.

How It Works

Each change to the database is written as a SQL script
Scripts are numbered sequentially and stored in your version control system
A separate table in your database tracks which scripts have been applied
When deploying, only the new scripts are executed

Let's see a simple example:

-- 001_create_users_table.sql
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(100) NOT NULL UNIQUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Add a comment for documentation
COMMENT ON TABLE users IS 'Stores user account information';

-- 002_add_user_status.sql
ALTER TABLE users 
ADD COLUMN status VARCHAR(20) DEFAULT 'active' NOT NULL;

-- Add an index for faster lookups by status
CREATE INDEX idx_user_status ON users(status);

To track which migrations have been applied, you'll need a version table:

-- 000_create_version_table.sql
CREATE TABLE schema_versions (
    version_number INT PRIMARY KEY,
    script_name VARCHAR(100) NOT NULL,
    applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    applied_by VARCHAR(50) NOT NULL
);

When you apply a migration, you also record it:

-- After applying migration 001
INSERT INTO schema_versions (version_number, script_name, applied_by) 
VALUES (1, '001_create_users_table.sql', 'developer1');

Approach 2: State-Based Version Control

Another approach is state-based version control, where you describe the desired final state of your database rather than the steps to get there.

How It Works

You maintain SQL files that describe how your database should look
A tool compares your description with the actual database
The tool generates the necessary changes to align the database with your description

Here's an example of a state description:

-- users_table.sql (current state description)
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(100) NOT NULL UNIQUE,
    status VARCHAR(20) DEFAULT 'active' NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_user_status ON users(status);
COMMENT ON TABLE users IS 'Stores user account information';

Tools for SQL Version Control

Several tools can help you implement version control for SQL:

1. Flyway

Flyway is a popular migration-based database version control tool that works with most major databases.

Example usage:

First, create a migration script:

-- V1__Create_users_table.sql
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(100) NOT NULL UNIQUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Then run Flyway to apply it:

# Command line execution
flyway -url=jdbc:mysql://localhost:3306/mydb -user=root migrate

Flyway automatically creates a table called flyway_schema_history to track applied migrations.

Output:

Flyway Community Edition 8.5.13 by Redgate
Database: jdbc:mysql://localhost:3306/mydb (MySQL 8.0)
Successfully validated 1 migration
Creating Schema History table `mydb`.`flyway_schema_history` ...
Current version of schema `mydb`: << Empty Schema >>
Migrating schema `mydb` to version 1 - Create users table
Successfully applied 1 migration to schema `mydb` (execution time 00:00.124s)

2. Liquibase

Liquibase is another popular tool that supports both migration-based and state-based approaches.

Example usage:

Create a changelog file:

<!-- changelog.xml -->
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
                        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.9.xsd">

    <changeSet id="1" author="developer1">
        <createTable tableName="users">
            <column name="id" type="int" autoIncrement="true">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="username" type="varchar(50)">
                <constraints nullable="false" unique="true"/>
            </column>
            <column name="email" type="varchar(100)">
                <constraints nullable="false" unique="true"/>
            </column>
            <column name="created_at" type="timestamp" defaultValueComputed="CURRENT_TIMESTAMP"/>
        </createTable>
    </changeSet>
</databaseChangeLog>

Run Liquibase to apply the changes:

# Command line execution
liquibase --changeLogFile=changelog.xml update

3. SQL Source Control

For those using SQL Server Management Studio, SQL Source Control integrates directly with your IDE.

4. Sqitch

Sqitch is a database-agnostic version control system that doesn't require a schema table.

Integrating with Git

To fully leverage version control, you should store your SQL scripts or state descriptions in a Git repository. Here's a typical workflow:

Create a dedicated directory for database changes in your repository
Follow a consistent naming convention for your migration scripts
Never modify a script that has been committed and potentially applied
Use branches for experimental database changes

Here's an example project structure:

my-project/
├── src/
│   └── ...
├── db/
│   ├── migrations/
│   │   ├── 001_create_users_table.sql
│   │   ├── 002_add_user_status.sql
│   │   └── 003_create_posts_table.sql
│   └── scripts/
│       ├── apply_migrations.sh
│       └── rollback_last_migration.sh
└── README.md

Database Change Management Workflow

Let's look at a typical workflow for managing database changes:

Best Practices

Always script changes: Never make direct changes to the database schema
One change per script: Keep migrations focused on a single logical change
Make migrations reversible: Include both "up" and "down" scripts
Test before committing: Verify your migrations work as expected
Use meaningful names: Name your scripts according to their purpose
Include documentation: Comment your scripts to explain the why, not just the how

Example of a Reversible Migration

-- 004_add_last_login_column.sql

-- Up migration
ALTER TABLE users
ADD COLUMN last_login TIMESTAMP NULL;

-- Down migration (commented for safety, to be used with rollback tools)
/*
ALTER TABLE users
DROP COLUMN last_login;
*/

Real-World Example: Building a Blog Platform

Let's walk through a complete example of using migration scripts to develop a simple blog platform database:

Step 1: Initial User Table

-- 001_create_users_table.sql
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(100) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Step 2: Add Posts Table

-- 002_create_posts_table.sql
CREATE TABLE posts (
    id INT PRIMARY KEY AUTO_INCREMENT,
    user_id INT NOT NULL,
    title VARCHAR(200) NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(id)
);

Step 3: Add Comments Feature

-- 003_create_comments_table.sql
CREATE TABLE comments (
    id INT PRIMARY KEY AUTO_INCREMENT,
    post_id INT NOT NULL,
    user_id INT NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (post_id) REFERENCES posts(id),
    FOREIGN KEY (user_id) REFERENCES users(id)
);

Step 4: Add User Profile Information

-- 004_add_user_profile_columns.sql
ALTER TABLE users
ADD COLUMN full_name VARCHAR(100) NULL,
ADD COLUMN bio TEXT NULL,
ADD COLUMN profile_image_url VARCHAR(255) NULL;

Step 5: Applying the Changes

To apply these changes, you might use a tool like Flyway:

# Apply all migrations
flyway -url=jdbc:mysql://localhost:3306/blog_db -user=dev migrate

Or a custom script:

# apply_migrations.py
import mysql.connector
import os
import re

def get_applied_versions(cursor):
    cursor.execute("CREATE TABLE IF NOT EXISTS schema_versions (version INT PRIMARY KEY, script_name VARCHAR(100), applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)")
    cursor.execute("SELECT version FROM schema_versions ORDER BY version")
    return [row[0] for row in cursor.fetchall()]

def apply_migrations(conn, cursor, migrations_folder):
    applied = get_applied_versions(cursor)
    
    # Get all migration files
    migration_files = [f for f in os.listdir(migrations_folder) if f.endswith('.sql')]
    
    for file in sorted(migration_files):
        # Extract version from filename (e.g., 001_create_users_table.sql → 1)
        match = re.match(r'(\d+)_.*\.sql', file)
        if not match:
            continue
            
        version = int(match.group(1))
        
        # Skip already applied migrations
        if version in applied:
            print(f"Skipping migration {file} (already applied)")
            continue
            
        print(f"Applying migration {file}...")
        
        # Read and execute the migration
        with open(os.path.join(migrations_folder, file), 'r') as f:
            sql = f.read()
            cursor.execute(sql)
            
        # Record the migration
        cursor.execute("INSERT INTO schema_versions (version, script_name) VALUES (%s, %s)", (version, file))
        conn.commit()
        
        print(f"Successfully applied migration {file}")

# Connect and apply migrations
conn = mysql.connector.connect(host='localhost', database='blog_db', user='dev', password='password')
cursor = conn.cursor()
apply_migrations(conn, cursor, './migrations')
cursor.close()
conn.close()

Common Challenges and Solutions

Challenge 1: Handling Data Migrations

Schema changes are relatively straightforward, but moving or transforming existing data requires special care.

Solution: Create separate migration scripts specifically for data:

-- 005_migrate_user_names_to_profile.sql

-- First, ensure the required columns exist
ALTER TABLE users
ADD COLUMN IF NOT EXISTS full_name VARCHAR(100) NULL;

-- Then migrate the data
UPDATE users
SET full_name = username
WHERE full_name IS NULL;

Challenge 2: Managing Dependencies Between Migrations

Sometimes, migrations depend on each other in complex ways.

Solution: Use a version control tool that supports dependencies, or carefully document prerequisites in your migration scripts:

-- 006_create_user_roles.sql
-- Depends on: 001_create_users_table.sql

CREATE TABLE user_roles (
    user_id INT NOT NULL,
    role VARCHAR(20) NOT NULL,
    PRIMARY KEY (user_id, role),
    FOREIGN KEY (user_id) REFERENCES users(id)
);

Challenge 3: Production Deployments

Deploying changes to production brings additional challenges around downtime and risk.

Solution: Follow these practices for production deployments:

Always back up the database before applying migrations
Test migrations thoroughly in staging environments
Schedule migrations during low-traffic periods
Use transactions when possible to allow rollbacks
Consider using tools that support zero-downtime migrations

Summary

SQL version control is an essential practice for modern database development. By implementing a structured approach to tracking and managing database changes, you can:

Maintain a complete history of database schema evolution
Collaborate effectively within development teams
Safely deploy changes across multiple environments
Recover from problematic changes when necessary

Whether you choose a migration-based approach with tools like Flyway and Liquibase, or a state-based approach with schema comparison tools, the most important step is to establish a consistent process that works for your team and project.

Remember these key principles:

Never make direct changes to production databases
Always script and version all database changes
Test thoroughly before deploying
Keep migrations small and focused
Document your schema changes

Additional Resources

Tools

Learning Resources

"Database Refactoring" by Scott Ambler and Pramod Sadalage
"Refactoring Databases: Evolutionary Database Design" by Scott Ambler
"Continuous Delivery" by Jez Humble and David Farley (Chapter on Database Migration)

Practice Exercises

Basic Migration Practice: Create a series of migration scripts for a simple e-commerce database with customers, products, and orders tables.
Tool Exploration: Compare Flyway and Liquibase by implementing the same database changes with both tools.
Team Workflow Challenge: Set up a Git repository with database migrations and create a workflow document for your team to follow when making database changes.
Advanced Scenario: Create a migration that splits a full_address column into separate street, city, state, and zip_code columns, while preserving the existing data.
State-Based Practice: Create a state-based schema representation using your tool of choice, and practice generating differential scripts against various database states.

If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)

Introduction​

Why Version Control for SQL?​

Getting Started with SQL Version Control​

Approach 1: Migration Scripts​

How It Works​

Approach 2: State-Based Version Control​

How It Works​

Tools for SQL Version Control​

1. Flyway​

2. Liquibase​

3. SQL Source Control​

4. Sqitch​

Integrating with Git​

Database Change Management Workflow​

Best Practices​

Example of a Reversible Migration​

Real-World Example: Building a Blog Platform​

Step 1: Initial User Table​

Step 2: Add Posts Table​

Step 3: Add Comments Feature​

Step 4: Add User Profile Information​

Step 5: Applying the Changes​

Common Challenges and Solutions​

Challenge 1: Handling Data Migrations​

Challenge 2: Managing Dependencies Between Migrations​

Challenge 3: Production Deployments​

Summary​

Additional Resources​

Tools​

Learning Resources​

Practice Exercises​

Introduction

Why Version Control for SQL?

Getting Started with SQL Version Control

Approach 1: Migration Scripts

How It Works

Approach 2: State-Based Version Control

How It Works

Tools for SQL Version Control

1. Flyway

2. Liquibase

3. SQL Source Control

4. Sqitch

Integrating with Git

Database Change Management Workflow

Best Practices

Example of a Reversible Migration

Real-World Example: Building a Blog Platform

Step 1: Initial User Table

Step 2: Add Posts Table

Step 3: Add Comments Feature

Step 4: Add User Profile Information

Step 5: Applying the Changes

Common Challenges and Solutions

Challenge 1: Handling Data Migrations

Challenge 2: Managing Dependencies Between Migrations

Challenge 3: Production Deployments

Summary

Additional Resources

Tools

Learning Resources

Practice Exercises