MongoDB Schema Evolution
In traditional relational databases, changing your schema often requires complex migrations and, in some cases, downtime. One of MongoDB's greatest strengths is its flexible schema, which allows documents in a collection to have different structures. This flexibility makes schema evolution easier, but it still takes planning: application code must cope with every document shape that exists in the collection at a given time.
Understanding Schema Evolution
Schema evolution refers to the process of changing your data model over time as your application requirements change. In MongoDB, this can happen without requiring a strict migration of all documents at once.
Why Schema Evolution Matters
As applications grow and requirements change, your data model needs to adapt. Common reasons for schema evolution include:
- Adding new features requiring additional fields
- Improving performance by restructuring data
- Fixing design flaws in the original schema
- Combining or splitting fields for better organization
- Adapting to changing business requirements
Schema Evolution Strategies in MongoDB
Several strategies let you evolve a MongoDB schema without disrupting your application. Most of them are patterns you implement in application code rather than features of the database itself.
1. Schema Versioning
A common approach is to include a version field in your documents to track which schema version they conform to.
// Original schema (version 1)
{
  "_id": ObjectId("5f8d5a4e7b9e7a1c8c8b4567"),
  "name": "John Doe",
  "email": "[email protected]",
  "schemaVersion": 1
}
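// Updated schema (version 2): email moves into contactInfo.
// The top-level email shown here is a transitional duplicate; the migration in section 2 removes it.
{
  "_id": ObjectId("5f8d5a4e7b9e7a1c8c8b4568"),
  "name": "Jane Smith",
  "email": "[email protected]",
  "contactInfo": {
    "email": "[email protected]",
    "phone": "555-123-4567"
  },
  "schemaVersion": 2
}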
Your application code can then handle different schema versions appropriately:
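// Return contact details regardless of which schema version the document uses.
function getContactInfo(user) {
  // Assumption: documents without a schemaVersion field predate versioning (v1).
  const version = user.schemaVersion ?? 1;
  if (version === 1) {
    return { email: user.email };
  } else if (version === 2) {
    return user.contactInfo;
  }
  // Fail loudly instead of silently returning undefined.
  throw new Error(`Unsupported schemaVersion: ${version}`);
}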
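When more than two versions exist, branching on every version in every reader gets unwieldy. One possible alternative, sketched below, is to normalize documents to the latest shape as they are read, using a chain of small upgrade steps. The names `upgraders`, `LATEST_VERSION`, and `upgradeToLatest` are hypothetical, and treating a missing `schemaVersion` as version 1 is an assumption of this sketch.

```javascript
// Hypothetical upgrade steps, keyed by the version they upgrade FROM.
// Each step returns a new object and leaves its input untouched.
const upgraders = {
  1: (user) => {
    const { email, ...rest } = user;
    return { ...rest, contactInfo: { email }, schemaVersion: 2 };
  },
};

const LATEST_VERSION = 2;

// Apply upgrade steps in order until the document reaches LATEST_VERSION.
// Assumption: documents without a schemaVersion field are version 1.
function upgradeToLatest(doc) {
  let current = { ...doc, schemaVersion: doc.schemaVersion ?? 1 };
  while (current.schemaVersion < LATEST_VERSION) {
    const step = upgraders[current.schemaVersion];
    if (!step) {
      throw new Error(`No upgrade path from schemaVersion ${current.schemaVersion}`);
    }
    current = step(current);
  }
  return current;
}
```

With this in place, the rest of the application can read `upgradeToLatest(user).contactInfo` and only ever deal with the newest shape. Adding a version 3 would mean adding one more entry to `upgraders` and bumping `LATEST_VERSION`.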
2. The Schema Migration Pattern
For larger changes, you might need to perform a migration. This can be done gradually in the background:
// Migration script to update from v1 to v2
db.users.find({ schemaVersion: 1 }).forEach(function(user) {
  db.users.updateOne(
    { _id: user._id },
    {
      $set: {
        contactInfo: { email: user.email },
        schemaVersion: 2
      },
      $unset: { email: "" }
    }
  );
});
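Issuing one update per document means one round trip per document. For large collections, a common variation is to group the updates into batches. The sketch below only builds the batches; `buildMigrationOps` and `chunk` are hypothetical helper names, and the batch size is left to the caller.

```javascript
// Build one bulkWrite operation per v1 user. The filter repeats
// schemaVersion: 1 so a document that another process has already
// migrated is skipped instead of being overwritten.
function buildMigrationOps(users) {
  return users.map((user) => ({
    updateOne: {
      filter: { _id: user._id, schemaVersion: 1 },
      update: {
        $set: { contactInfo: { email: user.email }, schemaVersion: 2 },
        $unset: { email: "" },
      },
    },
  }));
}

// Split an array into chunks of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```

Each chunk of operations could then be sent with `db.users.bulkWrite(ops, { ordered: false })`, pausing between batches if the migration needs to be throttled.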
3. Dual-Write Pattern
During a transition period, your application can write to both old and new fields, so that code reading either schema version sees current data:
// When updating a user's email
function updateUserEmail(userId, newEmail) {
  return db.users.updateOne(
    { _id: userId },
    {
      $set: {
        email: newEmail, // For v1 schema compatibility
        "contactInfo.email": newEmail // For v2 schema
      }
    }
  );
}
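Dual writes cover the write path. On the read path, code that has already moved to the new field may still meet documents that were never rewritten. A minimal sketch of a tolerant reader follows; `readUserEmail` is a hypothetical helper, and returning `null` when no email is present is a choice made for this example.

```javascript
// Prefer the v2 location and fall back to the v1 field.
// Returns null when neither field holds an email.
function readUserEmail(user) {
  return user.contactInfo?.email ?? user.email ?? null;
}
```

Once every document carries `contactInfo.email`, the fallback and the write to the old `email` field can both be removed.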
4. On-Demand Migration
Convert documents only when they're accessed:
async function getUser(userId) {
  const user = await db.users.findOne({ _id: userId });
  // findOne resolves to null when no document matches
  if (!user) {
    return null;
  }
  // If it's an old version, update it when we read it
  if (user.schemaVersion === 1) {
    await db.users.updateOne(
      { _id: user._id },
      {
        $set: {
          contactInfo: { email: user.email },
          schemaVersion: 2
        },
        $unset: { email: "" }
      }
    );
    // Mirror the stored change on the in-memory copy
    user.contactInfo = { email: user.email };
    user.schemaVersion = 2;
    delete user.email;
  }
  return user;
}
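Because on-demand migration runs inside request handling, two requests can read the same v1 document at nearly the same time and both try to upgrade it. One way to keep the second write harmless, sketched here under the assumption that the update is sent separately by the caller, is to compute the change as plain data and guard the filter with the old version. `planUserUpgrade` is a hypothetical name.

```javascript
// Describe the v1 -> v2 change for one user without touching the database.
// Returns null when the document does not need upgrading.
function planUserUpgrade(user) {
  if (user.schemaVersion !== 1) return null;
  const { email, ...rest } = user;
  return {
    // Guarded filter: matches only while the stored document is still v1,
    // so a late second writer cannot overwrite a newer contactInfo.
    filter: { _id: user._id, schemaVersion: 1 },
    update: {
      $set: { contactInfo: { email }, schemaVersion: 2 },
      $unset: { email: "" },
    },
    // The upgraded in-memory copy the caller can return right away.
    upgraded: { ...rest, contactInfo: { email }, schemaVersion: 2 },
  };
}
```

A caller would pass `plan.filter` and `plan.update` to `updateOne` and return `plan.upgraded`. Keeping the transformation free of database calls also makes it straightforward to unit test.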
Real-World Example: E-commerce Product Catalog
Let's walk through a real-world example of evolving a product catalog schema over time.
Initial Schema (v1)
{
  "_id": ObjectId("5f8d5a4e7b9e7a1c8c8b4567"),
  "name": "Wireless Headphones",
  "price": 79.99,
  "category": "Electronics",
  "inStock": true,
  "schemaVersion": 1
}
Business Requirements Change
The business now needs to:
- Support multiple price points based on region
- Track inventory counts instead of a simple boolean
- Support product variants (color, size)
Updated Schema (v2)
{
  "_id": ObjectId("5f8d5a4e7b9e7a1c8c8b4567"),
  "name": "Wireless Headphones",
  "pricing": {
    "US": 79.99,
    "EU": 89.99,
    "Asia": 75.99
  },
  "category": "Electronics",
  "inventory": {
    "total": 328,
    "warehouses": {
      "east": 122,
      "west": 206
    }
  },
  "variants": [
    {
      "color": "Black",
      "sku": "WH-BLK-001"
    },
    {
      "color": "White",
      "sku": "WH-WHT-001"
    }
  ],
  "schemaVersion": 2
}
Migration Approach
Here's how we might handle this migration:
- Update the application to handle both schema versions
- Create a background migration task:
// Migration function (assumes the Node.js driver, where cursors are async iterables)
async function migrateProductsToV2() {
  const cursor = db.products.find({ schemaVersion: 1 });
  let migratedCount = 0;
  for await (const product of cursor) {
    const update = {
      $set: {
        // Regional multipliers are illustrative placeholders, not real pricing rules.
        pricing: {
          US: product.price,
          EU: parseFloat((product.price * 1.1).toFixed(2)),
          Asia: parseFloat((product.price * 0.95).toFixed(2))
        },
        // v1 stores only a boolean, so this count is a random placeholder;
        // a real migration would load actual counts from an inventory source.
        inventory: {
          total: product.inStock ? Math.floor(Math.random() * 500) + 100 : 0,
          warehouses: {
            east: 0,
            west: 0
          }
        },
        variants: [{
          color: "Standard",
          sku: `${product.name.substring(0,2).toUpperCase()}-STD-001`
        }],
        schemaVersion: 2
      },
      $unset: {
        price: "",
        inStock: ""
      }
    };
    // Await each write so errors surface and the count reflects completed updates.
    await db.products.updateOne({ _id: product._id }, update);
    migratedCount++;
    if (migratedCount % 1000 === 0) {
      console.log(`Migrated ${migratedCount} products...`);
    }
  }
  console.log(`Migration complete. Total migrated: ${migratedCount}`);
}
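The first step in the approach above, updating the application to handle both schema versions, can be sketched with a pair of version-aware readers. `getPrice` and `isInStock` are hypothetical helpers. Treating the single v1 price as valid for every region is an assumption of this sketch, not a business rule.

```javascript
// Price for a region. v1 documents carry a single price, which this sketch
// returns for every region (assumption). Returns undefined if a v2 document
// has no entry for the requested region.
function getPrice(product, region = "US") {
  if (product.schemaVersion === 2) {
    return product.pricing?.[region];
  }
  return product.price;
}

// Stock check that works for the v1 boolean and the v2 inventory counts.
function isInStock(product) {
  if (product.schemaVersion === 2) {
    return (product.inventory?.total ?? 0) > 0;
  }
  return Boolean(product.inStock);
}
```

With readers like these deployed first, the background migration can run at its own pace, since the application behaves the same whether a given product has been migrated yet or not.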