MongoDB Oplog

Introduction

The MongoDB Oplog (Operations Log) is one of the most important components of MongoDB's replication mechanism. It's a special capped collection that records all operations that modify data in your MongoDB deployment. When you run MongoDB as a replica set, the primary node records all write operations in the oplog, and secondary nodes continuously copy and apply these operations to maintain data consistency across the replica set.

Understanding the oplog is essential for administrators and developers working with MongoDB replication, as it's the backbone that makes data synchronization possible across multiple nodes in a distributed MongoDB system.

What is the Oplog?

The oplog (short for "Operations Log") is:

A capped collection - A collection with a fixed size that works like a circular buffer
Located in the local database on each replica set member
Named oplog.rs
The mechanism that enables replication in MongoDB

When operations occur on the primary node that modify data (inserts, updates, deletes), these operations are recorded in the oplog. Secondary nodes then copy and replay these operations to stay in sync with the primary.

Oplog Structure

Let's examine what an oplog entry looks like. Each document in the oplog collection represents an operation and has the following structure:

{
  "ts" : Timestamp(1610000000, 1),     // Timestamp when the operation occurred
  "t" : NumberLong(1),                 // Term (used for election and replication)
  "h" : NumberLong("123456789"),       // Unique identifier for this operation
  "v" : 2,                             // Version of the oplog entry format
  "op" : "i",                          // Operation type: i=insert, u=update, d=delete, c=command, n=no-op
  "ns" : "mydb.mycollection",          // Namespace (database.collection)
  "o" : { ... },                       // Document being inserted/updated or the query for delete
  "o2" : { ... }                       // For updates: the query to select documents to update
}

The most common operation types you'll encounter in the oplog are:

i: Insert operation
u: Update operation
d: Delete operation
c: Command operation (like create collection, drop index)
n: No-op (does nothing, often used for internal purposes)

Examining the Oplog

To connect to your MongoDB's oplog and examine its contents, you can use the MongoDB shell:

// Connect to the local database which contains the oplog
use local

// Check the oplog.rs collection size and other details
db.oplog.rs.stats()

// View the most recent operations in the oplog
db.oplog.rs.find().sort({$natural: -1}).limit(5).pretty()

Example output:

// Example output from db.oplog.rs.stats()
{
  "capped": true,
  "size": 3753410560,  // ~3.5GB size
  "storageSize": 3753410560,
  "totalSize": 3753410560,
  "max": NumberLong("9223372036854775807"),
  "ok": 1
}

// Example output from oplog query
{
  "ts": Timestamp(1644851005, 1),
  "t": NumberLong(3),
  "h": NumberLong("6704921418682442802"),
  "v": 2,
  "op": "i",
  "ns": "inventory.products",
  "o": {
    "_id": ObjectId("620a2c6d4f9e7a9b1c3d5e7f"),
    "name": "Smartphone",
    "price": 599.99,
    "stock": 120
  }
}

Oplog Sizing

One of the most critical aspects of the MongoDB oplog is its size. By default, MongoDB allocates:

Approximately 5% of available disk space for the oplog on Linux and Windows
Approximately 1GB on 64-bit macOS

You can check the current oplog size using:

// Get oplog info including size and time range
db.printReplicationInfo()

Example output:

configured oplog size:   3557.05MB
log length start to end: 6403secs (1.78hrs)
oplog first event time:  Tue Feb 15 2022 10:30:02 GMT+0000
oplog last event time:   Tue Feb 15 2022 12:18:05 GMT+0000
now:                     Tue Feb 15 2022 12:20:15 GMT+0000

Why Oplog Size Matters

The oplog size directly impacts:

Replication Window: How far back in time a secondary can resync from
Resilience: Ability to recover after temporary network issues
Maintenance: Time available for taking nodes offline for maintenance

If the oplog is too small, operations might be overwritten before secondaries can replicate them, requiring a full resync (which is expensive and time-consuming).

Configuring Oplog Size

You can configure the oplog size when initiating a replica set or by resizing an existing oplog.

Setting Oplog Size on Replica Set Initialization

When starting a MongoDB instance that will be part of a replica set:

mongod --replSet myReplicaSet --oplogSize 8192 # Size in MB

Resizing an Existing Oplog

Resizing an existing oplog requires careful steps:

// Check current oplog size
use local
db.oplog.rs.stats().maxSize

// Resize oplog (must be executed on each member separately)
// 1. Stop the secondary
// 2. Restart in standalone mode
// 3. Connect and run:
use local
db.adminCommand({replSetResizeOplog: 1, size: 16384}) // 16GB

// 4. Restart as replica set member

Oplog Maintenance and Monitoring

Regular monitoring of your oplog is essential for a healthy replica set:

Checking Oplog Status

// Check how much of the oplog is used
db.printReplicationInfo()

// For more detailed information
db.serverStatus().oplogTruncation

Important Metrics to Monitor

Oplog Window Length: The time span covered by the oplog
Replication Lag: How far behind secondaries are from the primary
Oplog Operations Mix: Types of operations in your oplog

A common monitoring pattern is to alert if the oplog window drops below a certain threshold (e.g., 24 hours).

Real-World Applications

Use Case 1: Disaster Recovery Planning

// Calculate average operations per second
let stats = db.oplog.rs.stats();
let oplogSizeBytes = stats.maxSize;
let oplogUsedBytes = stats.size;
let firstOplogEntry = db.oplog.rs.find().sort({$natural: 1}).limit(1).next().ts;
let lastOplogEntry = db.oplog.rs.find().sort({$natural: -1}).limit(1).next().ts;
let oplogTimeSpanSeconds = (lastOplogEntry.getTime() - firstOplogEntry.getTime()) / 1000;

// If you have the following operations in the oplog:
let totalOps = db.oplog.rs.count(); 

// Average ops per second
let opsPerSecond = totalOps / oplogTimeSpanSeconds;
print(`Average operations per second: ${opsPerSecond}`);
print(`Estimated oplog capacity in hours: ${oplogTimeSpanSeconds / 3600}`);

This information helps you plan:

How much downtime your secondaries can handle
When to schedule maintenance
What size your oplog should be

Use Case 2: Custom Tailing for Change Data Capture (CDC)

You can create a script that "tails" the oplog to capture data changes for external systems:

// Store last processed timestamp
let lastTs = Timestamp(0, 0);

// Function to process new oplog entries
function tailOplog() {
    let cursor = db.oplog.rs.find({ts: {$gt: lastTs}}).sort({$natural: 1}).addOption(DBQuery.Option.tailable);
    
    while (cursor.hasNext()) {
        let doc = cursor.next();
        lastTs = doc.ts;
        
        // Process operations (inserts, updates, deletes)
        if (doc.op === 'i' && doc.ns === 'mydb.customers') {
            print(`New customer added: ${JSON.stringify(doc.o)}`);
            // Here you could send to another system
        }
    }
}

// Call this function in a loop to continuously process oplog

This pattern is the foundation for MongoDB change streams and many ETL processes.

Use Case 3: Investigating Data Modifications

During incident response, the oplog can help you trace what happened to your data:

// Find recent operations on a specific collection
db.oplog.rs.find(
    {
        ns: "mydb.important_collection",
        ts: {
            $gt: Timestamp(Math.floor(new Date('2023-03-01').getTime()/1000), 0),
            $lt: Timestamp(Math.floor(new Date('2023-03-02').getTime()/1000), 0)
        }
    }
).sort({ts: 1})

This allows you to see all operations performed on a collection during a specific timeframe.

Common Issues and Troubleshooting

Issue 1: Secondary Falls Too Far Behind

If a secondary node stops replicating with the error "oplog query failed: timestamp is outside oplog bounds", it means the oplog entries it needs were already overwritten.

Solution:

Increase oplog size to prevent future occurrences
Perform a full resync of the secondary:

// On the secondary that's falling behind
rs.syncFrom("primary_hostname:port")

// If that doesn't work, you may need to resync from scratch
rs.remove("problem_secondary_hostname:port")
// Then add it back
rs.add("problem_secondary_hostname:port")

Issue 2: Slow Oplog Apply Rate

If secondaries are consistently falling behind:

// Check replication status
rs.printSecondaryReplicationInfo()

// Sample output:
// source: secondary1.example.com:27017
// syncedTo: Mon Feb 14 2022 12:10:11 GMT+0000 (1 mins ago)
// 0 secs (0 hrs) behind the primary

Potential Solutions:

Increase resources on secondary nodes
Check for slow operations (using db.currentOp())
Optimize indexes on secondary nodes
Consider changing read preferences to distribute read load

Summary

The MongoDB oplog is a fundamental component that enables replication in MongoDB. It's a capped collection that records all write operations on the primary node, which secondary nodes then copy and apply to stay in sync. Key points to remember:

The oplog is a capped collection in the local database named oplog.rs
Its size is critical for determining how long secondaries can be offline
Operations are recorded in CRUD format (insert, update, delete, etc.)
Monitoring the oplog window is essential for system health

Understanding and properly configuring the oplog is crucial for maintaining a healthy MongoDB replica set, ensuring data consistency, and planning for maintenance and disaster recovery.

Additional Resources

MongoDB Documentation: MongoDB Oplog
MongoDB University: M103: Basic Cluster Administration

Exercises

Connect to your MongoDB replica set and check the current oplog size and time window.
Write a script that monitors oplog usage and alerts if the window drops below 24 hours.
Create a simple program that tails the oplog and logs all insert operations to a specific collection.
Calculate the average write operations per second in your MongoDB deployment using oplog data.
Practice resizing the oplog in a test environment and observe how it affects the replication window.

If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)

Introduction​

What is the Oplog?​

Oplog Structure​

Examining the Oplog​

Oplog Sizing​

Why Oplog Size Matters​

Configuring Oplog Size​

Setting Oplog Size on Replica Set Initialization​

Resizing an Existing Oplog​

Oplog Maintenance and Monitoring​

Checking Oplog Status​

Important Metrics to Monitor​

Real-World Applications​

Use Case 1: Disaster Recovery Planning​

Use Case 2: Custom Tailing for Change Data Capture (CDC)​

Use Case 3: Investigating Data Modifications​

Common Issues and Troubleshooting​

Issue 1: Secondary Falls Too Far Behind​

Issue 2: Slow Oplog Apply Rate​

Summary​

Additional Resources​

Exercises​