MongoDB Diagnostics

MongoDB is a powerful NoSQL database, but like any database system, it requires proper monitoring, diagnostics, and troubleshooting to ensure optimal performance. In this guide, we'll explore various diagnostic techniques and tools to help you identify and resolve issues in your MongoDB deployments.

Introduction to MongoDB Diagnostics

Database diagnostics is the process of analyzing, monitoring, and troubleshooting database performance and operational issues. For MongoDB administrators, having strong diagnostic skills is essential for:

Identifying performance bottlenecks
Troubleshooting connection issues
Analyzing query performance
Managing resource utilization
Ensuring data consistency and integrity
Planning for scaling and growth

Essential MongoDB Diagnostic Commands

1. Server Status

The serverStatus command provides a comprehensive overview of the database instance's current state.

javascript
// Get server status
db.adminCommand({ serverStatus: 1 })

Output (partial):

json
{
  "host": "mongodb-server",
  "version": "5.0.14",
  "process": "mongod",
  "pid": 12345,
  "uptime": 86400,
  "uptimeMillis": 86400000,
  "uptimeEstimate": 86400,
  "localTime": ISODate("2023-09-15T10:00:00Z"),
  "connections": {
    "current": 25,
    "available": 975,
    "totalCreated": 1050
  },
  "globalLock": { /* lock data */ },
  "mem": { 
    "bits": 64,
    "resident": 1024,
    "virtual": 4096,
    "supported": true
  },
  "network": { /* network stats */ },
  // Other sections...
}
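The raw serverStatus document is large, so it often helps to condense it into a few headline numbers. The following is a minimal sketch, assuming a hypothetical helper named summarizeServerStatus (not part of MongoDB) and using only the fields shown in the partial output above:

```javascript
// Hypothetical helper (not part of MongoDB): condense a serverStatus document.
// In mongosh it could be called as:
//   summarizeServerStatus(db.adminCommand({ serverStatus: 1 }))
function summarizeServerStatus(status) {
  const { current, available } = status.connections;
  const limit = current + available; // approximates the connection ceiling
  return {
    host: status.host,
    version: status.version,
    uptimeHours: status.uptime / 3600, // uptime is reported in seconds
    connectionsUsedPct: (current * 100) / limit,
    residentMiB: status.mem.resident, // mem.resident is reported in MiB
  };
}

// Sample values copied from the partial output above.
const sampleStatus = {
  host: "mongodb-server",
  version: "5.0.14",
  uptime: 86400,
  connections: { current: 25, available: 975, totalCreated: 1050 },
  mem: { bits: 64, resident: 1024, virtual: 4096, supported: true },
};

const statusSummary = summarizeServerStatus(sampleStatus);
console.log(statusSummary);
```

Because the helper only reads a plain object, it can be run against saved serverStatus snapshots as well as a live server.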

2. Database Stats

Get statistics about the current database (switch to it first with use <database> if needed):

javascript
// Get database statistics
db.stats()

Output:

json
{
  "db": "myDatabase",
  "collections": 12,
  "views": 0,
  "objects": 10000,
  "avgObjSize": 256,
  "dataSize": 2560000,
  "storageSize": 3000000,
  "indexes": 24,
  "indexSize": 500000,
  "totalSize": 3500000,
  "scaleFactor": 1,
  "fsUsedSize": 50000000000,
  "fsTotalSize": 100000000000,
  "ok": 1
}
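With scaleFactor set to 1, the sizes above are in bytes, which makes them hard to read at a glance. As a rough sketch, the hypothetical helper below (summarizeDbStats is an illustrative name, not a MongoDB function) converts the main figures to MiB and derives two ratios that are often worth watching:

```javascript
// Hypothetical helper: turn a db.stats() document into readable figures.
// Assumes the document was produced with scaleFactor 1 (sizes in bytes).
function summarizeDbStats(stats) {
  const toMiB = (bytes) => bytes / (1024 * 1024);
  return {
    db: stats.db,
    dataMiB: toMiB(stats.dataSize),
    storageMiB: toMiB(stats.storageSize),
    indexMiB: toMiB(stats.indexSize),
    // How large the indexes are relative to the uncompressed data
    indexToDataRatio: stats.indexSize / stats.dataSize,
    // How full the underlying filesystem is
    fsUsedPct: (stats.fsUsedSize / stats.fsTotalSize) * 100,
  };
}

// Sample values copied from the output above.
const sampleDbStats = {
  db: "myDatabase",
  objects: 10000,
  dataSize: 2560000,
  storageSize: 3000000,
  indexSize: 500000,
  fsUsedSize: 50000000000,
  fsTotalSize: 100000000000,
};

const dbSummary = summarizeDbStats(sampleDbStats);
console.log(dbSummary);
```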

3. Collection Stats

Examine statistics for a specific collection:

javascript
// Get collection statistics
db.users.stats()

Output:

json
{
  "ns": "myDatabase.users",
  "count": 5000,
  "size": 1250000,
  "avgObjSize": 250,
  "storageSize": 1500000,
  "capped": false,
  "nindexes": 3,
  "indexBuilds": [],
  "totalIndexSize": 300000,
  "indexSizes": {
    "_id_": 100000,
    "username_1": 100000,
    "email_1": 100000
  },
  "scaleFactor": 1,
  "ok": 1
}
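The indexSizes field makes it easy to see which indexes take up the most space. The sketch below assumes a hypothetical helper named rankIndexesBySize and works on the collection stats document shown above:

```javascript
// Hypothetical helper: list a collection's indexes from largest to smallest.
function rankIndexesBySize(collStats) {
  return Object.entries(collStats.indexSizes)
    .map(([name, bytes]) => ({
      name,
      bytes,
      // Fraction of the collection's total index size used by this index
      shareOfIndexes: bytes / collStats.totalIndexSize,
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

// Sample values copied from the output above.
const sampleCollStats = {
  ns: "myDatabase.users",
  count: 5000,
  size: 1250000,
  nindexes: 3,
  totalIndexSize: 300000,
  indexSizes: { _id_: 100000, username_1: 100000, email_1: 100000 },
};

const rankedIndexes = rankIndexesBySize(sampleCollStats);
const indexToDataRatio = sampleCollStats.totalIndexSize / sampleCollStats.size;

console.log(rankedIndexes);
console.log(`Index size is ${(indexToDataRatio * 100).toFixed(0)}% of data size`);
```

An index that is large but never appears in explain() output is a candidate for review.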

Monitoring MongoDB Performance

Current Operations

The currentOp command shows operations currently in progress on the MongoDB server:

javascript
// View all current operations
db.adminCommand({ currentOp: 1 })

// View only long-running operations (>5 seconds)
db.adminCommand({ currentOp: 1, "secs_running": { "$gt": 5 } })

Output (partial):

json
{
  "inprog": [
    {
      "opid": 123456,
      "active": true,
      "secs_running": 10,
      "op": "query",
      "ns": "myDatabase.users",
      "command": {
        "find": "users",
        "filter": { "status": "active" }
      },
      "client": "192.168.1.100:40123",
      "desc": "conn32"
    },
    // More operations...
  ]
}
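The inprog array can also be filtered on the client side once it has been fetched. The following sketch assumes a hypothetical helper named findLongRunningOps. The first sample operation is taken from the output above, and the second one is invented purely for illustration:

```javascript
// Hypothetical helper: pick out active operations running longer than a threshold.
// In mongosh it could be fed with db.adminCommand({ currentOp: 1 }).
function findLongRunningOps(currentOpResult, thresholdSecs) {
  return currentOpResult.inprog
    .filter((op) => op.active && op.secs_running > thresholdSecs)
    .sort((a, b) => b.secs_running - a.secs_running) // slowest first
    .map((op) => ({
      opid: op.opid,
      op: op.op,
      ns: op.ns,
      secsRunning: op.secs_running,
      client: op.client,
    }));
}

const sampleCurrentOp = {
  inprog: [
    // Taken from the partial output above
    { opid: 123456, active: true, secs_running: 10, op: "query",
      ns: "myDatabase.users", client: "192.168.1.100:40123" },
    // Invented entry, for illustration only
    { opid: 123457, active: true, secs_running: 2, op: "update",
      ns: "myDatabase.orders", client: "192.168.1.101:40124" },
  ],
};

const longRunning = findLongRunningOps(sampleCurrentOp, 5);
for (const op of longRunning) {
  console.log(`opid ${op.opid} (${op.op} on ${op.ns}) running for ${op.secsRunning}s`);
}
```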

Killing Operations

If you identify a problematic or long-running operation, you can terminate it:

javascript
// Kill an operation by its opid
db.adminCommand({ killOp: 1, op: 123456 })

Output:

json
{ "info": "attempting to kill op", "ok": 1 }

Query Profiling

MongoDB's profiler is a powerful tool for identifying slow queries and performance bottlenecks.

Profiling Levels

javascript
// Check current profiler level
db.getProfilingStatus()

// Set profiling level
// 0: off, 1: record operations slower than slowms, 2: record all operations
db.setProfilingLevel(1, { slowms: 100 })

Output:

json
// getProfilingStatus output
{ "was": 0, "slowms": 100, "sampleRate": 1 }

// setProfilingLevel output
{ "was": 0, "slowms": 100, "sampleRate": 1, "ok": 1 }

Accessing Profiling Data

javascript
// Query the system profiler
db.system.profile.find().sort({ ts: -1 }).limit(5)

// Find slow queries (execution time > 500ms)
db.system.profile.find({ millis: { $gt: 500 } }).sort({ ts: -1 })

Output:

json
[
  {
    "op": "query",
    "ns": "myDatabase.users",
    "command": {
      "find": "users",
      "filter": { "lastLogin": { "$gt": ISODate("2023-01-01") } }
    },
    "keysExamined": 0,
    "docsExamined": 10000,
    "cursorExhausted": true,
    "numYield": 5,
    "nreturned": 1200,
    "responseLength": 123456,
    "millis": 650,
    "ts": ISODate("2023-09-15T10:15:30Z"),
    "client": "192.168.1.100:45678",
    "appName": "MongoDB Compass"
  },
  // More profile records...
]
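Profile entries like the one above can be screened automatically. This is a minimal sketch, assuming a hypothetical helper named flagInefficientQueries and an illustrative threshold of 10 documents examined per document returned. Treating keysExamined of 0 as a sign of a collection scan is a heuristic, not a guarantee:

```javascript
// Hypothetical helper: flag profiler entries that look inefficient.
// maxRatio is an illustrative threshold, not a MongoDB default.
function flagInefficientQueries(profileEntries, maxRatio = 10) {
  return profileEntries
    .map((e) => ({
      ns: e.ns,
      millis: e.millis,
      // Documents examined per document returned (guard against divide by zero)
      examinedPerReturned: e.docsExamined / Math.max(e.nreturned, 1),
      // Heuristic: documents were read but no index keys were touched
      likelyCollectionScan: e.keysExamined === 0 && e.docsExamined > 0,
    }))
    .filter((e) => e.likelyCollectionScan || e.examinedPerReturned > maxRatio);
}

// Sample entry copied from the profiler output above.
const sampleProfile = [
  {
    op: "query",
    ns: "myDatabase.users",
    keysExamined: 0,
    docsExamined: 10000,
    nreturned: 1200,
    millis: 650,
  },
];

const flagged = flagInefficientQueries(sampleProfile);
console.log(flagged);
```

In mongosh, the input could come from db.system.profile.find().toArray().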

Analyzing Query Performance using explain()

The explain() method provides insights into how MongoDB executes queries and helps identify performance issues.

javascript
// Basic explain
db.users.find({ age: { $gt: 30 } }).explain()

// Execute with "executionStats" for detailed performance data
db.users.find({ age: { $gt: 30 } }).explain("executionStats")

// "allPlansExecution" shows all considered plans
db.users.find({ age: { $gt: 30 } }).explain("allPlansExecution")

Output (using executionStats):

json
{
  "explainVersion": "1",
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "myDatabase.users",
    "indexFilterSet": false,
    "parsedQuery": {
      "age": { "$gt": 30 }
    },
    "winningPlan": {
      "stage": "COLLSCAN",
      "filter": {
        "age": { "$gt": 30 }
      },
      "direction": "forward"
    },
    "rejectedPlans": []
  },
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 2500,
    "executionTimeMillis": 45,
    "totalKeysExamined": 0,
    "totalDocsExamined": 5000,
    "executionStages": {
      "stage": "COLLSCAN",
      "filter": {
        "age": { "$gt": 30 }
      },
      "nReturned": 2500,
      "executionTimeMillisEstimate": 10,
      "works": 5002,
      "advanced": 2500,
      "needTime": 2501,
      "needYield": 0,
      "saveState": 39,
      "restoreState": 39,
      "isEOF": 1,
      "direction": "forward",
      "docsExamined": 5000
    }
  }
}

Key Performance Indicators in Explain Output

COLLSCAN vs. IXSCAN: A collection scan (COLLSCAN) reads every document, so on large collections it is usually much slower than an index scan (IXSCAN)
totalDocsExamined: Number of documents scanned (lower is better; ideally close to nReturned)
totalKeysExamined: Number of index keys scanned (0 means no index was used)
nReturned: Number of documents returned
executionTimeMillis: Total execution time in milliseconds
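These indicators can be checked programmatically. The sketch below assumes two hypothetical helpers, collectStages and assessExplain, which walk the executionStages tree through its inputStage and inputStages fields and compute the ratio of documents examined to documents returned:

```javascript
// Hypothetical helper: collect every stage name in an execution plan tree.
function collectStages(stage, names = []) {
  if (!stage) return names;
  names.push(stage.stage);
  if (stage.inputStage) collectStages(stage.inputStage, names);
  if (Array.isArray(stage.inputStages)) {
    stage.inputStages.forEach((child) => collectStages(child, names));
  }
  return names;
}

// Hypothetical helper: summarize the indicators listed above.
function assessExplain(explainOutput) {
  const stats = explainOutput.executionStats;
  const stages = collectStages(stats.executionStages);
  return {
    stages,
    hasCollectionScan: stages.includes("COLLSCAN"),
    docsExaminedPerReturned: stats.totalDocsExamined / Math.max(stats.nReturned, 1),
    executionTimeMillis: stats.executionTimeMillis,
  };
}

// Sample values copied from the executionStats output above.
const sampleExplain = {
  executionStats: {
    nReturned: 2500,
    executionTimeMillis: 45,
    totalKeysExamined: 0,
    totalDocsExamined: 5000,
    executionStages: { stage: "COLLSCAN", nReturned: 2500, docsExamined: 5000 },
  },
};

const assessment = assessExplain(sampleExplain);
console.log(assessment);
```

A ratio close to 1 means the query examined little more than what it returned. Here the ratio is 2 and the plan contains a COLLSCAN, which suggests an index on age may help.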

Diagnosing Memory Issues

Working Set Assessment

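The working set is the portion of data and indexes that your application accesses frequently. Performance is best when it fits in RAM; with the WiredTiger storage engine, the cache statistics show how well it fits: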

javascript
// Get WiredTiger cache statistics
db.serverStatus().wiredTiger.cache

Output:

json
{
  "bytes currently in the cache": 1073741824,
  "bytes read into cache": 10737418240,
  "bytes written from cache": 5368709120,
  "maximum bytes configured": 4294967296,
  "tracked dirty bytes in the cache": 268435456,
  "pages evicted by application threads": 0,
  "pages queued for eviction": 0,
  "pages read into cache": 50000,
  "pages written from cache": 25000
}

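A cache that stays close to "maximum bytes configured", a rapidly growing "pages read into cache" count, or a nonzero "pages evicted by application threads" value indicates memory pressure. When this happens: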

Consider increasing your WiredTiger cache size
Analyze your access patterns
Consider adding more RAM to your server
Review your indexing strategy
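The cache figures above are easier to judge as percentages of the configured maximum. The following sketch assumes a hypothetical helper named assessCache. The 90% warning threshold is an illustrative assumption rather than a WiredTiger setting:

```javascript
// Hypothetical helper: express WiredTiger cache statistics as percentages.
// usedWarnPct is an illustrative threshold, not a WiredTiger default.
function assessCache(cache, usedWarnPct = 90) {
  const max = cache["maximum bytes configured"];
  const usedPct = (cache["bytes currently in the cache"] / max) * 100;
  const dirtyPct = (cache["tracked dirty bytes in the cache"] / max) * 100;
  const appThreadEvictions = cache["pages evicted by application threads"];
  return {
    usedPct,
    dirtyPct,
    appThreadEvictions,
    // Application threads doing eviction work is a sign the cache is struggling
    underPressure: usedPct > usedWarnPct || appThreadEvictions > 0,
  };
}

// Sample values copied from the output above.
const sampleCache = {
  "bytes currently in the cache": 1073741824,
  "maximum bytes configured": 4294967296,
  "tracked dirty bytes in the cache": 268435456,
  "pages evicted by application threads": 0,
};

const cacheReport = assessCache(sampleCache);
console.log(cacheReport);
```

In mongosh, the input could be db.serverStatus().wiredTiger.cache.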

Monitoring I/O Performance

javascript
// Get storage engine stats
db.serverStatus().wiredTiger

Look for signs of I/O bottlenecks:

High read/write latencies
Queue buildup
Slow checkpoint times

Network and Connection Analysis

javascript
// Check connection status
db.serverStatus().connections

// Check network statistics
db.serverStatus().network

Output:

json
// Connections
{
  "current": 125,
  "available": 875,
  "totalCreated": 9876,
  "active": 115
}

// Network
{
  "bytesIn": 10485760,
  "bytesOut": 20971520,
  "numRequests": 50000
}
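The network counters are cumulative since the server started, so a single reading says little about current load. A rough sketch for turning two readings into per-second rates follows. The helper name networkRates, the earlier sample, and the 60-second interval are all assumptions made for illustration:

```javascript
// Hypothetical helper: compute per-second rates from two network samples.
// Returns null if a counter went backwards (for example after a restart).
function networkRates(earlier, later, elapsedSecs) {
  const deltas = {
    bytesIn: later.bytesIn - earlier.bytesIn,
    bytesOut: later.bytesOut - earlier.bytesOut,
    numRequests: later.numRequests - earlier.numRequests,
  };
  if (elapsedSecs <= 0 || Object.values(deltas).some((d) => d < 0)) return null;
  return {
    bytesInPerSec: deltas.bytesIn / elapsedSecs,
    bytesOutPerSec: deltas.bytesOut / elapsedSecs,
    requestsPerSec: deltas.numRequests / elapsedSecs,
  };
}

// Invented earlier reading, for illustration only
const earlierSample = { bytesIn: 4485760, bytesOut: 8971520, numRequests: 44000 };
// Later reading copied from the network output above
const laterSample = { bytesIn: 10485760, bytesOut: 20971520, numRequests: 50000 };

const rates = networkRates(earlierSample, laterSample, 60);
console.log(rates);
```

The same approach applies to other cumulative counters, such as connections.totalCreated.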

Implementing MongoDB Monitoring

For comprehensive MongoDB monitoring, consider:

MongoDB-provided tools:
- MongoDB Compass
- MongoDB Cloud Manager
- MongoDB Ops Manager
Third-party monitoring tools:
- Prometheus with MongoDB exporter
- Grafana dashboards
- Datadog
- New Relic
Custom monitoring scripts:

javascript
// Simple monitoring script for connection count
const connectionStats = () => {
  const conn = db.serverStatus().connections;
  const currentTime = new Date().toISOString();

  // current + available approximates the configured connection limit
  const limit = conn.current + conn.available;
  const usedPct = (conn.current / limit) * 100;

  print(`${currentTime} - Connections: ${conn.current}/${limit} (${usedPct.toFixed(2)}% used)`);

  // Compare usage against the limit, not against the remaining connections
  if (usedPct > 80) {
    print("WARNING: Connection usage is above 80%");
  }
};

// Run every 5 minutes (needs a shell with timer support, such as mongosh)
setInterval(connectionStats, 5 * 60 * 1000);

Real-World Diagnostics Scenario: Identifying Slow Queries

Let's walk through a real-world scenario for identifying and fixing slow queries:

Step 1: Enable profiling for slow operations

javascript
// Enable profiling for operations taking longer than 100ms
db.setProfilingLevel(1, { slowms: 100 })

Step 2: Let your application run for a while

Step 3: Analyze slow queries

javascript
// Find the slowest queries
db.system.profile.find().sort({ millis: -1 }).limit(5)

Step 4: Examine a problematic query

Let's say we found a slow query for users with a specific status:

javascript
// The problematic query from the profile
db.users.find({ status: "active", lastLogin: { $gt: ISODate("2023-01-01") } })

// Run explain to understand execution
db.users.find({ 
  status: "active", 
  lastLogin: { $gt: ISODate("2023-01-01") } 
}).explain("executionStats")

Example problematic output:

json
{
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 5000,
    "executionTimeMillis": 650,
    "totalKeysExamined": 0,
    "totalDocsExamined": 100000,
    // ...
    "executionStages": {
      "stage": "COLLSCAN",
      // ...
    }
  }
}

Step 5: Create appropriate index and verify improvement

javascript
// Create compound index
db.users.createIndex({ status: 1, lastLogin: 1 })

// Verify the query now uses the index
db.users.find({ 
  status: "active", 
  lastLogin: { $gt: ISODate("2023-01-01") } 
}).explain("executionStats")

Improved output:

json
{
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 5000,
    "executionTimeMillis": 15,
    "totalKeysExamined": 5000,
    "totalDocsExamined": 5000,
    // ...
    "executionStages": {
      "stage": "FETCH",
      "inputStage": {
        "stage": "IXSCAN",
        "indexName": "status_1_lastLogin_1",
        // ...
      }
    }
  }
}
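The improvement can be quantified by comparing the two executionStats documents. This is a minimal sketch, assuming a hypothetical helper named compareExplainRuns and using the figures from the two example outputs in this scenario:

```javascript
// Hypothetical helper: compare executionStats before and after an index change.
function compareExplainRuns(before, after) {
  const perReturned = (s) => s.totalDocsExamined / Math.max(s.nReturned, 1);
  return {
    docsPerReturnedBefore: perReturned(before),
    docsPerReturnedAfter: perReturned(after),
    docsExaminedSaved: before.totalDocsExamined - after.totalDocsExamined,
    // Guard against a 0 ms reading on very fast queries
    speedup: before.executionTimeMillis / Math.max(after.executionTimeMillis, 1),
  };
}

// Figures copied from the problematic and improved outputs in this scenario.
const beforeStats = {
  nReturned: 5000, executionTimeMillis: 650,
  totalKeysExamined: 0, totalDocsExamined: 100000,
};
const afterStats = {
  nReturned: 5000, executionTimeMillis: 15,
  totalKeysExamined: 5000, totalDocsExamined: 5000,
};

const comparison = compareExplainRuns(beforeStats, afterStats);
console.log(comparison);
console.log(`Roughly ${comparison.speedup.toFixed(1)}x faster`);
```

Execution times vary between runs, so the documents-examined ratio is usually the more stable of the two measures.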

Best Practices for MongoDB Diagnostics

Implement proactive monitoring
- Set up alerts for key metrics
- Monitor trends over time
- Schedule regular performance reviews
Create diagnostic workflows
- Follow a structured approach to troubleshooting
- Document common issues and solutions
- Use a tiered approach from simple to complex diagnostics
Perform regular health checks

javascript
// Basic health check script
const healthCheck = () => {
  const serverStatus = db.serverStatus();
  const dbStats = db.stats();
  
  // Check connections
  const connPercentage = serverStatus.connections.current / 
                        (serverStatus.connections.current + serverStatus.connections.available) * 100;
  
  // Check memory usage
  const memoryUsage = serverStatus.mem.resident;
  const cacheUsage = serverStatus.wiredTiger.cache["bytes currently in the cache"] / 
                    serverStatus.wiredTiger.cache["maximum bytes configured"] * 100;
  
  // Output health summary
  print(`=== MongoDB Health Check (${new Date().toISOString()}) ===`);
  print(`Server version: ${serverStatus.version}`);
  print(`Uptime: ${(serverStatus.uptime / 3600).toFixed(2)} hours`);
  print(`Connection usage: ${connPercentage.toFixed(2)}%`);
  print(`Memory usage: ${Math.round(memoryUsage)} MB`);
  print(`Cache usage: ${cacheUsage.toFixed(2)}%`);
  print(`Collections: ${dbStats.collections}`);
  print(`Documents: ${dbStats.objects}`);
  print(`Storage size: ${(dbStats.storageSize / 1024 / 1024).toFixed(2)} MB`);
  
  // Warnings
  if (connPercentage > 80) print("⚠️ WARNING: High connection usage");
  if (cacheUsage > 90) print("⚠️ WARNING: Cache pressure detected");
  
  print("=======================================");
}

// Execute health check
healthCheck();

Visualizing MongoDB Performance Trends

[Diagram: performance trends over time, built from the collected diagnostic data]

[Diagram: a common pattern for automated MongoDB diagnostics]

Summary

MongoDB diagnostics and monitoring are essential skills for database administrators. By leveraging the built-in diagnostic tools and commands, you can:

Identify and resolve performance bottlenecks
Ensure your MongoDB deployment is running optimally
Track key performance metrics over time
Plan for capacity and scaling needs
Proactively address issues before they impact users

Developing a strong diagnostic approach will help you maintain reliable, high-performance MongoDB deployments in production environments.

Exercises

Set up profiling on a test database and identify the three slowest queries.
Create a script that checks for long-running operations and outputs warnings.
Use explain() to analyze and optimize three queries in your application.
Implement a simple monitoring script that tracks connection usage, memory consumption, and operation counts.
Design a MongoDB health check dashboard with key metrics for your specific application needs.
