MongoDB $limit Stage

Introduction

When working with MongoDB's aggregation framework, you'll often need to control the amount of data flowing through your pipeline. The $limit stage is a simple yet powerful operator that allows you to restrict the number of documents that pass to the next stage in your aggregation pipeline.

Whether you're implementing pagination, improving query performance, or just need to work with a smaller dataset, the $limit stage is an essential tool in your MongoDB aggregation toolkit.

What is the $limit Stage?

The $limit stage restricts the number of documents that pass to the next stage in the pipeline. It takes a positive integer that specifies the maximum number of documents to allow through.

Syntax

javascript
{ $limit: <positive integer> }

The value must be a positive integer. Recent MongoDB versions reject zero, negative numbers, and values with a fractional part (such as 5.5) with an error instead of truncating them, so pass an integer explicitly.
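Since an invalid value only surfaces as a server error, it can be convenient to check it in application code first. The following is a minimal sketch; `limitStage` is a hypothetical helper name, not part of MongoDB or its drivers.

```javascript
// Hypothetical helper: validates a value before wrapping it in a $limit stage.
function limitStage(n) {
  if (!Number.isSafeInteger(n) || n <= 0) {
    throw new RangeError(`$limit expects a positive integer, got ${n}`);
  }
  return { $limit: n };
}

const firstFive = limitStage(5); // the stage document { $limit: 5 }
```

The returned object can be placed directly in the array passed to `aggregate()`.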

Basic Usage Examples

Let's look at some simple examples of how to use the $limit stage in an aggregation pipeline.

Example 1: Simple Limit

Assume we have a collection called products with various items. To get only the first 5 products from the collection:

javascript
db.products.aggregate([
  { $limit: 5 }
])

This will return at most 5 documents from the products collection.

Example 2: Limit After Match

It's common to filter documents with $match before applying a limit:

javascript
db.products.aggregate([
  { $match: { category: "electronics" } },
  { $limit: 10 }
])

This pipeline will:

  1. First, filter products to include only electronics
  2. Then, limit the result to at most 10 documents

Understanding $limit Position in Pipeline

The position of the $limit stage in your pipeline can significantly impact both the result and the performance of your query.

Performance Optimization

Place the $limit stage as early as the required result allows, typically right after filtering stages like $match, so that every later stage processes fewer documents. Keep in mind that moving $limit changes which documents survive, so the two pipelines below are not interchangeable:

javascript
// Cheaper, but a different result: takes 20 arbitrary completed orders,
// then sorts only those 20
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $limit: 20 },
  { $sort: { orderDate: -1 } }
])

// Returns the 20 most recent completed orders
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $sort: { orderDate: -1 } },
  { $limit: 20 }
])

When you need the top results of a sort, keep $limit immediately after $sort. MongoDB can then combine the two stages and use a top-k sort that tracks only the best 20 documents as it scans, which is cheaper than sorting the entire result set.
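To see why a top-k sort is cheaper than a full sort, here is a plain JavaScript illustration of the idea. It is a sketch of the general technique, not MongoDB's actual implementation; `topK` and the sample `orders` array are made up for this example.

```javascript
// Illustration only: keep the k best items seen so far instead of sorting everything.
function topK(items, k, compare) {
  const best = []; // always sorted, never longer than k
  for (const item of items) {
    // Walk left from the end to find where `item` belongs in `best`.
    let i = best.length;
    while (i > 0 && compare(item, best[i - 1]) < 0) i--;
    if (i < k) {
      best.splice(i, 0, item);
      if (best.length > k) best.pop(); // drop the item that fell out of the top k
    }
  }
  return best;
}

const orders = [
  { id: 1, orderDate: new Date("2023-03-01") },
  { id: 2, orderDate: new Date("2023-05-01") },
  { id: 3, orderDate: new Date("2023-01-01") },
  { id: 4, orderDate: new Date("2023-04-01") },
];

const newestFirst = (a, b) => b.orderDate - a.orderDate;
const newestTwo = topK(orders, 2, newestFirst).map((o) => o.id); // ids 2 and 4
```

Memory use stays proportional to k rather than to the number of input documents, which is the benefit a $sort followed directly by $limit can take advantage of.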

Common Use Cases

Pagination

A common use case for $limit is implementing pagination in web applications:

javascript
const pageSize = 10;
const pageNumber = 2; // 1-based page number

db.blogs.aggregate([
  { $match: { status: "published" } },
  { $sort: { publishDate: -1 } },
  { $skip: (pageNumber - 1) * pageSize },
  { $limit: pageSize }
])

This pipeline will return the second page of published blog posts, with 10 posts per page, sorted by publish date in descending order.
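The page arithmetic can be wrapped in a small helper. The following is a minimal sketch: `buildPagePipeline` is a hypothetical name, and the `_id` tiebreaker in the sort is an added assumption that keeps page boundaries stable when several posts share the same publish date.

```javascript
// Hypothetical helper: builds the pagination pipeline for a 1-based page number.
function buildPagePipeline(pageNumber, pageSize) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return [
    { $match: { status: "published" } },
    // Assumption: _id breaks ties between posts with equal publishDate.
    { $sort: { publishDate: -1, _id: 1 } },
    { $skip: (pageNumber - 1) * pageSize },
    { $limit: pageSize },
  ];
}

const thirdPage = buildPagePipeline(3, 10); // skips 20 documents, returns up to 10
```

In mongosh, the result could be passed straight to `db.blogs.aggregate(thirdPage)`.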

Sample Data Analysis

When working with large datasets, you might want to analyze just a sample:

javascript
db.sensorData.aggregate([
  { $match: { deviceId: "sensor001" } },
  { $limit: 100 },
  { $group: {
    _id: null,
    avgTemperature: { $avg: "$temperature" },
    maxTemperature: { $max: "$temperature" },
    minTemperature: { $min: "$temperature" }
  }}
])

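This example calculates statistics over the first 100 readings the server returns for a specific sensor. Without a preceding $sort, which readings those are is not guaranteed, and they are not a random sample.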
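If a random subset is what you actually need, the $sample stage may be a better fit than $limit. Here is a sketch of the same statistics pipeline with $sample swapped in; `sensorStatsPipeline` is a hypothetical helper name used only for this example.

```javascript
// Hypothetical helper: same statistics as above, computed over a random sample.
function sensorStatsPipeline(deviceId, sampleSize) {
  return [
    { $match: { deviceId } },
    // $sample picks documents at random instead of taking the first ones.
    { $sample: { size: sampleSize } },
    {
      $group: {
        _id: null,
        avgTemperature: { $avg: "$temperature" },
        maxTemperature: { $max: "$temperature" },
        minTemperature: { $min: "$temperature" },
      },
    },
  ];
}

const samplePipeline = sensorStatsPipeline("sensor001", 100);
```

In mongosh this could be run as `db.sensorData.aggregate(samplePipeline)`.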

Performance Testing

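When testing new queries or aggregation pipelines, $limit keeps the output small enough to inspect quickly. Note that a $limit at the end of a pipeline, as in the example here, trims only the result: the earlier $match, $group, and $sort stages still process every matching document.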

javascript
db.transactions.aggregate([
  { $match: { date: { $gte: new Date("2023-01-01") } } },
  { $group: {
    _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
    totalAmount: { $sum: "$amount" }
  }},
  { $sort: { _id: 1 } },
  { $limit: 10 } // Show just the first 10 days for testing
])
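If the goal is also to reduce the work done while experimenting, one option is a temporary $limit right after the leading $match stages. The helper below, with the hypothetical name `withEarlyLimit`, is a sketch of that idea. Aggregates computed this way cover only the limited subset, so the numbers are suitable for smoke-testing, not for reporting.

```javascript
// Hypothetical helper: inserts a $limit after the leading $match stages
// and returns a new pipeline, leaving the original array untouched.
function withEarlyLimit(pipeline, n) {
  let i = 0;
  // Count the consecutive $match stages at the start of the pipeline.
  while (i < pipeline.length && "$match" in pipeline[i]) i++;
  return [...pipeline.slice(0, i), { $limit: n }, ...pipeline.slice(i)];
}

const dailyTotals = [
  { $match: { date: { $gte: new Date("2023-01-01") } } },
  { $group: { _id: "$date", totalAmount: { $sum: "$amount" } } },
  { $sort: { _id: 1 } },
];

// $match, then $limit: 1000, then the remaining stages.
const previewPipeline = withEarlyLimit(dailyTotals, 1000);
```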

Working with $limit and $skip Together

The $limit stage is often used with $skip for pagination. Here's how they interact:

javascript
db.products.aggregate([
  { $match: { category: "books" } },
  { $sort: { publishedDate: -1 } },
  { $skip: 20 }, // Skip the first 20 results
  { $limit: 10 } // Return the next 10 results
])

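In an aggregation pipeline, stages run in the order you write them, so $skip must come before $limit to page through results. If you reverse them, a $limit of 10 followed by a $skip of 20 returns nothing, because only 10 documents ever reach the $skip stage. (The find() cursor methods behave differently: there, skip() is applied before limit() regardless of the order in which you chain them.)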
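The effect of stage order can be checked with plain arrays. The snippet below is an illustration only: `skip` and `limit` are stand-ins written for this example, not MongoDB code, and they mimic what each stage does to the stream of documents it receives.

```javascript
// Plain-array stand-ins for the two stages (illustration only).
const skip = (docs, n) => docs.slice(n);
const limit = (docs, n) => docs.slice(0, n);

const docs = Array.from({ length: 50 }, (_, i) => i + 1); // the numbers 1..50

// $skip: 20 then $limit: 10 yields items 21 through 30.
const skipThenLimit = limit(skip(docs, 20), 10);

// $limit: 10 then $skip: 20 yields nothing: only 10 items survive the limit.
const limitThenSkip = skip(limit(docs, 10), 20);
```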

Common Pitfalls and Best Practices

Limits with Sorting

When using $limit with $sort, be careful about the order:

javascript
// This will limit first, then sort just those documents
db.orders.aggregate([
  { $limit: 10 },
  { $sort: { total: -1 } }
])

// This will sort all documents, then take the top 10
db.orders.aggregate([
  { $sort: { total: -1 } },
  { $limit: 10 }
])

The second example gives you the 10 orders with the highest totals, while the first example gives you 10 arbitrary orders (whichever ones the server reads first), sorted by total.
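A tiny in-memory comparison makes the difference concrete. This is plain JavaScript over a made-up `sampleOrders` array, standing in for the two pipelines rather than running them against MongoDB.

```javascript
// Made-up data, listed in the order the documents would be read.
const sampleOrders = [
  { id: "a", total: 20 },
  { id: "b", total: 95 },
  { id: "c", total: 5 },
  { id: "d", total: 60 },
];
const byTotalDesc = (x, y) => y.total - x.total;

// Like $limit: 2 then $sort: takes the first two as read, then sorts them.
const limitThenSort = sampleOrders.slice(0, 2).sort(byTotalDesc).map((o) => o.id);

// Like $sort then $limit: 2: sorts everything, then keeps the top two.
const sortThenLimit = [...sampleOrders].sort(byTotalDesc).slice(0, 2).map((o) => o.id);
```

Here `limitThenSort` contains "b" and "a", while `sortThenLimit` contains "b" and "d": the order with total 60 is lost when the limit runs first.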

Large Skip Values

Be cautious with large $skip values, as they can be inefficient. MongoDB must scan and discard all the skipped documents:

javascript
// Inefficient for large collections and high page numbers
db.products.aggregate([
  { $skip: 10000 },
  { $limit: 10 }
])

For better pagination with large datasets, consider using range queries on indexed fields instead.
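One way to do this is range-based (keyset) pagination: instead of skipping, filter on the last value seen on the previous page. The sketch below assumes pages are ordered by `_id`, which is always indexed; `buildNextPagePipeline` and `lastSeenId` are hypothetical names introduced for this example.

```javascript
// Hypothetical helper: builds a pipeline for the page that follows `lastSeenId`.
// Pass null for the first page.
function buildNextPagePipeline(lastSeenId, pageSize) {
  const stages = [];
  if (lastSeenId !== null) {
    // Range query on the indexed field replaces a large $skip.
    stages.push({ $match: { _id: { $gt: lastSeenId } } });
  }
  stages.push({ $sort: { _id: 1 } }, { $limit: pageSize });
  return stages;
}

const firstPage = buildNextPagePipeline(null, 10);
// Assumption: 10000 was the _id of the last document on the previous page.
const nextPage = buildNextPagePipeline(10000, 10);
```

The trade-off is that this approach moves page by page from a known position and cannot jump directly to an arbitrary page number.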

Real-World Example: E-commerce Analytics Dashboard

Let's build a more complete example for an e-commerce analytics dashboard that shows the top-selling products by category:

javascript
db.sales.aggregate([
  // Filter to one month (October 2023 in this example)
  { $match: {
    orderDate: {
      $gte: new Date("2023-10-01"),
      $lt: new Date("2023-11-01")
    }
  }},

  // Unwind the items array to work with individual product sales
  { $unwind: "$items" },

  // Group by product and category to calculate total sales
  { $group: {
    _id: {
      productId: "$items.productId",
      category: "$items.category"
    },
    productName: { $first: "$items.productName" },
    totalQuantity: { $sum: "$items.quantity" },
    totalRevenue: { $sum: { $multiply: ["$items.price", "$items.quantity"] } }
  }},

  // Sort by revenue within each category
  { $sort: { "_id.category": 1, totalRevenue: -1 } },

  // Group by category to create a top products array
  // and total each category's revenue
  { $group: {
    _id: "$_id.category",
    categoryRevenue: { $sum: "$totalRevenue" },
    topProducts: {
      $push: {
        productId: "$_id.productId",
        productName: "$productName",
        totalQuantity: "$totalQuantity",
        totalRevenue: "$totalRevenue"
      }
    }
  }},

  // Rank the categories: $group output has no guaranteed order.
  // Assumption: "top" means highest total revenue.
  { $sort: { categoryRevenue: -1 } },

  // Limit to the top 3 categories for the dashboard
  { $limit: 3 },

  // Limit the top products array to 5 items per category
  { $project: {
    category: "$_id",
    topProducts: { $slice: ["$topProducts", 5] },
    _id: 0
  }}
])

This complex aggregation pipeline:

  1. Filters sales to a single month (October 2023 in this example)
  2. Breaks down the items in each order
  3. Groups and calculates metrics by product
  4. Sorts products by revenue within each category
  5. Groups the top 5 products for each category
  6. Returns data for only the top 3 categories

Summary

The $limit stage in MongoDB aggregation is a straightforward but powerful tool for controlling the flow of documents in your pipeline. When used correctly, it can improve performance, implement pagination, and help you focus on the most relevant data.

Key takeaways:

  • The $limit stage restricts the number of documents passing to the next stage
  • Position matters: placing $limit early can improve performance, but it also changes which documents later stages see
  • When used with $sort, place $limit after $sort for top-k operations
  • Use $skip followed by $limit for pagination
  • Be careful with large skip values in pagination scenarios

Additional Exercises

  1. Basic: Write an aggregation pipeline that returns the 5 most recent users who registered on your platform.

  2. Intermediate: Create a pipeline that returns the top 3 products in each category based on average review score (where each product has multiple reviews).

  3. Advanced: Implement a time-series data analysis pipeline that returns the hourly average of sensor readings, but limits the result to only hours where the average exceeds a certain threshold, and returns at most 24 data points.

Happy aggregating with MongoDB!


