MongoDB $skip Stage
In MongoDB's aggregation framework, the $skip stage provides a powerful way to bypass a specified number of documents before continuing with the pipeline processing. This is particularly useful for pagination and when you need to process documents in batches.
Introduction
The $skip stage is one of the fundamental stages in MongoDB's aggregation pipeline. Similar to the SQL OFFSET clause, it allows you to exclude the first n documents from the results returned by the aggregation pipeline.
The $skip stage takes a single parameter that specifies how many documents to skip:
{ $skip: <positive integer> }
Basic Usage
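The basic syntax of the $skip stage is straightforward:
db.collection.aggregate([
  { $skip: 10 }
])
This aggregation operation skips the first 10 documents that reach the stage and returns everything from the 11th document onward. Without a preceding $sort stage, the order in which documents arrive is not guaranteed, so which 10 documents are skipped may vary.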
How $skip Works
When MongoDB processes a $skip stage:
- It counts documents as they flow through the pipeline
- It discards the specified number of documents
- It passes the remaining documents to the next stage in the pipeline
Consider this simple example with a collection of numbers:
// Sample data
db.numbers.insertMany([
  { value: 1 }, { value: 2 }, { value: 3 },
  { value: 4 }, { value: 5 }, { value: 6 },
  { value: 7 }, { value: 8 }, { value: 9 },
  { value: 10 }
])
// Skip the first 5 documents
db.numbers.aggregate([
  { $skip: 5 }
])
Output:
{ "_id": ObjectId("..."), "value": 6 }
{ "_id": ObjectId("..."), "value": 7 }
{ "_id": ObjectId("..."), "value": 8 }
{ "_id": ObjectId("..."), "value": 9 }
{ "_id": ObjectId("..."), "value": 10 }
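To make the semantics concrete, the same behavior can be mimicked on a plain JavaScript array with no database involved. This is only an in-memory illustration: the helper name skipDocs is hypothetical and is not part of MongoDB.

```javascript
// In-memory illustration of $skip semantics; skipDocs is a made-up helper, not a MongoDB API.
function skipDocs(docs, n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("skip count must be a non-negative integer");
  }
  // Drop the first n documents and pass the rest along unchanged.
  return docs.slice(n);
}

// Same sample data as the numbers collection: values 1 through 10, in order.
const numbers = Array.from({ length: 10 }, (_, i) => ({ value: i + 1 }));

console.log(skipDocs(numbers, 5).map((d) => d.value).join(", ")); // 6, 7, 8, 9, 10
console.log(skipDocs(numbers, 25).length); // 0 (skipping past the end yields no documents)
```

Skipping more documents than exist is not an error; the stage simply passes nothing on to the next stage.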
Common Use Case: Pagination
One of the most common applications of the $skip stage is implementing pagination in web applications. By combining $skip with $limit, you can build a simple pagination system:
const pageSize = 10; // Number of documents per page
const pageNumber = 3; // Page number (1-based)
const skip = pageSize * (pageNumber - 1);
db.products.aggregate([
  // Any filtering or matching can go here
  // Sort on a unique field so pages stay consistent between queries
  { $sort: { _id: 1 } },
  { $skip: skip },
  { $limit: pageSize }
])
This will retrieve the third page of products, skipping the first 20 documents (10 per page × (3-1) pages) and returning the next 10 documents.
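The page arithmetic can be wrapped in a small helper that also rejects invalid input. The following is a minimal sketch: the function name paginationStages and the default sort on _id are assumptions made for illustration, not something the pipeline requires.

```javascript
// Hypothetical helper: turns a 1-based page number into pipeline stages.
function paginationStages(pageNumber, pageSize, sortSpec = { _id: 1 }) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return [
    { $sort: sortSpec },                    // stable order before skipping
    { $skip: pageSize * (pageNumber - 1) }, // documents on earlier pages
    { $limit: pageSize },                   // documents on this page
  ];
}

console.log(JSON.stringify(paginationStages(3, 10)));
// [{"$sort":{"_id":1}},{"$skip":20},{"$limit":10}]

// In mongosh the stages could then be passed along, for example:
// db.products.aggregate(paginationStages(3, 10))
```

Validating before building the pipeline keeps a page number of 0 or a negative value from ever producing a negative $skip.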
Best Practices and Performance Considerations
While $skip is useful, there are some important considerations to keep in mind:
- Performance Impact: Using large values for $skip can be inefficient because MongoDB must still process all the skipped documents. As $skip values increase, performance may degrade.
- Position with $sort: When using $skip with $sort, always place the $sort stage before the $skip and $limit stages to ensure consistent results.
- Alternative for Large Datasets: For large datasets, consider cursor-based pagination instead of $skip. This approach uses field values from the last document of the current page to query the next page.
// Instead of skip-based pagination for large collections:
// lastSeenPrice is the price of the final document on the previous page
db.products.aggregate([
  { $match: { price: { $gt: lastSeenPrice } } },
  { $sort: { price: 1 } },
  { $limit: pageSize }
])
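One caveat with paging on a single field such as price is that documents sharing the same price can be dropped at a page boundary, because $gt excludes every document with the last seen value. A common remedy is to add a unique tie-breaker to both the filter and the sort. The sketch below assumes _id serves as that tie-breaker; the helper name nextPagePipeline and the shape of its lastSeen argument are hypothetical.

```javascript
// Hypothetical helper: builds a cursor-based page query ordered by price,
// with _id as a tie-breaker so equal prices are neither skipped nor repeated.
function nextPagePipeline(lastSeen, pageSize) {
  const stages = [];
  if (lastSeen) {
    stages.push({
      $match: {
        $or: [
          // Strictly more expensive than the last document seen...
          { price: { $gt: lastSeen.price } },
          // ...or the same price, but later in _id order.
          { price: lastSeen.price, _id: { $gt: lastSeen._id } },
        ],
      },
    });
  }
  stages.push({ $sort: { price: 1, _id: 1 } }, { $limit: pageSize });
  return stages;
}

// First page: no filter yet. Later pages: pass the last document of the previous page.
console.log(nextPagePipeline(null, 10).length);                      // 2
console.log(nextPagePipeline({ price: 19.99, _id: 42 }, 10).length); // 3
```

The sort order must match the comparison in the filter, which is why both use price first and _id second.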
Advanced Example: Report Generation with Skip
Imagine you're generating reports for an e-commerce platform and want to exclude the first week of data from a month. Because $skip counts documents rather than days, this example assumes the sales collection holds exactly one summary document per day:
db.sales.aggregate([
  { $match: {
      date: {
        $gte: ISODate("2023-07-01"),
        $lt: ISODate("2023-08-01")
      }
    }
  },
  { $sort: { date: 1 } },
  // Skip the first 7 documents (the first 7 days, given one document per day)
  { $skip: 7 },
  { $group: {
      _id: null,
      totalSales: { $sum: "$amount" },
      averageSale: { $avg: "$amount" },
      count: { $sum: 1 }
    }
  }
])
This aggregation pipeline:
- Matches sales from July 2023
- Sorts them by date
- Skips the first 7 documents, which cover the first week under the one-document-per-day assumption
- Groups the remaining data to calculate total sales, average sale, and count
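When a day can contain any number of sales documents, filtering on the date itself is a more direct way to drop the first week than counting documents. The following sketch builds such a pipeline; the helper name reportPipeline and its parameters are hypothetical, and plain Date objects stand in for ISODate so the snippet also runs outside mongosh.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: excludes the first N days of a period by date, not by document count.
function reportPipeline(periodStart, periodEndExclusive, daysToExclude) {
  const firstIncludedDay = new Date(periodStart.getTime() + daysToExclude * DAY_MS);
  return [
    { $match: { date: { $gte: firstIncludedDay, $lt: periodEndExclusive } } },
    { $group: {
        _id: null,
        totalSales: { $sum: "$amount" },
        averageSale: { $avg: "$amount" },
        count: { $sum: 1 }
      }
    },
  ];
}

const pipeline = reportPipeline(
  new Date("2023-07-01T00:00:00Z"),
  new Date("2023-08-01T00:00:00Z"),
  7
);
console.log(pipeline[0].$match.date.$gte.toISOString()); // 2023-07-08T00:00:00.000Z
```

No $sort or $skip stage is needed in this variant, since the $match stage already removes the first seven days regardless of how many documents each day holds.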
Combining with Other Stages
$skip becomes even more powerful when combined with other aggregation stages. Here's how it fits into a more complex pipeline:
db.orders.aggregate([
  // Stage 1: Filter orders by status
  { $match: { status: "completed" } },
  // Stage 2: Sort by order date
  { $sort: { orderDate: -1 } },
  // Stage 3: Skip the first 50 orders
  { $skip: 50 },
  // Stage 4: Limit to 10 orders
  { $limit: 10 },
  // Stage 5: Project only necessary fields
  { $project: {
      orderId: 1,
      customerName: 1,
      totalAmount: 1,
      shippingAddress: 1,
      orderDate: 1,
      _id: 0
    }
  }
])
This pipeline retrieves the 51st to 60th most recent completed orders with only the specified fields.
Common Mistakes to Avoid
- Skipping a negative number of documents: The $skip value cannot be negative.
// This will throw an error
db.collection.aggregate([
  { $skip: -5 }
])
- Using $skip without $sort for pagination: Without sorting, the documents skipped may not be consistent between queries.
- Placing $skip after $limit: This sequence won't work as expected since $limit would restrict documents before $skip can process them.
// Incorrect order
db.collection.aggregate([
  { $limit: 10 },
  { $skip: 5 } // This will only get 5 documents (10 - 5)
])
// Correct order
db.collection.aggregate([
  { $skip: 5 },
  { $limit: 10 } // This will skip 5 and get the next 10
])
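The difference between the two orderings can be checked on a plain array, where slice plays the role of both stages. This is an in-memory analogy only, assuming 20 documents numbered 1 to 20 that arrive in order.

```javascript
// Stand-in for 20 ordered documents.
const docs = Array.from({ length: 20 }, (_, i) => i + 1);

// $limit: 10 followed by $skip: 5 keeps the first 10, then drops 5 of those.
const limitThenSkip = docs.slice(0, 10).slice(5);

// $skip: 5 followed by $limit: 10 drops 5, then keeps the next 10.
const skipThenLimit = docs.slice(5).slice(0, 10);

console.log(limitThenSkip.join(", ")); // 6, 7, 8, 9, 10
console.log(skipThenLimit.join(", ")); // 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
```

Both orderings start at the sixth document, but only the second returns a full page of 10.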
Real-World Application: Blog Post Pagination
Let's implement a complete blog post pagination system:
function getBlogPosts(page, postsPerPage, category = null) {
  const pipeline = [];
  // Apply category filter if provided
  if (category) {
    pipeline.push({ $match: { categories: category } });
  }
  // Sort by publication date, newest first
  pipeline.push({ $sort: { publishedDate: -1 } });
  // Apply pagination
  pipeline.push(
    { $skip: (page - 1) * postsPerPage },
    { $limit: postsPerPage }
  );
  // Project only necessary fields
  pipeline.push({
    $project: {
      title: 1,
      slug: 1,
      excerpt: 1,
      author: 1,
      publishedDate: 1,
      readTimeMinutes: 1,
      tags: 1,
      _id: 0
    }
  });
  return db.blogPosts.aggregate(pipeline);
}
// Usage:
// getBlogPosts(2, 10, "technology")
This function will:
- Optionally filter posts by category
- Sort by publication date (newest first)
- Implement pagination using $skip and $limit
- Return only the fields needed for the blog listing
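A listing page often also needs the total number of matching posts in order to render page links. One possible approach, sketched here on the assumption that a single round trip is preferred, is to run the page query and a count side by side with the $facet and $count stages. The function name blogPagePipeline and the output field names posts and total are hypothetical.

```javascript
// Hypothetical variant: returns one result document holding the page and the total count.
function blogPagePipeline(page, postsPerPage, category = null) {
  const pipeline = [];
  if (category) {
    pipeline.push({ $match: { categories: category } });
  }
  // Sort before $facet so both branches see the same ordered input.
  pipeline.push({ $sort: { publishedDate: -1 } });
  pipeline.push({
    $facet: {
      // Branch 1: the requested page.
      posts: [
        { $skip: (page - 1) * postsPerPage },
        { $limit: postsPerPage },
      ],
      // Branch 2: how many posts matched in total.
      total: [{ $count: "count" }],
    },
  });
  return pipeline;
}

console.log(blogPagePipeline(2, 10, "technology").length); // 3

// In mongosh, for example:
// db.blogPosts.aggregate(blogPagePipeline(2, 10, "technology"))
```

Note that $facet returns its output as a single document, so this pattern suits modest page sizes; whether it outperforms two separate queries depends on the data and is worth measuring.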
Summary
The $skip stage in MongoDB's aggregation pipeline is an essential tool for:
- Implementing pagination in applications
- Excluding a specific number of documents from processing
- Creating batch processing jobs
- Working with data subsets
While $skip is powerful and easy to use, remember that skipping large numbers of documents can impact performance. For optimal results, especially with large datasets, consider using cursor-based pagination or other techniques that utilize indexes more effectively.
Further Learning
To practice your understanding of the $skip stage, try these exercises:
- Create a collection with 100 documents and practice retrieving different pages using $skip and $limit
- Implement a cursor-based pagination system and compare its performance with skip-based pagination
- Use $skip in a complex aggregation pipeline that includes $match, $sort, $group, and $project stages
With these tools and techniques, you can efficiently control which documents flow through your aggregation pipelines and build powerful, performant MongoDB applications.