MongoDB $project Stage
Introduction
The $project
stage is one of the most versatile and commonly used operators in MongoDB's aggregation pipeline. It allows you to reshape documents by including, excluding, or transforming fields to create the exact document structure you want in your results. Think of $project
as a document sculptor that lets you shape your data to fit your application's needs.
In this tutorial, we'll explore how to use the $project
stage effectively to:
- Include or exclude specific fields
- Rename fields
- Create computed fields
- Transform data types
- Work with arrays and nested objects
Basic Syntax
The basic syntax of the $project
stage is straightforward:
{ $project: { <specification(s)> } }
Where <specification(s)>
defines which fields to include, exclude, or transform in the output documents.
Including and Excluding Fields
Including Specific Fields
To include only specific fields in the output, set the field value to 1
:
db.collection.aggregate([
{ $project: {
field1: 1,
field2: 1,
_id: 0 // Explicitly exclude _id field
}
}
])
Note: By default, the _id
field is included in the output. To exclude it, you must explicitly set it to 0
.
Example: Including Specific Fields
Let's say we have a collection of books:
// Sample document in books collection
{
_id: ObjectId("5f7d1bd0c36d9c2c6a4b9c01"),
title: "MongoDB: The Definitive Guide",
author: "Shannon Bradshaw",
publishYear: 2019,
pages: 514,
category: "Database",
publisher: "O'Reilly Media"
}
To create a simplified view with just the title and author:
db.books.aggregate([
{
$project: {
_id: 0,
title: 1,
author: 1
}
}
])
Output:
{
"title": "MongoDB: The Definitive Guide",
"author": "Shannon Bradshaw"
}
Excluding Specific Fields
To exclude specific fields, set them to 0
:
db.collection.aggregate([
{ $project: {
fieldToExclude1: 0,
fieldToExclude2: 0
}
}
])
Important: You cannot mix inclusion and exclusion in the same $project
stage, except for the _id
field.
Example: Excluding Fields
To keep all fields except publisher and pages from our books collection:
db.books.aggregate([
{
$project: {
publisher: 0,
pages: 0
}
}
])
Output:
{
"_id": ObjectId("5f7d1bd0c36d9c2c6a4b9c01"),
"title": "MongoDB: The Definitive Guide",
"author": "Shannon Bradshaw",
"publishYear": 2019,
"category": "Database"
}
Renaming Fields
To rename a field, you can use the following syntax:
db.collection.aggregate([
{
$project: {
newFieldName: "$oldFieldName"
}
}
])
Example: Renaming Fields
Let's rename the publishYear
field to year
in our books collection:
db.books.aggregate([
{
$project: {
_id: 0,
title: 1,
author: 1,
year: "$publishYear"
}
}
])
Output:
{
"title": "MongoDB: The Definitive Guide",
"author": "Shannon Bradshaw",
"year": 2019
}
Creating Computed Fields
The $project
stage can create computed fields using expressions:
db.collection.aggregate([
{
$project: {
computedField: { $expression }
}
}
])
Example: Computing Fields with Arithmetic Operations
Let's calculate the book's age and add a boolean flag for books older than 5 years:
db.books.aggregate([
{
$project: {
_id: 0,
title: 1,
author: 1,
yearPublished: "$publishYear",
bookAge: { $subtract: [2023, "$publishYear"] },
isOlderThan5Years: { $gt: [{ $subtract: [2023, "$publishYear"] }, 5] }
}
}
])
Output:
{
"title": "MongoDB: The Definitive Guide",
"author": "Shannon Bradshaw",
"yearPublished": 2019,
"bookAge": 4,
"isOlderThan5Years": false
}
String Operations
The $project
stage also supports string operations:
db.books.aggregate([
{
$project: {
_id: 0,
title: 1,
author: 1,
publishYear: 1,
titleLength: { $strLenCP: "$title" },
titleUpperCase: { $toUpper: "$title" },
authorInitials: {
$concat: [
{ $substr: [{ $toUpper: "$author" }, 0, 1] },
"."
]
}
}
}
])
Output:
{
"title": "MongoDB: The Definitive Guide",
"author": "Shannon Bradshaw",
"publishYear": 2019,
"titleLength": 29,
"titleUpperCase": "MONGODB: THE DEFINITIVE GUIDE",
"authorInitials": "S."
}
Conditional Fields
You can use conditional expressions to determine field values:
db.books.aggregate([
{
$project: {
_id: 0,
title: 1,
author: 1,
publishYear: 1,
category: 1,
ageCategory: {
$cond: {
if: { $lt: ["$publishYear", 2015] },
then: "Older Book",
else: "Recent Book"
}
},
recommendationLevel: {
$switch: {
branches: [
{ case: { $eq: ["$category", "Database"] }, then: "Highly Recommended" },
{ case: { $eq: ["$category", "Programming"] }, then: "Recommended" }
],
default: "Optional Reading"
}
}
}
}
])
Output:
{
"title": "MongoDB: The Definitive Guide",
"author": "Shannon Bradshaw",
"publishYear": 2019,
"category": "Database",
"ageCategory": "Recent Book",
"recommendationLevel": "Highly Recommended"
}
Working with Arrays
The $project
stage provides powerful operators for array manipulation:
Array Element Access
db.courses.aggregate([
{
$project: {
_id: 0,
courseTitle: 1,
firstTopic: { $arrayElemAt: ["$topics", 0] },
lastTopic: { $arrayElemAt: ["$topics", -1] }
}
}
])
Array Transformation
Let's say we have a collection of courses with an array of topics:
// Sample document in courses collection
{
_id: ObjectId("5f7d1c40c36d9c2c6a4b9c02"),
courseTitle: "MongoDB Fundamentals",
instructor: "John Doe",
topics: ["Introduction", "CRUD Operations", "Aggregation", "Indexes", "Replication"],
durationHours: 15
}
We can transform the topics array:
db.courses.aggregate([
{
$project: {
_id: 0,
courseTitle: 1,
instructor: 1,
topicCount: { $size: "$topics" },
topicsUpperCase: { $map: {
input: "$topics",
as: "topic",
in: { $toUpper: "$$topic" }
}},
hasCRUDTopic: { $in: ["CRUD Operations", "$topics"] },
firstThreeTopics: { $slice: ["$topics", 0, 3] }
}
}
])
Output:
{
"courseTitle": "MongoDB Fundamentals",
"instructor": "John Doe",
"topicCount": 5,
"topicsUpperCase": ["INTRODUCTION", "CRUD OPERATIONS", "AGGREGATION", "INDEXES", "REPLICATION"],
"hasCRUDTopic": true,
"firstThreeTopics": ["Introduction", "CRUD Operations", "Aggregation"]
}
Working with Nested Objects
You can access and transform fields in nested objects using dot notation:
// Sample document in students collection
{
_id: ObjectId("5f7d1c80c36d9c2c6a4b9c03"),
name: "Alice Johnson",
contact: {
email: "[email protected]",
phone: "555-123-4567",
address: {
street: "123 Main St",
city: "New York",
zipcode: "10001"
}
},
grades: [85, 92, 78, 95]
}
We can extract and transform nested fields:
db.students.aggregate([
{
$project: {
_id: 0,
studentName: "$name",
email: "$contact.email",
city: "$contact.address.city",
fullAddress: {
$concat: [
"$contact.address.street", ", ",
"$contact.address.city", ", ",
"$contact.address.zipcode"
]
},
averageGrade: { $avg: "$grades" }
}
}
])
Output:
{
"studentName": "Alice Johnson",
"email": "[email protected]",
"city": "New York",
"fullAddress": "123 Main St, New York, 10001",
"averageGrade": 87.5
}
Real-world Example: E-commerce Data Analysis
Let's look at how $project
can be used in a real-world e-commerce application to prepare data for reports and analytics:
// Sample document in orders collection
{
_id: ObjectId("5f7d1cc0c36d9c2c6a4b9c04"),
orderId: "ORD12345",
customerId: "CUST987",
orderDate: ISODate("2023-03-15T14:30:00Z"),
items: [
{ product: "Laptop", price: 1299.99, quantity: 1 },
{ product: "Mouse", price: 24.99, quantity: 2 },
{ product: "Keyboard", price: 89.99, quantity: 1 }
],
shippingAddress: {
street: "456 Oak Ave",
city: "San Francisco",
state: "CA",
zipcode: "94107"
},
paymentMethod: "Credit Card",
shippingCost: 15.99,
taxAmount: 114.95
}
Now, let's create a sales report with the $project
stage:
db.orders.aggregate([
{
$project: {
_id: 0,
orderId: 1,
orderDate: 1,
customer: "$customerId",
location: {
city: "$shippingAddress.city",
state: "$shippingAddress.state"
},
itemCount: { $size: "$items" },
itemsSold: {
$map: {
input: "$items",
as: "item",
in: {
product: "$$item.product",
subtotal: { $multiply: ["$$item.price", "$$item.quantity"] }
}
}
},
subtotal: {
$reduce: {
input: "$items",
initialValue: 0,
in: {
$add: [
"$$value",
{ $multiply: ["$$this.price", "$$this.quantity"] }
]
}
}
},
totalAmount: {
$add: [
{
$reduce: {
input: "$items",
initialValue: 0,
in: {
$add: [
"$$value",
{ $multiply: ["$$this.price", "$$this.quantity"] }
]
}
}
},
"$shippingCost",
"$taxAmount"
]
},
orderMonth: { $month: "$orderDate" },
orderYear: { $year: "$orderDate" }
}
}
])
Output:
{
"orderId": "ORD12345",
"orderDate": ISODate("2023-03-15T14:30:00Z"),
"customer": "CUST987",
"location": {
"city": "San Francisco",
"state": "CA"
},
"itemCount": 3,
"itemsSold": [
{ "product": "Laptop", "subtotal": 1299.99 },
{ "product": "Mouse", "subtotal": 49.98 },
{ "product": "Keyboard", "subtotal": 89.99 }
],
"subtotal": 1439.96,
"totalAmount": 1570.90,
"orderMonth": 3,
"orderYear": 2023
}
This transformed document is now perfect for generating sales reports, with calculated totals, extracted location information, and structured product data.
Placement in Aggregation Pipeline
The $project
stage can be used multiple times throughout an aggregation pipeline. This allows you to reshape documents progressively as they move through the pipeline.
For example:
db.orders.aggregate([
// Filter for orders from 2023
{ $match: { orderDate: { $gte: ISODate("2023-01-01"), $lt: ISODate("2024-01-01") } } },
// First project stage: extract fields needed for grouping
{ $project: {
customerId: 1,
orderMonth: { $month: "$orderDate" },
totalAmount: { $add: [
{ $sum: { $map: { input: "$items", as: "item", in: { $multiply: ["$$item.price", "$$item.quantity"] } } } },
"$shippingCost",
"$taxAmount"
]}
}
},
// Group by customer and month
{ $group: {
_id: { customer: "$customerId", month: "$orderMonth" },
totalSpending: { $sum: "$totalAmount" },
orderCount: { $sum: 1 }
}
},
// Second project stage: final formatting
{ $project: {
_id: 0,
customerId: "$_id.customer",
month: "$_id.month",
totalSpending: { $round: ["$totalSpending", 2] },
orderCount: 1,
averageOrderValue: { $round: [{ $divide: ["$totalSpending", "$orderCount"] }, 2] }
}
},
// Sort by customer and month
{ $sort: { customerId: 1, month: 1 } }
])
Performance Considerations
-
Use early in the pipeline: Using
$project
early in the pipeline to reduce document size can improve performance by decreasing the amount of data processed by subsequent stages. -
Balance with readability: Multiple
$project
stages might improve code readability but could impact performance. For complex transformations, consider combining multiple operations into a single$project
stage. -
Avoid unnecessary projections: Only include the fields you need for subsequent stages or your final result.
Common Pitfalls
-
Mixing inclusion and exclusion: You cannot mix
1
and0
values in the same$project
stage, except for the_id
field. -
Missing dollar signs: When referencing a field, remember to use the dollar sign prefix (
$fieldName
). -
Overwriting fields: Be careful when computing fields with the same name as existing fields, as they will be overwritten.
Summary
The $project
stage is a powerful tool in MongoDB's aggregation framework that allows you to reshape your documents in numerous ways:
- Include or exclude specific fields
- Rename fields for more meaningful output
- Create computed fields using various operators
- Perform string operations and conditional logic
- Transform arrays and nested objects
This flexibility makes $project
essential for preparing your data for reporting, visualization, or further processing within your application.
Exercises
-
Given a collection of products with fields
name
,category
,price
, andstock
, use$project
to create a document with the name, price, a boolean fieldinStock
(true if stock > 0), and a string fieldpriceCategory
("Budget" if price < 50, "Mid-range" if price is between 50 and 200, "Premium" otherwise). -
For a collection of blog posts with fields
title
,content
,author
, andtags
(an array), use$project
to create a document with the title, author, tag count, and apreview
field that contains the first 100 characters of the content followed by "...". -
Create a
$project
stage that converts a document with temperatures in Celsius to Fahrenheit, using the formula F = C * 9/5 + 32.
Additional Resources
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)