MongoDB Aggregation Expressions
In MongoDB's aggregation framework, expressions are the building blocks that allow you to transform, evaluate, and manipulate data as it flows through your aggregation pipeline. They provide the logic for calculations, comparisons, and data transformations, giving you powerful tools to shape your data exactly how you need it.
Introduction to Aggregation Expressions
Aggregation expressions are used within the stages of an aggregation pipeline to operate on data. They can:
- Perform mathematical operations
- Manipulate strings
- Work with arrays
- Convert data types
- Evaluate conditions
- Format dates and times
Unlike simple field references in queries, aggregation expressions provide a rich syntax for complex transformations and calculations.
Basic Expression Syntax
In MongoDB, expressions are typically written using an operator preceded by a dollar sign ($
). Expressions can be:
- Field paths - References to fields in the documents (e.g.,
$fieldName
) - Operators - Functions that perform operations (e.g.,
$sum
,$concat
) - Literals - Constant values like numbers, strings, booleans
- Expression objects - Combinations of operators and values
Let's see a simple example:
// Example document
{
"_id": 1,
"name": "Product A",
"price": 25,
"quantity": 10
}
// Aggregation with expression
db.products.aggregate([
{
$project: {
name: 1,
totalValue: { $multiply: ["$price", "$quantity"] }
}
}
])
// Output
{
"_id": 1,
"name": "Product A",
"totalValue": 250
}
In this example, { $multiply: ["$price", "$quantity"] }
is an expression that multiplies the values from two fields.
Types of Expressions
Arithmetic Expressions
MongoDB provides operators for common mathematical operations:
db.sales.aggregate([
{
$project: {
item: 1,
total: { $add: ["$price", "$tax"] },
discountedPrice: { $subtract: ["$price", "$discount"] },
quantity: 1,
totalRevenue: { $multiply: ["$price", "$quantity"] },
unitPrice: { $divide: ["$totalAmount", "$quantity"] },
modResult: { $mod: ["$value", 10] }
}
}
])
String Expressions
For manipulating text data:
db.customers.aggregate([
{
$project: {
_id: 0,
fullName: { $concat: ["$firstName", " ", "$lastName"] },
nameLength: { $strLenCP: "$firstName" },
upperName: { $toUpper: "$firstName" },
lowerName: { $toLower: "$lastName" },
substring: { $substr: ["$firstName", 0, 1] }
}
}
])
Boolean Expressions
For logical operations and comparisons:
db.inventory.aggregate([
{
$project: {
item: 1,
qty: 1,
status: {
$cond: {
if: { $gte: ["$qty", 20] },
then: "In Stock",
else: "Low Stock"
}
},
needsRestock: { $lt: ["$qty", 10] }
}
}
])
Array Expressions
For working with array fields:
db.orders.aggregate([
{
$project: {
orderId: 1,
itemCount: { $size: "$items" },
firstItem: { $arrayElemAt: ["$items", 0] },
hasSpecialItems: { $in: ["special", "$tags"] },
allItems: { $concatArrays: ["$mainItems", "$addOns"] }
}
}
])
Date Expressions
For working with dates:
db.sales.aggregate([
{
$project: {
date: 1,
year: { $year: "$date" },
month: { $month: "$date" },
day: { $dayOfMonth: "$date" },
dayOfWeek: { $dayOfWeek: "$date" },
formattedDate: {
$dateToString: { format: "%Y-%m-%d", date: "$date" }
}
}
}
])
Variable Expressions
The $let
operator lets you define variables for use within an expression:
db.orders.aggregate([
{
$project: {
order: 1,
total: {
$let: {
vars: {
basePrice: "$price",
taxRate: 0.08
},
in: {
$sum: [
"$$basePrice",
{ $multiply: ["$$basePrice", "$$taxRate"] }
]
}
}
}
}
}
])
Note the double dollar sign ($$
) used to refer to variables.
Conditional Expressions
MongoDB offers several ways to perform conditional logic:
$cond
db.inventory.aggregate([
{
$project: {
item: 1,
discount: {
$cond: {
if: { $gt: ["$qty", 250] },
then: 30,
else: 20
}
}
}
}
])
$switch
db.products.aggregate([
{
$project: {
name: 1,
category: 1,
discount: {
$switch: {
branches: [
{ case: { $eq: ["$category", "electronics"] }, then: 0.05 },
{ case: { $eq: ["$category", "books"] }, then: 0.10 },
{ case: { $eq: ["$category", "clothing"] }, then: 0.15 }
],
default: 0
}
}
}
}
])
Advanced Expression Examples
Complex Calculations
db.sales.aggregate([
{
$project: {
date: 1,
product: 1,
revenue: { $multiply: ["$price", "$quantity"] },
cost: { $multiply: ["$costPerUnit", "$quantity"] },
profit: {
$subtract: [
{ $multiply: ["$price", "$quantity"] },
{ $multiply: ["$costPerUnit", "$quantity"] }
]
},
profitMargin: {
$divide: [
{
$subtract: [
{ $multiply: ["$price", "$quantity"] },
{ $multiply: ["$costPerUnit", "$quantity"] }
]
},
{ $multiply: ["$price", "$quantity"] }
]
}
}
}
])
String Manipulation for Reporting
db.employees.aggregate([
{
$project: {
employeeId: 1,
formattedName: {
$concat: [
{ $toUpper: { $substr: ["$lastName", 0, 1] } },
{ $toLower: { $substr: ["$lastName", 1, { $subtract: [{ $strLenCP: "$lastName" }, 1] }] } },
", ",
{ $toUpper: { $substr: ["$firstName", 0, 1] } },
{ $toLower: { $substr: ["$firstName", 1, { $subtract: [{ $strLenCP: "$firstName" }, 1] }] } }
]
},
emailAddress: {
$concat: [
{ $toLower: { $substr: ["$firstName", 0, 1] } },
{ $toLower: "$lastName" },
"@company.com"
]
}
}
}
])
Real-World Applications
E-commerce Order Analysis
db.orders.aggregate([
{
$project: {
orderId: 1,
customer: 1,
orderDate: 1,
items: 1,
totalItems: { $size: "$items" },
subtotal: { $sum: "$items.price" },
tax: { $multiply: [{ $sum: "$items.price" }, 0.08] },
shipping: {
$cond: {
if: { $gte: [{ $sum: "$items.price" }, 100] },
then: 0,
else: 9.99
}
},
grandTotal: {
$add: [
{ $sum: "$items.price" },
{ $multiply: [{ $sum: "$items.price" }, 0.08] },
{
$cond: {
if: { $gte: [{ $sum: "$items.price" }, 100] },
then: 0,
else: 9.99
}
}
]
},
shippingStatus: {
$switch: {
branches: [
{ case: { $eq: ["$status", "shipped"] }, then: "On the way" },
{ case: { $eq: ["$status", "processing"] }, then: "Preparing to ship" },
{ case: { $eq: ["$status", "pending"] }, then: "Payment processing" }
],
default: "Status unknown"
}
},
expectedDelivery: {
$dateToString: {
format: "%Y-%m-%d",
date: {
$add: [
"$orderDate",
{ $multiply: [1000 * 60 * 60 * 24 * 5, 1] } // 5 days in milliseconds
]
}
}
}
}
}
])
Data Analytics - Customer Segmentation
db.customers.aggregate([
{
$project: {
name: 1,
email: 1,
totalSpent: 1,
orderCount: 1,
avgOrderValue: { $divide: ["$totalSpent", "$orderCount"] },
customerSegment: {
$switch: {
branches: [
{
case: {
$and: [
{ $gt: ["$totalSpent", 1000] },
{ $gt: ["$orderCount", 10] }
]
},
then: "VIP"
},
{
case: {
$or: [
{ $gt: ["$totalSpent", 500] },
{ $gt: ["$orderCount", 5] }
]
},
then: "Regular"
}
],
default: "New"
}
},
accountAge: {
$divide: [
{ $subtract: [new Date(), "$createdAt"] },
(1000 * 60 * 60 * 24) // Convert ms to days
]
},
loyaltyScore: {
$multiply: [
{ $divide: ["$totalSpent", 100] }, // Points per $100 spent
{
$add: [
1,
{
$multiply: [
0.1,
{ $divide: [{ $subtract: [new Date(), "$createdAt"] }, (1000 * 60 * 60 * 24 * 30)] } // 10% bonus per month
]
}
]
}
]
}
}
}
])
Expression Operators Reference
Here's a quick reference for some commonly used expression operators:
Arithmetic Operators
$add
: Addition$subtract
: Subtraction$multiply
: Multiplication$divide
: Division$mod
: Modulo
String Operators
$concat
: Concatenates strings$substr
: Returns a substring$toUpper
: Converts to uppercase$toLower
: Converts to lowercase$strLenCP
: Returns string length
Array Operators
$size
: Returns the array length$filter
: Selects a subset of an array$map
: Applies an expression to each item in an array$arrayElemAt
: Returns an element at a specified array index$concatArrays
: Concatenates arrays
Logical Operators
$and
: Logical AND$or
: Logical OR$not
: Logical NOT$cond
: Conditional expression
Comparison Operators
$eq
: Equal to$ne
: Not equal to$gt
: Greater than$gte
: Greater than or equal to$lt
: Less than$lte
: Less than or equal to
Date Operators
$year
: Extracts the year$month
: Extracts the month$dayOfMonth
: Extracts the day of the month$dayOfWeek
: Returns the day of the week (1-7)$dateToString
: Formats a date as a string
Best Practices for Aggregation Expressions
-
Use field references correctly: Always prefix field names with
$
when referring to document fields, e.g.,"$price"
. -
Be mindful of data types: MongoDB operators expect specific types. Use type conversion operators like
$toInt
,$toString
when needed. -
Handle missing fields: Use
$ifNull
to handle cases where fields might be missing. -
Break down complex expressions: For readability, use
$let
to define variables for complex calculations. -
Consider performance implications: Some expressions can be computationally expensive. Test with representative data volumes.
-
Check for empty arrays: Use
$size
and conditional operators to handle array operations safely.
Summary
MongoDB aggregation expressions are powerful tools for transforming and manipulating data within your aggregation pipelines. They allow you to:
- Perform mathematical calculations
- Manipulate strings and text
- Work with arrays and embedded documents
- Execute conditional logic
- Convert and format data
- Create complex derived fields
By mastering these expressions, you can transform your raw data into exactly the format you need for your application or analysis.
Additional Resources and Exercises
Exercises
-
Basic Math Exercise: Write an expression to calculate a 15% discount on products in a collection with price and quantity fields.
-
String Manipulation: Create an expression that formats user names from separate first and last name fields into the format "Last, First".
-
Array Processing: Write an expression that finds the most expensive item in an order's items array.
-
Conditional Logic: Create an expression that categorizes users by activity level, based on login count and last login date.
-
Advanced Challenge: Implement an aggregation pipeline that calculates the average order value per customer and ranks customers into percentiles.
Online Resources
- MongoDB Aggregation Pipeline Operators
- MongoDB University - Aggregation Framework
- MongoDB Developer Community Forums
Keep experimenting with expressions to unlock the full power of MongoDB's aggregation framework!
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)