MongoDB Collections
Introduction
In MongoDB, a collection is a grouping of MongoDB documents. It's analogous to a table in relational databases, but unlike a table, a collection doesn't enforce a specific schema. Collections are where your data lives within MongoDB databases, giving you the flexibility to store different types of documents in the same collection.
This flexibility is one of MongoDB's key strengths as a NoSQL database. While relational databases require you to define your table structure before inserting data, MongoDB collections allow you to insert documents with varying fields, making them ideal for scenarios where data structure might evolve over time.
Collections Fundamentals
What is a Collection?
A collection is a container for MongoDB documents. Here are the key characteristics:
- Schema-less: Documents in a collection can have different fields
- Document Storage: Each item in a collection is a BSON document
- Unique Identifier: Every document requires a unique _id field
- Organization: Collections help organize related documents together
Collection Naming Rules
When naming your collections, keep these rules in mind:
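- Cannot be an empty string ("")
- Cannot contain the null character (\0)
- Cannot begin with the system. prefix (reserved for internal collections)
- Cannot contain the $ character
- Should be kept short: the full namespace (database name, a dot, and the collection name) has a length limit, which is 255 bytes for unsharded collections on MongoDB 4.4 and later and 120 bytes on earlier versions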
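The character rules above are easy to check before a name ever reaches the server. The following is a minimal sketch in plain JavaScript; checkCollectionName is a hypothetical helper, not part of MongoDB or its drivers, and it leaves out the length rule because that limit depends on the server version.

```javascript
// Hypothetical helper (not a MongoDB API): returns a list of naming-rule
// violations for a candidate collection name. Length limits are not checked.
function checkCollectionName(name) {
  const problems = [];
  if (typeof name !== "string" || name.length === 0) {
    problems.push("must be a non-empty string");
    return problems; // nothing else to check
  }
  if (name.includes("\0")) {
    problems.push("must not contain the null character");
  }
  if (name.startsWith("system.")) {
    problems.push("must not begin with the system. prefix");
  }
  if (name.includes("$")) {
    problems.push("must not contain the $ character");
  }
  return problems;
}

console.log(checkCollectionName("customers"));    // no problems reported
console.log(checkCollectionName("system.users")); // reserved prefix
console.log(checkCollectionName("price$"));       // contains $
```

An empty result means the name passed these checks; the server remains the final authority on whether a name is accepted.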
Creating Collections
There are two ways to create collections in MongoDB:
1. Explicit Creation
You can explicitly create a collection using the createCollection() method:
db.createCollection("customers")
Output:
{ "ok" : 1 }
2. Implicit Creation
MongoDB will automatically create a collection when you first insert a document:
db.products.insertOne({ name: "Laptop", price: 999, category: "Electronics" })
Output:
{
  "acknowledged": true,
  "insertedId": ObjectId("60a2f738d58b952e1cf0f7a9")
}
In this example, if the products collection doesn't exist, MongoDB creates it automatically.
Working with Collections
Basic Collection Operations
Listing Collections
To see all collections in the current database:
db.getCollectionNames()
Output:
[ "customers", "products", "orders" ]
Dropping a Collection
To remove a collection and all its documents:
db.products.drop()
Output:
true
Collection Statistics
To get detailed information about a collection:
db.orders.stats()
Output:
{
  "ns": "myStore.orders",
  "count": 1423,
  "size": 425690,
  "avgObjSize": 299,
  "storageSize": 557056,
  "capped": false,
  "wiredTiger": { ... },
  "nindexes": 2,
  "totalIndexSize": 131072,
  "indexSizes": {
    "_id_": 65536,
    "order_date_1": 65536
  },
  "scaleFactor": 1,
  "ok": 1
}
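Several of these fields are related, and a quick calculation shows how. The sketch below copies the numbers from the sample output into a plain JavaScript object (no server needed) and checks two relationships: avgObjSize is approximately size divided by count, and totalIndexSize is the sum of the entries in indexSizes. The variable names are illustrative.

```javascript
// Values copied from the sample stats() output above.
const stats = {
  count: 1423,
  size: 425690,
  avgObjSize: 299,
  totalIndexSize: 131072,
  indexSizes: { _id_: 65536, order_date_1: 65536 },
};

// Average document size: total data size divided by the document count.
const avg = stats.size / stats.count;

// Sum of the per-index sizes, which should equal totalIndexSize.
const indexTotal = Object.values(stats.indexSizes).reduce((a, b) => a + b, 0);

console.log(avg.toFixed(2)); // 299.15, reported as 299 in avgObjSize
console.log(indexTotal);     // 131072
```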
Collection Options and Configuration
When creating collections explicitly, you can specify various configuration options.
Capped Collections
Capped collections are fixed-size collections that maintain insertion order and automatically remove the oldest documents when the size limit is reached:
db.createCollection("logs", {
  capped: true,
  size: 10000000, // 10 MB
  max: 10000      // maximum 10,000 documents
})
Capped collections are useful for:
- Logging applications
- Storing recent data like cache information
- Any scenario where you only care about the most recent entries
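To make the eviction behavior concrete, here is a toy in-memory model in plain JavaScript. It is a teaching aid under simplifying assumptions (it caps by document count only and ignores the byte-size limit), not a description of how MongoDB implements capped collections; the class name ToyCappedCollection is made up for this example.

```javascript
// Toy model of a capped collection limited by document count.
// Not MongoDB's implementation; for illustration only.
class ToyCappedCollection {
  constructor(max) {
    this.max = max;  // maximum number of documents to keep
    this.docs = [];  // stored in insertion order
  }

  insertOne(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, drop documents from the front (the oldest).
    while (this.docs.length > this.max) {
      this.docs.shift();
    }
  }

  find() {
    return [...this.docs]; // copy, still in insertion order
  }
}

const logs = new ToyCappedCollection(3);
for (let i = 1; i <= 5; i++) {
  logs.insertOne({ seq: i, msg: `event ${i}` });
}
console.log(logs.find().map((d) => d.seq)); // [ 3, 4, 5 ]
```

After five inserts into a collection capped at three documents, only the three most recent remain, in the order they were inserted.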
Validation Rules
You can enforce document validation rules on a collection:
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email", "status"],
      properties: {
        name: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        email: {
          bsonType: "string",
          pattern: "^.+@.+$",
          description: "must be a valid email address"
        },
        status: {
          enum: ["Active", "Inactive", "Pending"],
          description: "can only be one of the enum values"
        }
      }
    }
  },
  validationLevel: "moderate",
  validationAction: "warn"
})
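This example creates a collection with validation rules that:
- Require the name, email, and status fields
- Validate that email contains an @ character with at least one character on each side
- Ensure status is one of three allowed values
Because validationAction is set to "warn", MongoDB logs violations but still accepts the write; set it to "error" (the default) to reject invalid documents. With validationLevel set to "moderate", the rules apply to inserts and to updates of documents that already satisfy them.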
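It can help to reason about which documents would pass before inserting them. The sketch below is a hand-written mirror of the three rules in plain JavaScript. MongoDB evaluates $jsonSchema on the server; checkUser is a hypothetical helper written for this example, and the sample documents are invented.

```javascript
// Hypothetical client-side mirror of the validator's rules.
// Returns a list of violations; an empty list means the document passes.
function checkUser(doc) {
  const errors = [];
  for (const field of ["name", "email", "status"]) {
    if (!(field in doc)) errors.push(`missing required field: ${field}`);
  }
  if ("name" in doc && typeof doc.name !== "string") {
    errors.push("name must be a string");
  }
  if ("email" in doc) {
    if (typeof doc.email !== "string") {
      errors.push("email must be a string");
    } else if (!/^.+@.+$/.test(doc.email)) {
      errors.push("email must match ^.+@.+$");
    }
  }
  if ("status" in doc && !["Active", "Inactive", "Pending"].includes(doc.status)) {
    errors.push("status must be Active, Inactive, or Pending");
  }
  return errors;
}

// Passes all three rules.
console.log(checkUser({ name: "Ana", email: "ana@example.com", status: "Active" }));

// Fails two rules: no @ in email, and "active" differs in case from "Active".
console.log(checkUser({ name: "Bo", email: "bo.example.com", status: "active" }));
```

The second document is a reminder that enum values are compared exactly, so "active" does not satisfy a rule that lists "Active".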
Collection Relationships and Design
In MongoDB, you need to carefully consider how to structure your collections based on your data relationships and access patterns.
Embedding vs. Referencing
MongoDB gives you two main options for modeling relationships:
- Embedded Documents: Store related data in a single document
- References: Store references between documents in different collections
Example of Embedding (One-to-Few)
When a user has a small number of addresses, embedding might be appropriate:
db.users.insertOne({
  name: "John Smith",
  email: "[email protected]",
  addresses: [
    {
      type: "home",
      street: "123 Main St",
      city: "Boston",
      state: "MA",
      zip: "02101"
    },
    {
      type: "work",
      street: "456 Corporate Ave",
      city: "Boston",
      state: "MA",
      zip: "02110"
    }
  ]
})
Example of Referencing (One-to-Many)
For a blog with many posts per author, references might work better:
// Authors collection
db.authors.insertOne({
  _id: ObjectId("60a2f738d58b952e1cf0f7b1"),
  name: "Jane Doe",
  bio: "Technology writer and researcher"
})
// Posts collection with reference to author
db.posts.insertOne({
  title: "Introduction to MongoDB Collections",
  content: "MongoDB collections are...",
  author_id: ObjectId("60a2f738d58b952e1cf0f7b1"),
  created_at: new Date()
})
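A reference has to be resolved at read time, for instance with the $lookup aggregation stage. The sketch below builds such a pipeline as a plain JavaScript array so it can be inspected without a server; the variable name postsWithAuthor and the choice of projected fields are illustrative assumptions, while the collection and field names follow the example above.

```javascript
// Aggregation pipeline that attaches the author document to each post.
const postsWithAuthor = [
  {
    $lookup: {
      from: "authors",         // collection to join with
      localField: "author_id", // field in posts
      foreignField: "_id",     // field in authors
      as: "author",            // output field (always an array)
    },
  },
  // Turn the one-element array into a single embedded document.
  // Posts whose author_id matches no author are dropped by this stage.
  { $unwind: "$author" },
  // Keep only the fields needed for a post listing.
  { $project: { title: 1, created_at: 1, "author.name": 1 } },
];

// In mongosh, against a running server:
//   db.posts.aggregate(postsWithAuthor)
console.log(JSON.stringify(postsWithAuthor, null, 2));
```

The extra stage is the cost of referencing: embedded data comes back with the parent document, whereas referenced data needs a join like this one or a second query.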
Real-World Collection Design Examples
E-commerce Application
For an e-commerce site, a product document might look like this:
{
  _id: ObjectId("60a2f738d58b952e1cf0f7c1"),
  name: "Wireless Headphones",
  price: 79.99,
  description: "Noise-cancelling wireless headphones with 20hr battery life",
  category: "Electronics",
  tags: ["audio", "wireless", "headphones"],
  inventory: 45,
  specifications: {
    weight: "250g",
    bluetooth: "5.0",
    color: "Black"
  },
  reviews: [
    { user_id: ObjectId("60a2f738d58b952e1cf0f8a1"), rating: 5, comment: "Great sound quality!" },
    { user_id: ObjectId("60a2f738d58b952e1cf0f8a2"), rating: 4, comment: "Good battery life but a bit heavy." }
  ]
}
Content Management System
For a CMS, you might structure collections like this:
// Articles collection
db.articles.insertOne({
  _id: ObjectId(),
  title: "MongoDB Collection Best Practices",
  slug: "mongodb-collection-best-practices",
  content: "When designing MongoDB collections...",
  author_id: ObjectId("60a2f738d58b952e1cf0f7d1"),
  status: "published",
  tags: ["mongodb", "database", "nosql"],
  published_date: new Date(),
  category_id: ObjectId("60a2f738d58b952e1cf0f7e1"),
  comments_count: 5
})
// Comments stored in a separate collection
db.comments.insertOne({
  _id: ObjectId(),
  article_id: ObjectId("60a2f738d58b952e1cf0f7f1"),
  user_id: ObjectId("60a2f738d58b952e1cf0f7d2"),
  content: "Great article! Very helpful explanation of collection design.",
  created_at: new Date(),
  likes: 3
})
Best Practices for Collections
- Design for your access patterns: Structure collections based on how you'll query and update data
- Use descriptive collection names: Make collection names plural and descriptive (e.g., users, products)
- Limit embedded document size: Keep in mind the 16 MB document size limit
- Consider document growth: Allow for fields being added to documents over time
- Create indexes: Add appropriate indexes to collections for better query performance
- Use capped collections for logs and temporary data
- Apply data validation: Use schema validation for critical collections
- Balance embedding vs. referencing: Don't over-embed; use references when appropriate
Performance Considerations
Indexing
Indexes are essential for collection performance:
// Create an index on the email field for quick user lookups
db.users.createIndex({ email: 1 }, { unique: true })
// Create a compound index for order queries
db.orders.createIndex({ user_id: 1, order_date: -1 })
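A compound index is most useful to queries that filter on a leading prefix of its keys. The sketch below illustrates that prefix rule in plain JavaScript. It is a deliberate simplification: the real query planner also weighs sort order, range conditions, and other candidate indexes, and matchedPrefixLength is a hypothetical helper, not a MongoDB function.

```javascript
// Hypothetical helper: counts how many leading keys of a compound index
// are covered by the fields a query filters on. 0 means the query does
// not filter on the index's first key.
function matchedPrefixLength(indexKeys, queryFields) {
  let matched = 0;
  for (const key of indexKeys) {
    if (!queryFields.includes(key)) break; // prefix ends at first gap
    matched++;
  }
  return matched;
}

// Keys of the compound index created above, in order.
const orderIndex = ["user_id", "order_date"];

console.log(matchedPrefixLength(orderIndex, ["user_id"]));               // 1
console.log(matchedPrefixLength(orderIndex, ["user_id", "order_date"])); // 2
console.log(matchedPrefixLength(orderIndex, ["order_date"]));            // 0
```

Under this rule, a query that filters only on order_date gets little help from the compound index and would be better served by an index that starts with order_date.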
Collection Sharding
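For large datasets, consider sharding your collections across multiple servers. The example below uses order_date as a ranged shard key for simplicity; because that field increases steadily, new inserts tend to land on a single shard, so a hashed or compound key is often a better choice for write-heavy collections: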
sh.enableSharding("ecommerce")
sh.shardCollection("ecommerce.orders", { order_date: 1 })
Summary
MongoDB collections are flexible containers for storing documents. Unlike tables in relational databases, collections don't enforce a strict schema, allowing for more versatile data modeling. Key points to remember:
- Collections store documents of varying structures
- You can create collections explicitly or implicitly
- Design collections based on your application's query patterns
- Choose between embedding related data or referencing across collections
- Apply validation rules when data consistency is important
- Use indexing for performance optimization
Collections are the foundation of MongoDB's document model: they group related documents without fixing their shape, and validation and indexes let you add structure where you need it.
Exercises
- Create a collection called inventory with validation rules that require item_name, quantity, and category fields.
- Insert five different products into the inventory collection with varying fields.
- Create an index on the category field of the inventory collection.
- Design a collection structure for a social media application with users, posts, and comments.
- Create a capped collection for storing system logs with a maximum size of 5 MB.