MongoDB mongoimport
Introduction
When working with MongoDB databases, you often need to import data from external sources. MongoDB provides a powerful command-line tool called mongoimport
that allows you to import data from JSON, CSV, and TSV files directly into MongoDB collections. This utility is especially useful when migrating data between systems or loading initial datasets into your MongoDB database.
In this tutorial, we'll explore the mongoimport
tool in detail, covering its syntax, options, and practical applications with examples.
Prerequisites
Before diving into mongoimport, make sure you have:
- MongoDB installed on your system
- MongoDB Database Tools installed (mongoimport is part of this package)
- Basic knowledge of MongoDB and its terminology
- Access to a MongoDB database
Basic mongoimport Syntax
The basic syntax for the mongoimport
command is:
mongoimport --uri="mongodb://hostname:port/database" --collection=name [options] [file]
Where:
- --uri specifies the connection string to your MongoDB instance
- --collection specifies the collection where data will be imported
- [options] are additional parameters that control the import process
- [file] is the path to the file you want to import (can be omitted if using stdin)
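Because mongoimport reads from standard input when no file is given, you can also pipe data into it. A minimal sketch (generate_data.sh is a hypothetical script that writes one JSON document per line to stdout):
./generate_data.sh | mongoimport --uri="mongodb://localhost:27017/mydb" --collection=items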
Importing Different File Formats
Importing JSON Data
JSON is the most natural format for MongoDB since MongoDB stores documents in BSON (Binary JSON).
Single JSON document
Let's say we have a file called employee.json
with the following content:
{
"name": "John Doe",
"position": "Software Engineer",
"department": "Engineering",
"salary": 75000,
"skills": ["JavaScript", "Node.js", "MongoDB"]
}
To import this document:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees employee.json
Output:
2023-10-20T14:32:15.123+0000 connected to: mongodb://localhost:27017/company
2023-10-20T14:32:15.148+0000 1 document imported successfully.
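To confirm the document landed in the collection, you can query it from the mongosh shell (assuming a default local deployment):
mongosh "mongodb://localhost:27017/company" --eval 'db.employees.find()'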
JSON Array
For a JSON array containing multiple documents, let's use employees.json:
[
{
"name": "John Doe",
"position": "Software Engineer",
"department": "Engineering",
"salary": 75000
},
{
"name": "Jane Smith",
"position": "Product Manager",
"department": "Product",
"salary": 85000
}
]
Because this file contains a top-level JSON array rather than newline-delimited documents, we need to add the --jsonArray option:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --jsonArray employees.json
Output:
2023-10-20T14:35:22.456+0000 connected to: mongodb://localhost:27017/company
2023-10-20T14:35:22.489+0000 2 documents imported successfully.
JSON Lines (JSONL) / Newline-Delimited JSON (NDJSON)
For large datasets, using JSONL format (one document per line) is more efficient; it is also the format mongoimport expects by default, so no extra flag is needed. Example employees.jsonl:
{"name": "John Doe", "position": "Software Engineer", "department": "Engineering", "salary": 75000}
{"name": "Jane Smith", "position": "Product Manager", "department": "Product", "salary": 85000}
{"name": "Mike Johnson", "position": "Data Scientist", "department": "Data", "salary": 82000}
To import:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees employees.jsonl
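If you have a JSON array but want the efficiency of JSONL, you can convert it with jq (assuming jq is installed; -c emits one compact document per line):
jq -c '.[]' employees.json > employees.jsonl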
Importing CSV Data
For CSV data, we need to specify field names or use the header row of the CSV file. Let's import employees.csv:
name,position,department,salary
John Doe,Software Engineer,Engineering,75000
Jane Smith,Product Manager,Product,85000
Mike Johnson,Data Scientist,Data,82000
To import this CSV file:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --headerline employees.csv
Output:
2023-10-20T14:40:15.789+0000 connected to: mongodb://localhost:27017/company
2023-10-20T14:40:15.823+0000 3 documents imported successfully.
If your CSV file doesn't have a header row, you can specify the fields manually:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --fields="name,position,department,salary" employees_no_header.csv
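A related option that is often useful for CSV and TSV imports is --ignoreBlanks, which skips empty fields instead of storing them as empty strings:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --headerline --ignoreBlanks employees.csv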
Importing TSV Data
TSV (Tab-Separated Values) works similarly to CSV but uses tabs as separators:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=tsv --headerline employees.tsv
Important Options and Features
Handling Data Types
By default, mongoimport
tries to infer the data types from the import file. For CSV and TSV files, you can explicitly specify types using the --columnsHaveTypes option together with --fields. Note that --fields cannot be combined with --headerline, so the file itself must not contain a header row:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --columnsHaveTypes --fields="name.string(),age.int32(),salary.double(),hired.date(2006-01-02)" employees_typed.csv
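Alternatively, you can keep the header row in the file and embed the type annotations directly in it; with --columnsHaveTypes and --headerline together, mongoimport reads the typed field specifications from the file's first line, which would then look like this:
name.string(),age.int32(),salary.double(),hired.date(2006-01-02)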
Dropping Collections Before Import
To replace an existing collection with the imported data:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --drop employees.json
Handling Duplicates and Updates
When importing documents that might already exist in the collection, you have several options:
- Insert only (the default, --mode=insert)
- Upsert: update if a matching document exists, insert if not (--mode=upsert)
- Merge with existing documents (--mode=merge)
Using Upsert
To upsert documents based on a unique field, combine --mode=upsert with --upsertFields:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --mode=upsert --upsertFields=employeeId employees.json
This will update existing documents where the employeeId
matches and insert new documents otherwise.
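There is also a merge mode, which combines the fields of an imported document with those of a matching existing document instead of replacing it outright. A sketch using the same hypothetical employeeId field:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --mode=merge --upsertFields=employeeId employees.json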
Importing to a Specific Database and Collection
If you're not using the URI format, you can specify the database and collection separately:
mongoimport --host localhost --port 27017 --db company --collection employees employees.json
Authentication
For databases requiring authentication:
mongoimport --uri="mongodb://username:password@localhost:27017/company" --collection=employees employees.json
Or using separate parameters:
mongoimport --host localhost --port 27017 --db company --collection employees --username admin --password "securepassword" --authenticationDatabase admin employees.json
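To keep the password out of your shell history, you can omit --password entirely; mongoimport will then prompt for it interactively:
mongoimport --host localhost --port 27017 --db company --collection employees --username admin --authenticationDatabase admin employees.json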
Importing Large Datasets
For large files, you can adjust the number of documents processed in parallel:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --numInsertionWorkers=4 large_employees.json
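One caveat: with multiple insertion workers, documents are not guaranteed to be inserted in the order they appear in the file. If order matters, the --maintainInsertionOrder flag preserves it, at the cost of parallelism:
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --maintainInsertionOrder large_employees.json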
Real-World Examples
Example 1: Importing Customer Data
Let's say you have customer data from your e-commerce platform in a CSV file and want to import it into MongoDB.
customers.csv:
id,name,email,registered_date,total_orders,lifetime_value
1001,Alice Brown,[email protected],2022-01-15,12,543.21
1002,Bob Williams,[email protected],2022-02-03,5,129.99
1003,Charlie Davis,[email protected],2022-02-28,8,287.45
Import command (because --fields cannot be combined with --headerline, we strip the header row with tail and pipe the rest to mongoimport, which reads from standard input when no file is given):
tail -n +2 customers.csv | mongoimport --uri="mongodb://localhost:27017/ecommerce" --collection=customers --type=csv --columnsHaveTypes --fields="id.int32(),name.string(),email.string(),registered_date.date(2006-01-02),total_orders.int32(),lifetime_value.double()"
Example 2: Daily Data Import Automation
Here's a practical bash script that could be used to automate daily data imports:
#!/bin/bash
# daily_import.sh
TODAY=$(date +"%Y-%m-%d")
FILE_PATH="/path/to/daily_exports/data_${TODAY}.json"
if [ -f "$FILE_PATH" ]; then
echo "Importing data for $TODAY..."
mongoimport --uri="mongodb://localhost:27017/analytics" \
--collection=daily_metrics \
--file="$FILE_PATH" \
--mode=upsert \
--upsertFields="date,metric_id"
echo "Import completed."
else
echo "No data file found for $TODAY."
fi
You could schedule this script with cron to run automatically each day.
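For example, this crontab entry (added via crontab -e) would run the script every day at 2:00 a.m., assuming it is saved at /path/to/daily_import.sh and marked executable:
0 2 * * * /path/to/daily_import.sh >> /var/log/daily_import.log 2>&1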
Example 3: Importing Nested Data
Let's say we have product data with nested categories and attributes in JSON format:
[
{
"sku": "P12345",
"name": "Wireless Headphones",
"price": 89.99,
"categories": ["Electronics", "Audio"],
"specs": {
"color": "Black",
"weight": 0.25,
"wireless": true,
"batteryLife": "20 hours"
},
"stock": [
{"location": "Warehouse A", "count": 120},
{"location": "Warehouse B", "count": 85}
]
},
{
"sku": "P67890",
"name": "Bluetooth Speaker",
"price": 59.99,
"categories": ["Electronics", "Audio", "Portable"],
"specs": {
"color": "Blue",
"weight": 0.5,
"wireless": true,
"batteryLife": "12 hours"
},
"stock": [
{"location": "Warehouse A", "count": 200},
{"location": "Warehouse C", "count": 150}
]
}
]
To import this complex JSON structure (again adding --jsonArray because the file contains a top-level array):
mongoimport --uri="mongodb://localhost:27017/inventory" --collection=products --jsonArray products.json
MongoDB will preserve the entire document structure with nested objects and arrays.
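Nested fields remain queryable with dot notation after the import; for example, in mongosh:
mongosh "mongodb://localhost:27017/inventory" --eval 'db.products.find({ "specs.wireless": true })'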
Common Issues and Solutions
Issue 1: Invalid JSON
If you encounter errors about invalid JSON, check that your JSON file is properly formatted. You can use tools like jq
to validate:
jq . employees.json > /dev/null && echo "Valid JSON" || echo "Invalid JSON"
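The same kind of sanity check works for JSONL files, since jq accepts a stream of JSON values and exits non-zero on the first invalid one:
jq -c . employees.jsonl > /dev/null && echo "Valid JSONL" || echo "Invalid JSONL"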
Issue 2: CSV Field Mapping
If your CSV import produces unexpected results, verify your field mappings and data types. You might need to adjust the --fields
parameter to match your data structure.
Issue 3: Unicode or Special Characters
For files with Unicode or special characters, ensure they are UTF-8 encoded; mongoimport only supports UTF-8 input. If a file uses another encoding, convert it before importing, for example with iconv (the ISO-8859-1 source encoding here is only an example; adjust it to match your file):
iconv -f ISO-8859-1 -t UTF-8 international_data_latin1.json > international_data.json
mongoimport --uri="mongodb://localhost:27017/company" --collection=international_employees international_data.json
The --numDecodingWorkers option controls how many workers parse the input in parallel; it is a performance setting, not an encoding fix.
Summary
MongoDB's mongoimport
tool is a powerful utility for importing data into MongoDB collections from various file formats. Here's a quick recap of what we've covered:
- Basic mongoimport syntax and options
- Importing JSON, CSV, and TSV files
- Handling different data types and complex structures
- Managing duplicates with upsert options
- Authentication and connection parameters
- Performance optimization for large imports
- Real-world examples and automation techniques
With mongoimport
, you can efficiently load data into MongoDB as part of your data migration, ETL processes, or initial database setup.
Additional Resources and Exercises
Exercises
- Import a CSV file with 5 columns into a new MongoDB collection with specific data types for each column.
- Create a JSON file with an array of at least 3 complex documents (containing nested objects and arrays) and import it into MongoDB.
- Write a script that extracts data from a SQL database, converts it to JSON, and imports it into MongoDB using mongoimport.
Further Learning
- Explore the companion tool mongoexport for exporting data from MongoDB
- Learn about MongoDB Atlas Data Migration service for cloud-based migrations
- Study the MongoDB Aggregation Framework for transforming data after import
By mastering mongoimport
, you've added a valuable skill to your MongoDB toolset that will help you efficiently manage data across your applications and systems.