
MongoDB mongoimport

Introduction

When working with MongoDB databases, you often need to import data from external sources. MongoDB provides a powerful command-line tool called mongoimport that allows you to import data from JSON, CSV, and TSV files directly into MongoDB collections. This utility is especially useful when migrating data between systems or loading initial datasets into your MongoDB database.

In this tutorial, we'll explore the mongoimport tool in detail, covering its syntax, options, and practical applications with examples.

Prerequisites

Before diving into mongoimport, make sure you have:

  • MongoDB installed on your system
  • MongoDB Database Tools installed (mongoimport is part of this package)
  • Basic knowledge of MongoDB and its terminology
  • Access to a MongoDB database

Basic mongoimport Syntax

The basic syntax for the mongoimport command is:

bash
mongoimport --uri="mongodb://hostname:port/database" --collection=name [options] [file]

Where:

  • --uri specifies the connection string to your MongoDB instance
  • --collection specifies the collection where data will be imported
  • [options] are additional parameters that control the import process
  • [file] is the path to the file you want to import (can be omitted if using stdin)
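
Since mongoimport reads from standard input when no file argument is given, you can also pipe data straight into it. A minimal sketch (host, database, and collection names are placeholders):

bash
cat data.json | mongoimport --uri="mongodb://localhost:27017/mydb" --collection=mycollection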

Importing Different File Formats

Importing JSON Data

JSON is the most natural format for MongoDB since MongoDB stores documents in BSON (Binary JSON).

Single JSON document

Let's say we have a file called employee.json with the following content:

json
{
  "name": "John Doe",
  "position": "Software Engineer",
  "department": "Engineering",
  "salary": 75000,
  "skills": ["JavaScript", "Node.js", "MongoDB"]
}

To import this document:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees employee.json

Output:

2023-10-20T14:32:15.123+0000	connected to: mongodb://localhost:27017/company
2023-10-20T14:32:15.148+0000 1 document imported successfully.

JSON Array

For a JSON array containing multiple documents, let's use employees.json:

json
[
  {
    "name": "John Doe",
    "position": "Software Engineer",
    "department": "Engineering",
    "salary": 75000
  },
  {
    "name": "Jane Smith",
    "position": "Product Manager",
    "department": "Product",
    "salary": 85000
  }
]

To import a file whose top-level element is a JSON array, you must add the --jsonArray option; without it, mongoimport rejects the file:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --jsonArray employees.json

Output:

2023-10-20T14:35:22.456+0000	connected to: mongodb://localhost:27017/company
2023-10-20T14:35:22.489+0000 2 documents imported successfully.

JSON Lines (JSONL) / Newline-Delimited JSON (NDJSON)

For large datasets, the JSONL format (one document per line) is more efficient, because mongoimport can process the file line by line instead of parsing one large array. No extra flags are needed; newline-delimited documents are mongoimport's default JSON mode. Example employees.jsonl:

{"name": "John Doe", "position": "Software Engineer", "department": "Engineering", "salary": 75000}
{"name": "Jane Smith", "position": "Product Manager", "department": "Product", "salary": 85000}
{"name": "Mike Johnson", "position": "Data Scientist", "department": "Data", "salary": 82000}

To import:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees employees.jsonl
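
If your data starts out as a single JSON array, a quick way to produce JSONL is jq's compact output mode (a small sketch, assuming jq is installed):

bash
# -c emits one compact JSON document per line
jq -c '.[]' employees.json > employees.jsonl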

Importing CSV Data

For CSV data, we need to specify field names or use the header row of the CSV file. Let's import employees.csv:

name,position,department,salary
John Doe,Software Engineer,Engineering,75000
Jane Smith,Product Manager,Product,85000
Mike Johnson,Data Scientist,Data,82000

To import this CSV file:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --headerline employees.csv

Output:

2023-10-20T14:40:15.789+0000	connected to: mongodb://localhost:27017/company
2023-10-20T14:40:15.823+0000 3 documents imported successfully.

If your CSV file doesn't have a header row, you can specify the fields manually:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --fields="name,position,department,salary" employees_no_header.csv

Importing TSV Data

TSV (Tab-Separated Values) works similarly to CSV but uses tabs as separators:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=tsv --headerline employees.tsv

Important Options and Features

Handling Data Types

By default, mongoimport infers basic types (such as numbers) from the values in the import file. For CSV and TSV files, you can instead declare the type of each column explicitly with the --columnsHaveTypes option. Note that --fields and --headerline are mutually exclusive, so pass the typed field list with --fields only when the file has no header row:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --columnsHaveTypes --fields="name.string(),age.int32(),salary.double(),hired.date(2006-01-02)" employees_typed.csv
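
Alternatively, the type specifiers can live in the header row itself, in which case you keep --headerline and drop --fields. A hypothetical employees_typed_header.csv (the values are made up for this sketch):

name.string(),age.int32(),salary.double(),hired.date(2006-01-02)
John Doe,34,75000.50,2020-03-15

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --headerline --columnsHaveTypes employees_typed_header.csv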

Dropping Collections Before Import

To replace an existing collection with the imported data:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --drop employees.json

Handling Duplicates and Updates

When importing documents that might already exist in the collection, the --mode option controls the behavior:

  1. insert — insert only (the default); documents that collide with existing unique keys are reported as errors
  2. upsert — replace existing documents that match the fields listed in --upsertFields, and insert the rest
  3. merge — merge the imported fields into matching existing documents, inserting when no match exists

Using Upsert

To upsert documents based on a unique field:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --mode=upsert --upsertFields=employeeId employees.json

This will update existing documents where the employeeId matches and insert new documents otherwise.
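
Similarly, --mode=merge merges the imported fields into a matching existing document instead of replacing it, preserving fields that the import file doesn't mention. A sketch reusing the employeeId field from above:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --mode=merge --upsertFields=employeeId employees.json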

Importing to a Specific Database and Collection

If you're not using the URI format, you can specify the database and collection separately:

bash
mongoimport --host localhost --port 27017 --db company --collection employees employees.json

Authentication

For databases requiring authentication:

bash
mongoimport --uri="mongodb://username:password@localhost:27017/company" --collection=employees employees.json

Or using separate parameters:

bash
mongoimport --host localhost --port 27017 --db company --collection employees --username admin --password "securepassword" --authenticationDatabase admin employees.json
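
Passing --password on the command line exposes it to other users on the machine (via ps or shell history). Recent MongoDB Database Tools releases accept a YAML configuration file through --config for sensitive values; a sketch (the file name and password are placeholders):

bash
# Keep credentials out of the process list (recent Database Tools releases)
cat > import_config.yaml <<'EOF'
password: securepassword
EOF
chmod 600 import_config.yaml

mongoimport --config=import_config.yaml --host localhost --port 27017 --db company --collection employees --username admin --authenticationDatabase admin employees.json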

Importing Large Datasets

For large files, you can increase the number of concurrent insertion workers that write batches to the server in parallel:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --numInsertionWorkers=4 large_employees.json
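
Because mongoimport reads from stdin, you can also stream a compressed export straight in without unpacking it to disk first (the file name is a placeholder):

bash
gunzip -c large_employees.json.gz | mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --numInsertionWorkers=4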

Real-World Examples

Example 1: Importing Customer Data

Let's say you have customer data from your e-commerce platform in a CSV file and want to import it into MongoDB.

customers.csv:

id.int32(),name.string(),email.string(),registered_date.date(2006-01-02),total_orders.int32(),lifetime_value.double()
1001,Alice Brown,alice.brown@example.com,2022-01-15,12,543.21
1002,Bob Williams,bob.williams@example.com,2022-02-03,5,129.99
1003,Charlie Davis,charlie.davis@example.com,2022-02-28,8,287.45

Import command (because the header row embeds the type specifiers, --headerline is combined with --columnsHaveTypes):

bash
mongoimport --uri="mongodb://localhost:27017/ecommerce" --collection=customers --type=csv --headerline --columnsHaveTypes customers.csv

Example 2: Daily Data Import Automation

Here's a practical bash script that could be used to automate daily data imports:

bash
#!/bin/bash
# daily_import.sh

TODAY=$(date +"%Y-%m-%d")
FILE_PATH="/path/to/daily_exports/data_${TODAY}.json"

if [ -f "$FILE_PATH" ]; then
  echo "Importing data for $TODAY..."
  mongoimport --uri="mongodb://localhost:27017/analytics" \
    --collection=daily_metrics \
    --file="$FILE_PATH" \
    --mode=upsert \
    --upsertFields="date,metric_id"

  echo "Import completed."
else
  echo "No data file found for $TODAY."
fi

You could schedule this script with cron to run automatically each day.
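
For example, a crontab entry like the following (paths are placeholders) runs the script every day at 2:00 AM and appends its output to a log file:

bash
# m h dom mon dow  command
0 2 * * * /path/to/daily_import.sh >> /var/log/daily_import.log 2>&1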

Example 3: Importing Nested Data

Let's say we have product data with nested categories and attributes in JSON format:

json
[
  {
    "sku": "P12345",
    "name": "Wireless Headphones",
    "price": 89.99,
    "categories": ["Electronics", "Audio"],
    "specs": {
      "color": "Black",
      "weight": 0.25,
      "wireless": true,
      "batteryLife": "20 hours"
    },
    "stock": [
      {"location": "Warehouse A", "count": 120},
      {"location": "Warehouse B", "count": 85}
    ]
  },
  {
    "sku": "P67890",
    "name": "Bluetooth Speaker",
    "price": 59.99,
    "categories": ["Electronics", "Audio", "Portable"],
    "specs": {
      "color": "Blue",
      "weight": 0.5,
      "wireless": true,
      "batteryLife": "12 hours"
    },
    "stock": [
      {"location": "Warehouse A", "count": 200},
      {"location": "Warehouse C", "count": 150}
    ]
  }
]

To import this complex JSON structure:

bash
mongoimport --uri="mongodb://localhost:27017/inventory" --collection=products --jsonArray products.json

MongoDB will preserve the entire document structure with nested objects and arrays.
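
You can spot-check that the nested structure survived the import with a quick query; a sketch using mongosh, with the database and field names from the example above:

bash
mongosh "mongodb://localhost:27017/inventory" --quiet --eval 'db.products.find({ "specs.wireless": true }, { name: 1, "specs.batteryLife": 1 }).toArray()'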

Common Issues and Solutions

Issue 1: Invalid JSON

If you encounter errors about invalid JSON, check that your JSON file is properly formatted. You can use tools like jq to validate:

bash
jq . employees.json > /dev/null && echo "Valid JSON" || echo "Invalid JSON"

Issue 2: CSV Field Mapping

If your CSV import produces unexpected results, verify your field mappings and data types. You might need to adjust the --fields parameter to match your data structure.
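
One common adjustment: blank CSV cells are imported as empty values by default. To omit those fields from the resulting documents entirely, pass --ignoreBlanks:

bash
mongoimport --uri="mongodb://localhost:27017/company" --collection=employees --type=csv --headerline --ignoreBlanks employees.csv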

Issue 3: Unicode or Special Characters

mongoimport only supports UTF-8 encoded files. If a file uses another encoding, convert it before importing, for example with iconv (the ISO-8859-1 source encoding below is just an example):

bash
iconv -f ISO-8859-1 -t UTF-8 international_data.json | mongoimport --uri="mongodb://localhost:27017/company" --collection=international_employees

Summary

MongoDB's mongoimport tool is a powerful utility for importing data into MongoDB collections from various file formats. Here's a quick recap of what we've covered:

  • Basic mongoimport syntax and options
  • Importing JSON, CSV, and TSV files
  • Handling different data types and complex structures
  • Managing duplicates with upsert options
  • Authentication and connection parameters
  • Performance optimization for large imports
  • Real-world examples and automation techniques

With mongoimport, you can efficiently load data into MongoDB as part of your data migration, ETL processes, or initial database setup.

Additional Resources and Exercises

Exercises

  1. Import a CSV file with 5 columns into a new MongoDB collection with specific data types for each column.
  2. Create a JSON file with an array of at least 3 complex documents (containing nested objects and arrays) and import it into MongoDB.
  3. Write a script that extracts data from a SQL database, converts it to JSON, and imports it into MongoDB using mongoimport.

Further Learning

  • Explore the companion tool mongoexport for exporting data from MongoDB
  • Learn about MongoDB Atlas Data Migration service for cloud-based migrations
  • Study the MongoDB Aggregation Framework for transforming data after import

By mastering mongoimport, you've added a valuable skill to your MongoDB toolset that will help you efficiently manage data across your applications and systems.


