MySQL LOAD DATA

Introduction

When working with databases, importing large amounts of data efficiently is a common requirement. MySQL provides the powerful LOAD DATA statement specifically designed for this purpose. This feature allows you to quickly import data from text files (like CSV, TSV) directly into your database tables.

The LOAD DATA statement is significantly faster than using multiple INSERT statements because it:

  1. Minimizes the overhead of parsing SQL statements
  2. Reduces the number of network round trips between client and server
  3. Performs fewer index updates by batching operations
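
To make the difference concrete, here is a minimal sketch (using a hypothetical items table and file path) of the row-by-row approach next to the single-statement approach:

sql
-- Row-by-row: each statement is parsed, planned, and acknowledged
-- over the network individually.
INSERT INTO items (id, name) VALUES (1, 'alpha');
INSERT INTO items (id, name) VALUES (2, 'beta');
INSERT INTO items (id, name) VALUES (3, 'gamma');

-- Bulk: one statement, one round trip, and batched index updates.
LOAD DATA INFILE '/path/to/items.csv'
INTO TABLE items
FIELDS TERMINATED BY ',';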

In this tutorial, we'll explore how to use LOAD DATA effectively to import data into your MySQL database tables.

Basic Syntax

The basic syntax of the LOAD DATA statement is as follows:

sql
LOAD DATA
    [LOCAL]
    INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [PARTITION (partition_name, ...)]
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var, ...)]
    [SET col_name = expr, ...]

Don't worry about this intimidating syntax! We'll break it down into manageable parts and explain each component with examples.

Simple Example

Let's start with a simple example. Imagine we have a CSV file named employees.csv with the following content:

id,first_name,last_name,email,hire_date
1,John,Doe,john.doe@example.com,2020-01-15
2,Jane,Smith,jane.smith@example.com,2020-03-20
3,Michael,Johnson,michael.johnson@example.com,2020-05-10

And a MySQL table with this structure:

sql
CREATE TABLE employees (
    id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    hire_date DATE
);

Here's how to import the data:

sql
LOAD DATA INFILE '/path/to/employees.csv'
INTO TABLE employees
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

This command:

  1. Specifies the file path
  2. Names the target table
  3. Defines the field delimiter as a comma
  4. Sets the quote character to double quotes
  5. Sets the line terminator to the newline character
  6. Skips the header row

Key Parameters Explained

LOCAL Keyword

sql
LOAD DATA LOCAL INFILE '/path/to/employees.csv' ...
  • With LOCAL: Reads the file from the client machine where you're running the MySQL client
  • Without LOCAL: Reads the file from the server machine where MySQL server is running

REPLACE vs IGNORE

sql
LOAD DATA INFILE '/path/to/file.csv' REPLACE INTO TABLE ...
  • REPLACE: If a row from the file has the same unique key value as an existing row, the existing row is deleted before inserting the new row
  • IGNORE: Rows that duplicate an existing row on a unique key value are skipped
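
For instance, re-running the employees import from the simple example above (where id is the primary key) behaves differently under each option:

sql
-- REPLACE: incoming rows overwrite stored rows that share an id.
LOAD DATA INFILE '/path/to/employees.csv'
REPLACE INTO TABLE employees
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

-- IGNORE: incoming rows that share an id are skipped and counted as warnings.
LOAD DATA INFILE '/path/to/employees.csv'
IGNORE INTO TABLE employees
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;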

Field Handling Options

sql
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
  • TERMINATED BY: Specifies the character that separates fields (default is tab \t)
  • ENCLOSED BY: Character that encloses field values (useful for text containing delimiters)
  • ESCAPED BY: Character that escapes special characters (default is backslash \)
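
To see how these options interact, consider this hypothetical input line, where one field contains the delimiter and another contains an escaped quote:

1,"Doe, John","He said \"hello\""

With the options shown above, this parses into three fields: 1, Doe, John, and He said "hello". The comma inside the second field is protected by the enclosing quotes, and \" yields a literal double quote.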

Line Handling Options

sql
LINES TERMINATED BY '\n'
STARTING BY ''
  • TERMINATED BY: Character(s) that mark the end of a line (default is newline \n)
  • STARTING BY: Common prefix that each line must contain; everything up to and including the prefix is discarded, and lines that do not contain the prefix are skipped entirely
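
For example, given the input below, a statement with LINES STARTING BY 'xxx' (assuming a hypothetical two-column table named test) loads only the part of each line after the prefix:

xxx"abc",1
something xxx"def",2
"ghi",3

sql
LOAD DATA INFILE '/path/to/data.txt'
INTO TABLE test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES STARTING BY 'xxx';

This loads the rows ("abc", 1) and ("def", 2); the third line is skipped because it does not contain the prefix.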

Skipping Rows

sql
IGNORE 1 ROWS

Skips the specified number of rows at the beginning of the file (often used to skip header rows).

Advanced Usage

Selective Column Import

You can specify which columns to import and in what order:

sql
LOAD DATA INFILE '/path/to/employees.csv' 
INTO TABLE employees
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@dummy, first_name, last_name, email, hire_date);

Each entry in the list maps positionally to a field in the input file; here the first field (id) is captured into the throwaway variable @dummy and discarded, while the remaining fields are loaded into the named columns.

Column Transformations with SET

You can transform data during import:

sql
LOAD DATA INFILE '/path/to/employees.csv' 
INTO TABLE employees
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(id, first_name, last_name, @email, hire_date)
SET email = LOWER(@email);

This converts the email to lowercase during import.

Handling NULL Values

To handle NULL values in your input file:

sql
LOAD DATA INFILE '/path/to/employees.csv' 
INTO TABLE employees
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@id, @first_name, @last_name, @email, @hire_date)
SET
    id = NULLIF(@id, ''),
    first_name = NULLIF(@first_name, ''),
    last_name = NULLIF(@last_name, ''),
    email = NULLIF(@email, ''),
    hire_date = NULLIF(@hire_date, '');

NULLIF(expr1, expr2) returns NULL when its two arguments are equal; here it converts empty strings in the file to NULL values in the table.
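
As an aside, MySQL also has a built-in convention for NULL in data files: with the default ESCAPED BY '\\', an unenclosed field written as \N is read as NULL, so a file exported with that convention needs no SET clause at all:

4,Emily,\N,\N,2021-02-01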

Practical Examples

Example 1: Importing Transaction Data

Let's import transaction data from a tab-delimited file:

sql
LOAD DATA INFILE '/path/to/transactions.tsv' 
INTO TABLE transactions
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(transaction_id, customer_id, transaction_date, amount, @status)
SET
    status = UPPER(@status),
    imported_at = NOW();

Example 2: Conditional Data Import

Import data with conditional processing:

sql
LOAD DATA INFILE '/path/to/products.csv' 
INTO TABLE products
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@id, @name, @price, @category)
SET
    id = @id,
    name = @name,
    price = @price,
    category = @category,
    price_tier = CASE
        WHEN @price < 10 THEN 'budget'
        WHEN @price BETWEEN 10 AND 50 THEN 'standard'
        ELSE 'premium'
    END;

Example 3: Importing Date/Time Values

Dates and times in non-MySQL formats can be converted during import:

sql
LOAD DATA INFILE '/path/to/events.csv' 
INTO TABLE events
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@event_id, @event_name, @event_date, @event_time)
SET
    event_id = @event_id,
    event_name = @event_name,
    event_datetime = STR_TO_DATE(CONCAT(@event_date, ' ', @event_time), '%m/%d/%Y %H:%i:%s');

Performance Considerations

To achieve optimal performance with LOAD DATA:

  1. Disable Indexes: For very large imports, consider disabling nonunique index updates before importing (note that DISABLE KEYS applies only to MyISAM tables; InnoDB ignores it with a warning):

    sql
    ALTER TABLE your_table DISABLE KEYS;
    -- LOAD DATA statements
    ALTER TABLE your_table ENABLE KEYS;
  2. Transaction Control: Wrap multiple LOAD DATA statements in a transaction:

    sql
    START TRANSACTION;
    -- LOAD DATA statements
    COMMIT;
  3. Server Variables: Adjust these for large imports:

    sql
    SET GLOBAL innodb_buffer_pool_size = 1073741824; -- 1GB
    SET GLOBAL max_allowed_packet = 67108864; -- 64MB

Security Considerations

  1. File Permissions: The MySQL server needs appropriate permissions to read the files.

  2. LOCAL Keyword: Using LOCAL requires:

    • The MySQL server to be configured with local_infile=1
    • The client to use --local-infile=1 when connecting
  3. Secure Paths: Restrict file access to only necessary directories using the secure_file_priv system variable.

Troubleshooting Common Issues

"The MySQL server is running with the --secure-file-priv option"

Solution:

sql
SHOW VARIABLES LIKE 'secure_file_priv';

Move your file to the directory shown in the result or adjust the MySQL configuration.

"ERROR 1148: The used command is not allowed with this MySQL version"

Solution: Enable local_infile:

sql
SET GLOBAL local_infile = 1;

Then reconnect with: mysql --local-infile=1 -u username -p

"ERROR 1290: The MySQL server is running with the --secure-file-priv option"

Solution: Check the secure file directory:

sql
SHOW VARIABLES LIKE 'secure_file_priv';

Place your files in this directory or change the MySQL configuration.

Summary

The LOAD DATA statement is a powerful tool for efficiently importing data into MySQL. Key points to remember:

  • It's much faster than multiple INSERT statements for bulk data
  • You can transform data during import using variables and the SET clause
  • Field and line terminators can be customized for different file formats
  • Security considerations are important, especially when using the LOCAL keyword
  • Performance can be optimized with proper server configuration and index handling

With the knowledge from this tutorial, you should be able to efficiently import large datasets into your MySQL database, saving time and resources in your data processing workflows.

Exercises

  1. Create a CSV file with sample customer data and import it into a new table using LOAD DATA.
  2. Import a file with dates in MM/DD/YYYY format and convert them to MySQL's date format during import.
  3. Write a LOAD DATA statement that calculates a discount field (10% off) during import from a products file.
  4. Import a file containing nullable fields and ensure empty values are stored as NULL in the database.
  5. Create a transaction log import that skips duplicate transaction IDs.

By mastering the LOAD DATA statement, you'll have a valuable tool in your MySQL toolkit that can save considerable time when working with large datasets.


