Python CSV Files
CSV (Comma Separated Values) files are one of the most common formats for storing and exchanging tabular data. They're simple text files where each line represents a row of data, and values within a row are separated by commas (or sometimes other delimiters).
In this tutorial, you'll learn how to work with CSV files in Python, including how to read, write, and manipulate CSV data using Python's built-in csv
module.
Introduction to CSV Files
CSV files are popular because:
- They're simple and human-readable
- Most spreadsheet programs and databases can import and export them
- They require minimal processing to parse
- They're universally supported across different platforms and systems
A typical CSV file might look like this:
name,age,city
John,28,New York
Alice,24,San Francisco
Bob,32,Chicago
The Python csv
Module
Python provides a built-in csv
module that makes it easy to work with CSV files. This module implements classes to read and write tabular data in CSV format.
Let's import the module to get started:
import csv
Reading CSV Files
Basic CSV Reading
To read a CSV file in Python, we use the csv.reader()
function:
import csv
# Open the CSV file
with open('data.csv', 'r') as file:
# Create a CSV reader object
csv_reader = csv.reader(file)
# Read and print each row
for row in csv_reader:
print(row)
If we have a CSV file named data.csv
with the content shown earlier, the output would be:
['name', 'age', 'city']
['John', '28', 'New York']
['Alice', '24', 'San Francisco']
['Bob', '32', 'Chicago']
Notice that each row is returned as a list of strings.
Reading CSV Files with Headers
Often, the first row of a CSV file contains headers. We can use the csv.DictReader
class to automatically map the values to their corresponding headers:
import csv
with open('data.csv', 'r') as file:
# Create a dictionary reader object
csv_reader = csv.DictReader(file)
# Read and print each row
for row in csv_reader:
print(f"{row['name']} is {row['age']} years old and lives in {row['city']}")
Output:
John is 28 years old and lives in New York
Alice is 24 years old and lives in San Francisco
Bob is 32 years old and lives in Chicago
Handling Different Delimiters
Sometimes CSV files use delimiters other than commas, such as semicolons, tabs, or pipes. You can specify a different delimiter like this:
import csv
with open('data.tsv', 'r') as file:
# Specify tab as the delimiter
csv_reader = csv.reader(file, delimiter='\t')
for row in csv_reader:
print(row)
Writing CSV Files
Basic CSV Writing
To write data to a CSV file, we use the csv.writer()
function:
import csv
# Data to write
data = [
['Name', 'Age', 'City'],
['John', 28, 'New York'],
['Alice', 24, 'San Francisco'],
['Bob', 32, 'Chicago']
]
# Open a file for writing
with open('new_data.csv', 'w', newline='') as file:
# Create a CSV writer object
csv_writer = csv.writer(file)
# Write data rows
for row in data:
csv_writer.writerow(row)
The newline=''
parameter is important, especially on Windows, to ensure consistent line endings across different platforms.
Writing Multiple Rows at Once
Instead of writing one row at a time, you can write all rows at once using writerows()
:
import csv
# Data to write
data = [
['Name', 'Age', 'City'],
['John', 28, 'New York'],
['Alice', 24, 'San Francisco'],
['Bob', 32, 'Chicago']
]
with open('new_data.csv', 'w', newline='') as file:
csv_writer = csv.writer(file)
# Write all rows at once
csv_writer.writerows(data)
Writing Dictionary Data
If your data is stored in dictionaries, you can use the csv.DictWriter
class:
import csv
# Dictionary data
data = [
{'Name': 'John', 'Age': 28, 'City': 'New York'},
{'Name': 'Alice', 'Age': 24, 'City': 'San Francisco'},
{'Name': 'Bob', 'Age': 32, 'City': 'Chicago'}
]
# Field names (column headers)
field_names = ['Name', 'Age', 'City']
with open('dict_data.csv', 'w', newline='') as file:
# Create a dictionary writer
writer = csv.DictWriter(file, fieldnames=field_names)
# Write the header row
writer.writeheader()
# Write the data rows
writer.writerows(data)
Practical Examples
Example 1: Analyzing Sales Data
Let's say we have a CSV file containing sales data and we want to calculate total sales per product:
import csv
# Create a dictionary to store product sales
product_sales = {}
with open('sales_data.csv', 'r') as file:
reader = csv.DictReader(file)
for row in reader:
product = row['Product']
quantity = int(row['Quantity'])
price = float(row['Price'])
# Calculate sales for this transaction
sale_amount = quantity * price
# Add to the product's total
if product in product_sales:
product_sales[product] += sale_amount
else:
product_sales[product] = sale_amount
# Print the results
print("Total sales by product:")
for product, total in product_sales.items():
print(f"{product}: ${total:.2f}")
For a sales_data.csv file like this:
Product,Quantity,Price
Laptop,2,1200.00
Monitor,3,300.00
Keyboard,5,50.00
Laptop,1,1200.00
Mouse,10,25.00
The output would be:
Total sales by product:
Laptop: $3600.00
Monitor: $900.00
Keyboard: $250.00
Mouse: $250.00
Example 2: Data Transformation
Let's convert temperature data from Celsius to Fahrenheit and write the results to a new CSV file:
import csv
def celsius_to_fahrenheit(celsius):
return (celsius * 9/5) + 32
with open('temperatures_celsius.csv', 'r') as input_file:
reader = csv.DictReader(input_file)
# Prepare output file
with open('temperatures_fahrenheit.csv', 'w', newline='') as output_file:
# Create fieldnames for output file
fieldnames = ['Date', 'LocationID', 'Temperature (F)']
writer = csv.DictWriter(output_file, fieldnames=fieldnames)
# Write the header
writer.writeheader()
# Process each row
for row in reader:
# Convert temperature from Celsius to Fahrenheit
celsius = float(row['Temperature (C)'])
fahrenheit = celsius_to_fahrenheit(celsius)
# Write the transformed row
writer.writerow({
'Date': row['Date'],
'LocationID': row['LocationID'],
'Temperature (F)': round(fahrenheit, 1)
})
print("Temperature conversion complete!")
Advanced CSV Operations
Working with CSV Dialects
CSV files can have different formats depending on their source. The csv
module allows you to define dialects to handle these variations:
import csv
# Register a new dialect
csv.register_dialect('custom',
delimiter=';',
quotechar='"',
escapechar='\\',
doublequote=False,
quoting=csv.QUOTE_MINIMAL)
# Reading using the custom dialect
with open('custom_data.csv', 'r') as file:
reader = csv.reader(file, dialect='custom')
for row in reader:
print(row)
Handling Quoted Values
CSV files often contain quoted values, especially when the data includes commas. The csv
module handles these automatically:
import csv
# Example with quoted values
data = [
['Name', 'Description'],
['Product A', 'Basic, affordable option'],
['Product B', 'Premium, "deluxe" version with extras']
]
with open('quoted_data.csv', 'w', newline='') as file:
writer = csv.writer(file, quoting=csv.QUOTE_MINIMAL)
writer.writerows(data)
# Reading back the file
with open('quoted_data.csv', 'r') as file:
reader = csv.reader(file)
for row in reader:
print(row)
Error Handling When Working with CSV Files
When working with CSV files, various errors can occur, such as file not found, permission issues, or malformed data. Here's how to implement robust error handling:
import csv
import os
def read_csv_safely(file_path):
try:
# Check if file exists
if not os.path.exists(file_path):
print(f"Error: File '{file_path}' does not exist.")
return None
# Try to open and read the file
with open(file_path, 'r') as file:
try:
reader = csv.reader(file)
data = list(reader)
return data
except csv.Error as e:
print(f"CSV parsing error: {e}")
return None
except PermissionError:
print(f"Error: No permission to read '{file_path}'.")
return None
except Exception as e:
print(f"Unexpected error: {e}")
return None
# Example usage
data = read_csv_safely('sample.csv')
if data:
print(f"Successfully read {len(data)} rows from the CSV file.")
Summary
In this tutorial, you learned how to:
- Read data from CSV files using
csv.reader
andcsv.DictReader
- Write data to CSV files using
csv.writer
andcsv.DictWriter
- Handle different CSV formats and dialects
- Process and transform CSV data
- Implement error handling for CSV operations
CSV files are an essential format for data exchange, and Python's csv
module provides powerful tools to work with them effectively. Whether you're analyzing data, generating reports, or migrating information between systems, these skills will be valuable in your Python programming journey.
Additional Resources
- Python's CSV Module Documentation
- Real Python: Reading and Writing CSV Files
- Pandas: A more powerful library for data manipulation
Exercises
- Create a CSV file with student grades and calculate the average grade for each student
- Read a CSV file containing product information and filter products based on specific criteria
- Merge two CSV files based on a common identifier
- Convert a CSV file to JSON format
- Create a program that reads a large CSV file (1000+ rows) and performs statistical analysis on numeric columns
Happy coding!
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)