Skip to main content

Pandas Excel Export

Introduction

When working with data in Python, exporting your processed data to Excel is a common requirement. Excel remains one of the most widely used tools for data analysis and sharing across organizations. Pandas, Python's powerful data manipulation library, provides robust functionality to export DataFrame objects to Excel files with various customization options.

In this tutorial, you'll learn how to export your Pandas DataFrames to Excel files, apply formatting, work with multiple sheets, and handle various export configurations to create professional spreadsheets.

Prerequisites

Before we dive into exporting to Excel, make sure you have the following libraries installed:

python
# Install required libraries
# pip install pandas openpyxl xlsxwriter

You'll need either openpyxl or xlsxwriter as the Excel engine. Both have their advantages, but we'll primarily use openpyxl in this tutorial.

Basic Excel Export with Pandas

Let's start with the simplest way to export a DataFrame to Excel:

python
import pandas as pd

# Create a sample DataFrame
data = {
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Age': [28, 34, 29, 42],
'City': ['New York', 'Paris', 'Berlin', 'London'],
'Salary': [75000, 80000, 65000, 90000]
}

df = pd.DataFrame(data)

# Basic export to Excel
df.to_excel('employees.xlsx', index=False)

This creates a file named 'employees.xlsx' with our data. The index=False parameter prevents the DataFrame index from being included in the Excel file, which is often what you want unless the index contains meaningful information.

Configuring the Excel Export

The to_excel() method offers many customization options:

python
# Export with more configuration options
df.to_excel(
'employees_config.xlsx',
sheet_name='Employee Data', # Custom sheet name
index=False, # Don't include index
freeze_panes=(1, 0), # Freeze the first row
engine='openpyxl' # Specify the Excel engine
)

Here's what each parameter does:

  • sheet_name: Sets the name of the worksheet
  • freeze_panes: Freezes rows and columns for better navigation of large datasets
  • engine: Specifies which Excel writer to use ('openpyxl' or 'xlsxwriter')

Working with Excel Writers for More Control

For more advanced formatting, you can use Excel writers explicitly:

python
# Create an Excel writer
with pd.ExcelWriter('employees_formatted.xlsx', engine='openpyxl') as writer:
# Write the DataFrame to Excel
df.to_excel(writer, sheet_name='Employees', index=False)

# Get the xlsxwriter worksheet object
worksheet = writer.sheets['Employees']

# Get the dimensions of the DataFrame
(max_row, max_col) = df.shape

# Set column widths
for i, col in enumerate(df.columns):
column_len = max(df[col].astype(str).map(len).max(), len(col) + 2)
worksheet.column_dimensions[chr(65 + i)].width = column_len

This approach gives you access to the underlying Excel worksheet, allowing you to adjust column widths to fit your data better.

Exporting Multiple DataFrames to Different Sheets

You can export multiple DataFrames to different worksheets in the same Excel file:

python
import pandas as pd

# Create sample DataFrames
employees_df = pd.DataFrame({
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Age': [28, 34, 29, 42],
'Department': ['IT', 'HR', 'IT', 'Finance']
})

salaries_df = pd.DataFrame({
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Salary': [75000, 80000, 65000, 90000],
'Bonus': [5000, 6000, 3000, 10000]
})

# Export to multiple sheets
with pd.ExcelWriter('company_data.xlsx', engine='openpyxl') as writer:
employees_df.to_excel(writer, sheet_name='Employees', index=False)
salaries_df.to_excel(writer, sheet_name='Salaries', index=False)

This creates an Excel file with two sheets: 'Employees' and 'Salaries'.

Advanced Formatting with XlsxWriter

If you need more advanced formatting, XlsxWriter is a great option:

python
import pandas as pd

# Create a sample DataFrame
data = {
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Age': [28, 34, 29, 42],
'City': ['New York', 'Paris', 'Berlin', 'London'],
'Salary': [75000, 80000, 65000, 90000]
}

df = pd.DataFrame(data)

# Create a Pandas Excel writer using XlsxWriter as the engine
with pd.ExcelWriter('styled_employees.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer, sheet_name='Employees', index=False)

# Get workbook and worksheet objects
workbook = writer.book
worksheet = writer.sheets['Employees']

# Create a format for the header
header_format = workbook.add_format({
'bold': True,
'text_wrap': True,
'valign': 'top',
'fg_color': '#D7E4BC',
'border': 1
})

# Apply the header format to the first row
for col_num, value in enumerate(df.columns.values):
worksheet.write(0, col_num, value, header_format)

# Add a number format for the salary column
money_format = workbook.add_format({'num_format': '$#,##0'})
worksheet.set_column('D:D', None, money_format) # Apply to salary column

# Auto-fit columns
for i, col in enumerate(df.columns):
column_len = max(df[col].astype(str).map(len).max(), len(col) + 2)
worksheet.set_column(i, i, column_len)

This example creates an Excel file with:

  • Formatted headers with background color and borders
  • Currency formatting for the salary column
  • Auto-fitted columns

Real-World Example: Sales Report Export

Let's create a more comprehensive example of a sales report that might be used in a business context:

python
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

# Create sample sales data
np.random.seed(42) # For reproducible results
dates = pd.date_range(start='2023-01-01', periods=100, freq='D')
products = ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard']
regions = ['North', 'South', 'East', 'West']

sales_data = []
for _ in range(500):
date = np.random.choice(dates)
product = np.random.choice(products)
region = np.random.choice(regions)
quantity = np.random.randint(1, 10)
price = np.random.choice([899, 699, 499, 349, 79])
sales_data.append({
'Date': date,
'Product': product,
'Region': region,
'Quantity': quantity,
'Unit Price': price,
'Revenue': quantity * price
})

# Create DataFrame
sales_df = pd.DataFrame(sales_data)

# Create summary DataFrames
product_summary = sales_df.groupby('Product').agg({
'Quantity': 'sum',
'Revenue': 'sum'
}).reset_index()

region_summary = sales_df.groupby('Region').agg({
'Quantity': 'sum',
'Revenue': 'sum'
}).reset_index()

monthly_summary = sales_df.copy()
monthly_summary['Month'] = monthly_summary['Date'].dt.strftime('%Y-%m')
monthly_summary = monthly_summary.groupby('Month').agg({
'Quantity': 'sum',
'Revenue': 'sum'
}).reset_index()

# Export to Excel with formatting
with pd.ExcelWriter('sales_report.xlsx', engine='xlsxwriter') as writer:
# Write DataFrames to different sheets
sales_df.to_excel(writer, sheet_name='Raw Data', index=False)
product_summary.to_excel(writer, sheet_name='By Product', index=False)
region_summary.to_excel(writer, sheet_name='By Region', index=False)
monthly_summary.to_excel(writer, sheet_name='Monthly Trends', index=False)

# Get workbook and add formatting
workbook = writer.book

# Add formats
header_format = workbook.add_format({
'bold': True,
'bg_color': '#4F81BD',
'font_color': 'white',
'border': 1
})

money_format = workbook.add_format({'num_format': '$#,##0.00'})
date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})

# Format each sheet
for sheet_name in ['Raw Data', 'By Product', 'By Region', 'Monthly Trends']:
worksheet = writer.sheets[sheet_name]
df = sales_df if sheet_name == 'Raw Data' else (
product_summary if sheet_name == 'By Product' else (
region_summary if sheet_name == 'By Region' else monthly_summary
))

# Format headers
for col_num, value in enumerate(df.columns.values):
worksheet.write(0, col_num, value, header_format)

# Set column formats
if 'Revenue' in df.columns:
revenue_col = df.columns.get_loc('Revenue')
worksheet.set_column(revenue_col, revenue_col, None, money_format)

if 'Unit Price' in df.columns:
price_col = df.columns.get_loc('Unit Price')
worksheet.set_column(price_col, price_col, None, money_format)

if 'Date' in df.columns:
date_col = df.columns.get_loc('Date')
worksheet.set_column(date_col, date_col, None, date_format)

# Auto-fit columns
for i, col in enumerate(df.columns):
if col != 'Date': # Skip date column as it's already formatted
column_len = max(df[col].astype(str).map(len).max(), len(col) + 2)
worksheet.set_column(i, i, column_len)
else:
worksheet.set_column(i, i, 12) # Fixed width for date column

# Add charts to the Product summary sheet
product_chart = workbook.add_chart({'type': 'column'})
product_chart.add_series({
'name': 'Revenue by Product',
'categories': ['By Product', 1, 0, len(product_summary), 0],
'values': ['By Product', 1, 2, len(product_summary), 2],
})
product_chart.set_title({'name': 'Revenue by Product'})
product_chart.set_size({'width': 500, 'height': 300})
writer.sheets['By Product'].insert_chart('E2', product_chart)

This example creates a comprehensive sales report with:

  • Raw data sheet with all sales records
  • Summary sheets for products, regions, and monthly trends
  • Proper formatting for headers, currency values, and dates
  • A chart showing revenue by product

Common Issues and Solutions

Issue 1: Missing Excel Engine

If you encounter an error like ImportError: Missing optional dependency 'openpyxl', you need to install the appropriate Excel engine:

python
# Install the required engine
# pip install openpyxl # For .xlsx files
# pip install xlwt # For .xls files (legacy)

Issue 2: Excel File Already Exists

If you're running your code repeatedly, you might encounter a PermissionError if the Excel file is still open. Always close the file before running your code again or use a different filename.

Issue 3: Date and Time Formatting Issues

Excel can sometimes misinterpret date formats. Use explicit formatting:

python
with pd.ExcelWriter('dates_example.xlsx', engine='xlsxwriter') as writer:
df_with_dates.to_excel(writer, index=False)
workbook = writer.book
worksheet = writer.sheets['Sheet1']
date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
date_col_idx = df_with_dates.columns.get_loc('date_column')
worksheet.set_column(date_col_idx, date_col_idx, None, date_format)

Summary

Exporting Pandas DataFrames to Excel is a versatile process that allows you to:

  • Create basic Excel exports with minimal code
  • Add formatting and styling to your Excel files
  • Export multiple DataFrames to different sheets in the same file
  • Create professional reports with charts and formatting
  • Customize exports to match your exact requirements

Whether you're creating a simple data export or a complex formatted report, Pandas provides the tools you need to generate Excel files for any purpose.

Additional Resources

Exercises

  1. Basic Export: Create a DataFrame with student grades and export it to Excel without the index.
  2. Multiple Sheets: Create two DataFrames (one for students, one for courses) and export them to different sheets in the same Excel file.
  3. Formatted Report: Create a sales report DataFrame and export it with proper formatting (headers, currency format for prices).
  4. Conditional Formatting: Export a DataFrame with conditional formatting that highlights cells above a certain threshold.
  5. Charts and Graphs: Export a DataFrame and add a bar chart representation of the data.


If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)