Pandas Excel Export

Introduction

When working with data in Python, exporting your processed data to Excel is a common requirement. Excel remains one of the most widely used tools for data analysis and sharing across organizations. Pandas, Python's powerful data manipulation library, provides robust functionality to export DataFrame objects to Excel files with various customization options.

In this tutorial, you'll learn how to export your Pandas DataFrames to Excel files, apply formatting, work with multiple sheets, and handle various export configurations to create professional spreadsheets.

Prerequisites

Before we dive into exporting to Excel, make sure you have the following libraries installed:

# Install required libraries
# pip install pandas openpyxl xlsxwriter

You'll need either openpyxl or xlsxwriter as the Excel engine. Both have their advantages, but we'll primarily use openpyxl in this tutorial.

Basic Excel Export with Pandas

Let's start with the simplest way to export a DataFrame to Excel:

import pandas as pd

# Create a sample DataFrame
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 42],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
    'Salary': [75000, 80000, 65000, 90000]
}

df = pd.DataFrame(data)

# Basic export to Excel
df.to_excel('employees.xlsx', index=False)

This creates a file named 'employees.xlsx' with our data. The index=False parameter prevents the DataFrame index from being included in the Excel file, which is often what you want unless the index contains meaningful information.

Configuring the Excel Export

The to_excel() method offers many customization options:

# Export with more configuration options
df.to_excel(
    'employees_config.xlsx',
    sheet_name='Employee Data',  # Custom sheet name
    index=False,                 # Don't include index
    freeze_panes=(1, 0),         # Freeze the first row
    engine='openpyxl'            # Specify the Excel engine
)

Here's what each parameter does:

sheet_name: Sets the name of the worksheet
freeze_panes: Freezes rows and columns for better navigation of large datasets
engine: Specifies which Excel writer to use ('openpyxl' or 'xlsxwriter')

Working with Excel Writers for More Control

For more advanced formatting, you can use Excel writers explicitly:

# Create an Excel writer
with pd.ExcelWriter('employees_formatted.xlsx', engine='openpyxl') as writer:
    # Write the DataFrame to Excel
    df.to_excel(writer, sheet_name='Employees', index=False)
    
    # Get the xlsxwriter worksheet object
    worksheet = writer.sheets['Employees']
    
    # Get the dimensions of the DataFrame
    (max_row, max_col) = df.shape
    
    # Set column widths
    for i, col in enumerate(df.columns):
        column_len = max(df[col].astype(str).map(len).max(), len(col) + 2)
        worksheet.column_dimensions[chr(65 + i)].width = column_len

This approach gives you access to the underlying Excel worksheet, allowing you to adjust column widths to fit your data better.

Exporting Multiple DataFrames to Different Sheets

You can export multiple DataFrames to different worksheets in the same Excel file:

import pandas as pd

# Create sample DataFrames
employees_df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 42],
    'Department': ['IT', 'HR', 'IT', 'Finance']
})

salaries_df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Salary': [75000, 80000, 65000, 90000],
    'Bonus': [5000, 6000, 3000, 10000]
})

# Export to multiple sheets
with pd.ExcelWriter('company_data.xlsx', engine='openpyxl') as writer:
    employees_df.to_excel(writer, sheet_name='Employees', index=False)
    salaries_df.to_excel(writer, sheet_name='Salaries', index=False)

This creates an Excel file with two sheets: 'Employees' and 'Salaries'.

Advanced Formatting with XlsxWriter

If you need more advanced formatting, XlsxWriter is a great option:

import pandas as pd

# Create a sample DataFrame
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 42],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
    'Salary': [75000, 80000, 65000, 90000]
}

df = pd.DataFrame(data)

# Create a Pandas Excel writer using XlsxWriter as the engine
with pd.ExcelWriter('styled_employees.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Employees', index=False)
    
    # Get workbook and worksheet objects
    workbook = writer.book
    worksheet = writer.sheets['Employees']
    
    # Create a format for the header
    header_format = workbook.add_format({
        'bold': True,
        'text_wrap': True,
        'valign': 'top',
        'fg_color': '#D7E4BC',
        'border': 1
    })
    
    # Apply the header format to the first row
    for col_num, value in enumerate(df.columns.values):
        worksheet.write(0, col_num, value, header_format)
    
    # Add a number format for the salary column
    money_format = workbook.add_format({'num_format': '$#,##0'})
    worksheet.set_column('D:D', None, money_format)  # Apply to salary column
    
    # Auto-fit columns
    for i, col in enumerate(df.columns):
        column_len = max(df[col].astype(str).map(len).max(), len(col) + 2)
        worksheet.set_column(i, i, column_len)

This example creates an Excel file with:

Formatted headers with background color and borders
Currency formatting for the salary column
Auto-fitted columns

Real-World Example: Sales Report Export

Let's create a more comprehensive example of a sales report that might be used in a business context:

import pandas as pd
import numpy as np
from datetime import datetime, timedelta

# Create sample sales data
np.random.seed(42)  # For reproducible results
dates = pd.date_range(start='2023-01-01', periods=100, freq='D')
products = ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard']
regions = ['North', 'South', 'East', 'West']

sales_data = []
for _ in range(500):
    date = np.random.choice(dates)
    product = np.random.choice(products)
    region = np.random.choice(regions)
    quantity = np.random.randint(1, 10)
    price = np.random.choice([899, 699, 499, 349, 79])
    sales_data.append({
        'Date': date,
        'Product': product,
        'Region': region,
        'Quantity': quantity,
        'Unit Price': price,
        'Revenue': quantity * price
    })

# Create DataFrame
sales_df = pd.DataFrame(sales_data)

# Create summary DataFrames
product_summary = sales_df.groupby('Product').agg({
    'Quantity': 'sum', 
    'Revenue': 'sum'
}).reset_index()

region_summary = sales_df.groupby('Region').agg({
    'Quantity': 'sum',
    'Revenue': 'sum'
}).reset_index()

monthly_summary = sales_df.copy()
monthly_summary['Month'] = monthly_summary['Date'].dt.strftime('%Y-%m')
monthly_summary = monthly_summary.groupby('Month').agg({
    'Quantity': 'sum',
    'Revenue': 'sum'
}).reset_index()

# Export to Excel with formatting
with pd.ExcelWriter('sales_report.xlsx', engine='xlsxwriter') as writer:
    # Write DataFrames to different sheets
    sales_df.to_excel(writer, sheet_name='Raw Data', index=False)
    product_summary.to_excel(writer, sheet_name='By Product', index=False)
    region_summary.to_excel(writer, sheet_name='By Region', index=False)
    monthly_summary.to_excel(writer, sheet_name='Monthly Trends', index=False)
    
    # Get workbook and add formatting
    workbook = writer.book
    
    # Add formats
    header_format = workbook.add_format({
        'bold': True, 
        'bg_color': '#4F81BD', 
        'font_color': 'white',
        'border': 1
    })
    
    money_format = workbook.add_format({'num_format': '$#,##0.00'})
    date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
    
    # Format each sheet
    for sheet_name in ['Raw Data', 'By Product', 'By Region', 'Monthly Trends']:
        worksheet = writer.sheets[sheet_name]
        df = sales_df if sheet_name == 'Raw Data' else (
            product_summary if sheet_name == 'By Product' else (
            region_summary if sheet_name == 'By Region' else monthly_summary
        ))
        
        # Format headers
        for col_num, value in enumerate(df.columns.values):
            worksheet.write(0, col_num, value, header_format)
        
        # Set column formats
        if 'Revenue' in df.columns:
            revenue_col = df.columns.get_loc('Revenue')
            worksheet.set_column(revenue_col, revenue_col, None, money_format)
            
        if 'Unit Price' in df.columns:
            price_col = df.columns.get_loc('Unit Price')
            worksheet.set_column(price_col, price_col, None, money_format)
            
        if 'Date' in df.columns:
            date_col = df.columns.get_loc('Date')
            worksheet.set_column(date_col, date_col, None, date_format)
        
        # Auto-fit columns
        for i, col in enumerate(df.columns):
            if col != 'Date':  # Skip date column as it's already formatted
                column_len = max(df[col].astype(str).map(len).max(), len(col) + 2)
                worksheet.set_column(i, i, column_len)
            else:
                worksheet.set_column(i, i, 12)  # Fixed width for date column
    
    # Add charts to the Product summary sheet
    product_chart = workbook.add_chart({'type': 'column'})
    product_chart.add_series({
        'name': 'Revenue by Product',
        'categories': ['By Product', 1, 0, len(product_summary), 0],
        'values': ['By Product', 1, 2, len(product_summary), 2],
    })
    product_chart.set_title({'name': 'Revenue by Product'})
    product_chart.set_size({'width': 500, 'height': 300})
    writer.sheets['By Product'].insert_chart('E2', product_chart)

This example creates a comprehensive sales report with:

Raw data sheet with all sales records
Summary sheets for products, regions, and monthly trends
Proper formatting for headers, currency values, and dates
A chart showing revenue by product

Common Issues and Solutions

Issue 1: Missing Excel Engine

If you encounter an error like ImportError: Missing optional dependency 'openpyxl', you need to install the appropriate Excel engine:

# Install the required engine
# pip install openpyxl  # For .xlsx files
# pip install xlwt      # For .xls files (legacy)

Issue 2: Excel File Already Exists

If you're running your code repeatedly, you might encounter a PermissionError if the Excel file is still open. Always close the file before running your code again or use a different filename.

Issue 3: Date and Time Formatting Issues

Excel can sometimes misinterpret date formats. Use explicit formatting:

with pd.ExcelWriter('dates_example.xlsx', engine='xlsxwriter') as writer:
    df_with_dates.to_excel(writer, index=False)
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']
    date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
    date_col_idx = df_with_dates.columns.get_loc('date_column')
    worksheet.set_column(date_col_idx, date_col_idx, None, date_format)

Summary

Exporting Pandas DataFrames to Excel is a versatile process that allows you to:

Create basic Excel exports with minimal code
Add formatting and styling to your Excel files
Export multiple DataFrames to different sheets in the same file
Create professional reports with charts and formatting
Customize exports to match your exact requirements

Whether you're creating a simple data export or a complex formatted report, Pandas provides the tools you need to generate Excel files for any purpose.

Additional Resources

Exercises

Basic Export: Create a DataFrame with student grades and export it to Excel without the index.
Multiple Sheets: Create two DataFrames (one for students, one for courses) and export them to different sheets in the same Excel file.
Formatted Report: Create a sales report DataFrame and export it with proper formatting (headers, currency format for prices).
Conditional Formatting: Export a DataFrame with conditional formatting that highlights cells above a certain threshold.
Charts and Graphs: Export a DataFrame and add a bar chart representation of the data.

If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)

Introduction​

Prerequisites​

Basic Excel Export with Pandas​

Configuring the Excel Export​

Working with Excel Writers for More Control​

Exporting Multiple DataFrames to Different Sheets​

Advanced Formatting with XlsxWriter​

Real-World Example: Sales Report Export​

Common Issues and Solutions​

Issue 1: Missing Excel Engine​

Issue 2: Excel File Already Exists​

Issue 3: Date and Time Formatting Issues​

Summary​

Additional Resources​

Exercises​