Pandas Pie Charts
Introduction
Pie charts are a popular way to visualize data that represents parts of a whole, showing the proportional relationship between different categories. In pandas, you can easily create pie charts using the plotting capabilities built on top of Matplotlib. This guide will walk you through creating, customizing, and interpreting pie charts with pandas to effectively visualize your data.
Pie charts work best when:
- You have a small number of categories (typically fewer than 7)
- You want to show proportions of a whole
- The differences between values are significant enough to be visible
Basic Pie Chart Creation
Let's start with a simple pie chart example using pandas.
Setup
First, import the necessary libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Set the style for better visualizations
plt.style.use('ggplot')
Creating a Simple Pie Chart
Let's create a basic dataset showing the distribution of expenses in a monthly budget:
# Create a simple Series with expense categories
expenses = pd.Series([1200, 600, 400, 300, 200],
index=['Housing', 'Food', 'Transportation', 'Entertainment', 'Utilities'],
name='Monthly Expenses')
print(expenses)
Output:
Housing 1200
Food 600
Transportation 400
Entertainment 300
Utilities 200
Name: Monthly Expenses, dtype: int64
Now, let's visualize this data as a pie chart:
# Create a pie chart from the Series
expenses.plot.pie(figsize=(10, 6), autopct='%1.1f%%', startangle=90)
plt.title('Monthly Expense Distribution', fontsize=15)
plt.ylabel('') # Hide the y-label
plt.show()
In this code:
- figsize=(10, 6): Sets the figure size
- autopct='%1.1f%%': Displays the percentage on each wedge with one decimal place
- startangle=90: Rotates the start of the pie chart by 90 degrees
- We set an empty ylabel to hide the default y-label
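To see where the percentages on the wedges come from, the sketch below recomputes them by hand from the same expense data. It assumes only that each wedge's share is its value divided by the total; the variable names `percentages` and `labels` are illustrative and not part of the original example.

```python
import pandas as pd

# Same expense data as in the example above
expenses = pd.Series(
    [1200, 600, 400, 300, 200],
    index=['Housing', 'Food', 'Transportation', 'Entertainment', 'Utilities'],
    name='Monthly Expenses',
)

# Each wedge's share is its value divided by the total, times 100
percentages = expenses / expenses.sum() * 100

# Apply the same format string that is passed to autopct
labels = {name: '%1.1f%%' % pct for name, pct in percentages.items()}
print(labels)
```

The formatted strings should match the text drawn on the chart, which makes this a quick way to sanity-check a plot against the underlying numbers.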
Customizing Pie Charts
Now, let's explore how to customize the pie chart for better visualization.
Adding Exploded Slices
You can "explode" or separate specific slices to emphasize them:
# Create an explode tuple - Housing and Food will be exploded
explode = (0.1, 0.1, 0, 0, 0) # Only "explode" Housing and Food
expenses.plot.pie(figsize=(10, 6),
autopct='%1.1f%%',
startangle=90,
explode=explode,
shadow=True)
plt.title('Monthly Expense Distribution', fontsize=15)
plt.ylabel('')
plt.show()
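Writing the explode tuple by hand ties it to the order of the data. As a minimal sketch, the hypothetical helper `explode_for` below (not a pandas or Matplotlib function) builds the offsets from label names instead, so the tuple stays correct if the Series is reordered.

```python
import pandas as pd

def explode_for(series, highlight, offset=0.1):
    """Return one offset per slice: `offset` for highlighted labels, 0 otherwise."""
    return [offset if label in highlight else 0 for label in series.index]

expenses = pd.Series(
    [1200, 600, 400, 300, 200],
    index=['Housing', 'Food', 'Transportation', 'Entertainment', 'Utilities'],
    name='Monthly Expenses',
)

# Offsets for Housing and Food, in the same order as the Series index
explode = explode_for(expenses, {'Housing', 'Food'})
print(explode)
```

The resulting list can be passed as `explode=explode` in the call above.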
Customizing Colors and Styles
Let's customize the colors and add a legend:
# Custom colors
colors = ['#ff9999', '#66b3ff', '#99ff99', '#ffcc99', '#c2c2f0']
expenses.plot.pie(figsize=(10, 6),
autopct='%1.1f%%',
startangle=90,
colors=colors,
legend=False, # We'll add a separate legend
wedgeprops={'linewidth': 1, 'edgecolor': 'white'})
plt.title('Monthly Expense Distribution', fontsize=16, pad=20)
plt.ylabel('')
# Add a legend in a good position
plt.legend(expenses.index, loc='upper left', bbox_to_anchor=(1, 1))
# Tight layout to ensure everything fits
plt.tight_layout()
plt.show()
Creating Pie Charts from DataFrame Columns
Often, you'll want to create a pie chart from a column in a DataFrame. Let's see how to do this:
# Create a sample DataFrame
data = {
'Category': ['Housing', 'Food', 'Transportation', 'Entertainment', 'Utilities'],
'Amount': [1200, 600, 400, 300, 200]
}
df = pd.DataFrame(data)
print(df)
Output:
Category Amount
0 Housing 1200
1 Food 600
2 Transportation 400
3 Entertainment 300
4 Utilities 200
Now let's create a pie chart from the DataFrame:
# Create a pie chart using the DataFrame
plt.figure(figsize=(10, 6))
plt.pie(df['Amount'], labels=df['Category'], autopct='%1.1f%%', startangle=90,
colors=colors, wedgeprops={'linewidth': 1, 'edgecolor': 'white'})
plt.title('Monthly Expense Distribution', fontsize=16)
plt.axis('equal') # Equal aspect ratio ensures the pie chart is circular
plt.show()
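The call above goes through Matplotlib directly. A pandas-only variant is sketched below, assuming it is acceptable to move the Category column into the index so that pandas uses it for the wedge labels. The name `amounts` is illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Category': ['Housing', 'Food', 'Transportation', 'Entertainment', 'Utilities'],
    'Amount': [1200, 600, 400, 300, 200],
})

# Move the category names into the index so pandas uses them as wedge labels
amounts = df.set_index('Category')['Amount']

# Series.plot.pie returns the Matplotlib Axes it drew on
ax = amounts.plot.pie(figsize=(10, 6), autopct='%1.1f%%', startangle=90)
ax.set_title('Monthly Expense Distribution', fontsize=16)
ax.set_ylabel('')  # Hide the default y-label (the column name)
```

Call plt.show() afterwards to display the figure, as in the earlier examples.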
Real-world Example: Product Sales Analysis
Let's use a more realistic example to analyze product sales data:
# Create a DataFrame with sales data
sales_data = pd.DataFrame({
'Product': ['Laptops', 'Phones', 'Tablets', 'Smartwatches', 'Headphones', 'Accessories'],
'Sales': [45000, 62000, 28000, 15000, 20000, 12000]
})
print(sales_data)
Output:
Product Sales
0 Laptops 45000
1 Phones 62000
2 Tablets 28000
3 Smartwatches 15000
4 Headphones 20000
5 Accessories 12000
Now, let's visualize this data with a pie chart:
# Calculate the percentage for each category
sales_data['Sales_Percent'] = sales_data['Sales'] / sales_data['Sales'].sum() * 100
sales_data = sales_data.sort_values('Sales', ascending=False)
# Highlight the top seller
explode = [0.1 if i == 0 else 0 for i in range(len(sales_data))]
# Create the pie chart
plt.figure(figsize=(12, 7))
plt.pie(sales_data['Sales'], labels=sales_data['Product'], explode=explode,
autopct=lambda p: f'{p:.1f}%\n(${p * sales_data["Sales"].sum() / 100 / 1000:.0f}k)',  # convert dollars to thousands for the "k" suffix
startangle=90, colors=plt.cm.Paired(np.arange(len(sales_data))),
wedgeprops={'linewidth': 1, 'edgecolor': 'white'})
plt.title('Product Sales Distribution', fontsize=16, pad=20)
plt.axis('equal')
plt.tight_layout()
plt.show()
This example showcases:
- Sorting data to improve visualization
- Highlighting the top seller with explode
- Custom labeling to show both percentage and actual value
- Using a colormap for a consistent color scheme
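The custom labeling can also be written as a small reusable function instead of an inline lambda. The sketch below is one way to do it, assuming the values are in dollars and should be shown in thousands; `make_autopct` is a hypothetical helper name, not a Matplotlib function.

```python
def make_autopct(values):
    """Build an autopct callback that shows the percentage and the value in $k."""
    total = sum(values)

    def autopct(pct):
        # Matplotlib passes the wedge's percentage; convert it back to dollars
        amount = pct * total / 100
        return f'{pct:.1f}%\n(${amount / 1000:.0f}k)'

    return autopct

# Sales figures from the example above
sales = [45000, 62000, 28000, 15000, 20000, 12000]
formatter = make_autopct(sales)

# Label for the Phones wedge (62000 out of 182000)
print(formatter(100 * 62000 / 182000))
```

The returned function can be passed as `autopct=make_autopct(sales_data['Sales'])` in the plt.pie call.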
Pie Chart Best Practices
When using pie charts, consider these best practices:
- Limit the number of categories: Pie charts work best with 6 or fewer categories
- Order segments logically: Sort by size (largest to smallest) or in a logical order
- Use appropriate labels: Clear labels or a legend makes the chart more readable
- Consider alternatives for small differences: If segments are very similar in size, a bar chart might be better
- Use percentages: Display percentages to clearly show proportions
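One common way to limit the number of categories is to fold the smallest slices into a single "Other" slice before plotting. The sketch below illustrates the idea with a hypothetical helper, `group_small_slices`; the cutoff of four slices is chosen only to keep the example short.

```python
import pandas as pd

def group_small_slices(series, max_slices=6, other_label='Other'):
    """Keep the largest (max_slices - 1) values and sum the rest into one slice."""
    ordered = series.sort_values(ascending=False)
    if len(ordered) <= max_slices:
        return ordered
    top = ordered.iloc[:max_slices - 1]
    rest = ordered.iloc[max_slices - 1:].sum()
    return pd.concat([top, pd.Series({other_label: rest})])

# Sales figures from the product example, indexed by product name
sales = pd.Series(
    [45000, 62000, 28000, 15000, 20000, 12000],
    index=['Laptops', 'Phones', 'Tablets', 'Smartwatches', 'Headphones', 'Accessories'],
)

grouped = group_small_slices(sales, max_slices=4)
print(grouped)
```

The grouped Series keeps the same total as the original, so the percentages on the chart still add up to 100.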
Alternative: Donut Chart
A donut chart is a variant of a pie chart with a hole in the center. It can be more visually appealing:
# Create a donut chart
plt.figure(figsize=(10, 6))
plt.pie(expenses, labels=expenses.index, autopct='%1.1f%%', startangle=90,
colors=colors, wedgeprops={'linewidth': 1, 'edgecolor': 'white'})
# Add a circle at the center to create a donut chart
centre_circle = plt.Circle((0,0), 0.70, fc='white')
fig = plt.gcf()
fig.gca().add_artist(centre_circle)
plt.title('Monthly Expense Distribution', fontsize=16)
plt.axis('equal')
plt.tight_layout()
plt.show()
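An alternative to drawing a white circle over the pie is to give the wedges a width through wedgeprops, so the hole is simply never drawn. The sketch below assumes a ring width of 0.3 and a pctdistance of 0.85; both values are illustrative choices rather than settings from the example above.

```python
import pandas as pd
import matplotlib.pyplot as plt

expenses = pd.Series(
    [1200, 600, 400, 300, 200],
    index=['Housing', 'Food', 'Transportation', 'Entertainment', 'Utilities'],
    name='Monthly Expenses',
)

fig, ax = plt.subplots(figsize=(10, 6))
ax.pie(
    expenses,
    labels=expenses.index,
    autopct='%1.1f%%',
    startangle=90,
    pctdistance=0.85,  # Place the percentages inside the ring
    # 'width' makes each wedge a ring segment instead of a full slice
    wedgeprops={'width': 0.3, 'linewidth': 1, 'edgecolor': 'white'},
)
ax.set_title('Monthly Expense Distribution', fontsize=16)
ax.axis('equal')  # Keep the donut circular
```

Because no patch covers the center, this version also works on non-white backgrounds. Call plt.show() to display it.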
Summary
In this guide, we've learned:
- How to create basic pie charts in pandas
- Techniques for customizing pie charts (colors, exploding slices, etc.)
- Creating pie charts from both Series and DataFrames
- Best practices for using pie charts effectively
- How to create variations like donut charts
Pie charts are excellent for showing proportional data when you have a limited number of categories with significant differences. Remember that while pie charts are visually appealing, they're not always the best choice for every dataset.
Additional Resources and Exercises
Exercises
- Basic Practice: Create a pie chart showing the distribution of your weekly activities (sleeping, working, leisure, etc.).
- Intermediate Practice: Generate a pie chart of population distribution by continent using real-world data. Add appropriate styling and labels.
- Advanced Practice: Create a dashboard with both a pie chart and a bar chart showing the same data in different ways. Compare which visualization is more effective for your specific dataset.
- Challenge: Create a nested pie chart (a pie chart with an inner and outer ring) to show hierarchical data, such as product categories and subcategories.