Pandas Area Plots
Area plots are a powerful visualization technique that helps you understand how different components contribute to a whole over a continuous interval, typically time. In this tutorial, we'll explore how to create and customize area plots using Pandas' built-in plotting capabilities.
Introduction to Area Plots
Area plots are essentially line plots where the area between the line and the axis is filled with color. They're particularly useful for:
- Visualizing cumulative values over time
- Comparing proportions of different categories
- Displaying stacked contributions to a total
Pandas makes creating these plots straightforward through its integration with Matplotlib, providing an easy-to-use interface for data visualization.
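To make that integration concrete, here is a minimal sketch showing that the object returned by .plot.area() is an ordinary Matplotlib Axes, which is why the Matplotlib calls used later in this tutorial work on it. The tiny DataFrame and the Agg backend line are assumptions added only so the snippet runs anywhere, including without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.axes import Axes

# Hypothetical values, used only to have something to draw
df = pd.DataFrame({"value": [1, 3, 2, 4]})

# Pandas draws the filled area and hands back the Matplotlib Axes
ax = df.plot.area()
print(isinstance(ax, Axes))

# Any Axes method is available from here on
ax.set_title("A minimal area plot")
plt.close("all")
```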
Basic Area Plot
Let's start with a simple area plot to understand the fundamentals. First, we'll need to import the necessary libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Set the style for better visuals
plt.style.use('ggplot')
Now, let's create some sample data and plot it:
# Create a DataFrame with random data
# Month-end dates; pandas 2.2+ prefers the alias 'ME' and warns on 'M'
dates = pd.date_range('2022-01-01', periods=12, freq='M')
data = pd.DataFrame({
    'Sales': np.random.randint(100, 200, 12),
    'Expenses': np.random.randint(50, 150, 12)
}, index=dates)
# Create a basic area plot
data.plot.area(figsize=(10, 6))
plt.title('Sales and Expenses Over Time')
plt.ylabel('Amount ($)')
plt.tight_layout()
plt.show()
In this example:
- We created a DataFrame with two columns: 'Sales' and 'Expenses'
- The .plot.area() method transforms this data into an area plot
- By default, area plots are stacked, meaning each series starts where the previous one ends
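To see what "stacked" means numerically, the following sketch computes the lower and upper edge of each layer by hand. The values are hypothetical and chosen to keep the arithmetic easy; the point is only that the top of each layer is a running total across the columns.

```python
import pandas as pd

# Hypothetical values, not the random data used above
df = pd.DataFrame({"Sales": [100, 120, 150], "Expenses": [60, 80, 70]})

# Upper edge of each stacked layer: running total across the columns
upper = df.cumsum(axis=1)

# Lower edge of each layer: its upper edge minus its own height
lower = upper - df

print(upper)
print(lower)
```

The Expenses layer starts at the Sales values and ends at Sales + Expenses, which matches what the stacked plot draws.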
Unstacked Area Plots
If you want to show each series independently rather than stacked, you can use the stacked parameter:
# Create an unstacked area plot
data.plot.area(stacked=False, figsize=(10, 6), alpha=0.5)
plt.title('Sales and Expenses Over Time (Unstacked)')
plt.ylabel('Amount ($)')
plt.tight_layout()
plt.show()
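Notice that we added alpha=0.5 to make the areas semi-transparent, which helps when areas overlap. Pandas already uses alpha=0.5 for unstacked area plots unless told otherwise, so passing it explicitly mainly documents the intent.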
Customizing Area Plots
Let's explore how to customize our area plots further:
# Create a more customized area plot
ax = data.plot.area(
    figsize=(10, 6),
    color=['#5cb85c', '#d9534f'],  # Custom colors
    alpha=0.7,                     # Transparency
    stacked=True                   # Stacked areas
)
# Customize the plot further
ax.set_title('Monthly Sales and Expenses (2022)', fontsize=16)
ax.set_ylabel('Amount ($)', fontsize=12)
ax.set_xlabel('Month', fontsize=12)
ax.legend(loc='upper left', frameon=True)
ax.grid(True, linestyle='--', alpha=0.7)
# Format y-axis with comma separator
import matplotlib.ticker as ticker
ax.yaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
plt.tight_layout()
plt.show()
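The format string passed to StrMethodFormatter can be tried out without drawing anything, because a formatter is callable with a tick value. This small sketch checks two format strings; the sample value 1500 is an arbitrary choice for illustration.

```python
import matplotlib.ticker as ticker

plain = ticker.StrMethodFormatter('{x:,.0f}')
dollars = ticker.StrMethodFormatter('${x:,.0f}')

# Calling the formatter with a tick value returns the label text
label_plain = plain(1500)      # '1,500'
label_dollars = dollars(1500)  # '$1,500'

print(label_plain)
print(label_dollars)
```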
Normalized Area Plots (Percentage)
Sometimes you want to see the relative proportions rather than absolute values. DataFrame.plot.area() has no normalize parameter, so divide each row by its total before plotting; every row then sums to 1 and the stacked areas fill the full height:
# Generate more sample data with 4 categories
data_extended = pd.DataFrame({
    'Product A': np.random.randint(10, 30, 12),
    'Product B': np.random.randint(15, 40, 12),
    'Product C': np.random.randint(20, 50, 12),
    'Product D': np.random.randint(10, 35, 12)
}, index=dates)
# Convert each row to shares of its total
data_share = data_extended.div(data_extended.sum(axis=1), axis=0)
# Create a normalized area plot (percentage)
ax = data_share.plot.area(
    figsize=(10, 6),
    stacked=True,
)
ax.set_title('Product Mix Over Time (Percentage)', fontsize=14)
ax.set_ylabel('Percentage (%)', fontsize=12)
ax.set_xlabel('Month', fontsize=12)
ax.set_ylim(0, 1)
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=1.0))
plt.tight_layout()
plt.show()
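The row-wise normalization behind a percentage plot is easy to check on a tiny table. This sketch uses hypothetical counts and divides each row by its own total, so every row of the result sums to 1.

```python
import pandas as pd

# Hypothetical counts, small enough to verify by hand
counts = pd.DataFrame({"Product A": [10, 30], "Product B": [30, 30]})

# Divide each row by its row total (axis=0 aligns the totals with the rows)
shares = counts.div(counts.sum(axis=1), axis=0)

print(shares)
print(shares.sum(axis=1))
```

In the first row, 10 out of 40 becomes 0.25 and 30 out of 40 becomes 0.75, which PercentFormatter(xmax=1.0) would label as 25% and 75%.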
Real-world Example: Climate Data Visualization
Let's look at a practical example using climate data to show temperature ranges throughout the year:
# Create sample climate data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
climate_data = pd.DataFrame({
    'Max Temp': [8, 10, 14, 18, 22, 26, 30, 29, 25, 20, 15, 10],
    'Mean Temp': [5, 6, 9, 12, 16, 20, 24, 23, 19, 15, 10, 6],
    'Min Temp': [1, 2, 4, 7, 11, 15, 18, 17, 13, 9, 5, 2]
}, index=months)
# Create a temperature range area plot
ax = climate_data.plot.area(
    y=['Min Temp', 'Mean Temp', 'Max Temp'],
    figsize=(12, 7),
    stacked=False,
    alpha=0.5,
    color=['#3498db', '#2ecc71', '#e74c3c']
)
ax.set_title('Temperature Ranges Throughout the Year', fontsize=16)
ax.set_ylabel('Temperature (°C)', fontsize=12)
ax.set_xlabel('Month', fontsize=12)
ax.legend(loc='upper right')
ax.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
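The unstacked plot above fills every series down to zero, so the three areas overlap. If you would rather shade only the band between the minimum and maximum, one alternative (a different technique, not part of the pandas area plot) is Matplotlib's fill_between. The sketch below reuses the same sample temperatures; the Agg backend line is an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
temps = pd.DataFrame({
    'Max Temp': [8, 10, 14, 18, 22, 26, 30, 29, 25, 20, 15, 10],
    'Mean Temp': [5, 6, 9, 12, 16, 20, 24, 23, 19, 15, 10, 6],
    'Min Temp': [1, 2, 4, 7, 11, 15, 18, 17, 13, 9, 5, 2]
}, index=months)

fig, ax = plt.subplots(figsize=(12, 7))
x = list(range(len(temps)))

# Shade only the region between the minimum and maximum
ax.fill_between(x, temps['Min Temp'], temps['Max Temp'],
                color='#e74c3c', alpha=0.3, label='Min to max range')
# Draw the mean as a line on top of the band
ax.plot(x, temps['Mean Temp'], color='#2ecc71', label='Mean Temp')

# Label the integer positions with the month names
ax.set_xticks(x)
ax.set_xticklabels(temps.index)
ax.set_ylabel('Temperature (°C)')
ax.legend(loc='upper right')
plt.close(fig)
```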
Finance Application: Portfolio Composition Over Time
Area plots are particularly useful for financial data. Let's visualize a portfolio's changing composition:
# Sample portfolio data over time
portfolio_data = pd.DataFrame({
    'Stocks': [45000, 47000, 50000, 52000, 54000, 58000, 63000],
    'Bonds': [30000, 31000, 30000, 29500, 29000, 28000, 27000],
    'Real Estate': [15000, 15500, 16000, 16500, 17000, 17500, 18000],
    'Cash': [10000, 8000, 7000, 6000, 5000, 6500, 8000]
}, index=pd.date_range('2023-01-01', periods=7, freq='M'))
# Create a portfolio composition plot
ax = portfolio_data.plot.area(
    figsize=(10, 6),
    stacked=True,
    alpha=0.8,
    cmap='viridis'  # Color map for attractive color scheme
)
ax.set_title('Portfolio Composition Over Time', fontsize=16)
ax.set_ylabel('Value ($)', fontsize=12)
ax.set_xlabel('Date', fontsize=12)
ax.legend(loc='upper left')
# Format y-axis with dollar signs and commas
ax.yaxis.set_major_formatter(ticker.StrMethodFormatter('${x:,.0f}'))
# Calculate and show the total portfolio value above each date
total_values = portfolio_data.sum(axis=1)
for date, total in total_values.items():
    # Use the date, not the row number, as the x position on the time axis
    ax.text(date, total + 500, f'${total:,.0f}',
            ha='center', fontweight='bold')
plt.tight_layout()
plt.show()
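The totals printed above the stack, and each asset's share of them, can be checked with plain arithmetic. This sketch repeats the sample portfolio values; the date index is left out on purpose, since only the sums matter here.

```python
import pandas as pd

portfolio = pd.DataFrame({
    'Stocks': [45000, 47000, 50000, 52000, 54000, 58000, 63000],
    'Bonds': [30000, 31000, 30000, 29500, 29000, 28000, 27000],
    'Real Estate': [15000, 15500, 16000, 16500, 17000, 17500, 18000],
    'Cash': [10000, 8000, 7000, 6000, 5000, 6500, 8000]
})

# Total value per period: the top edge of the stacked plot
totals = portfolio.sum(axis=1)

# Share of the total held in stocks for each period
stock_share = portfolio['Stocks'] / totals

print(totals.tolist())
print(stock_share.round(3).tolist())
```

The first period sums to 100,000 with 45% in stocks, and the last sums to 116,000.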
Advanced Techniques: Highlighting a Specific Area
Sometimes you might want to draw attention to a specific area of your plot:
# Sample data for product sales across regions
regions_data = pd.DataFrame({
    'North': [10, 13, 14, 12, 10, 15, 20, 22, 18, 15, 13, 10],
    'South': [8, 7, 9, 12, 15, 18, 21, 19, 15, 12, 10, 8],
    'East': [12, 11, 10, 13, 16, 18, 19, 20, 18, 14, 13, 11],
    'West': [9, 10, 12, 13, 14, 16, 19, 21, 20, 16, 12, 10],
    'Central': [11, 12, 13, 15, 17, 19, 22, 24, 21, 18, 15, 13]
}, index=pd.date_range('2023-01-01', periods=12, freq='M'))
# Create the base area plot
fig, ax = plt.subplots(figsize=(12, 7))
regions_data.plot.area(ax=ax, alpha=0.7, stacked=True, cmap='tab10')
# Highlight a specific region - "West"
west_bottom = regions_data[['North', 'South', 'East']].sum(axis=1).values
west_top = west_bottom + regions_data['West'].values
west_middle = pd.date_range('2023-01-01', periods=12, freq='M')
# Add a red outline to the West region
ax.plot(west_middle, west_bottom, color='red', linewidth=2)
ax.plot(west_middle, west_top, color='red', linewidth=2)
# Add some custom annotation
ax.annotate('Strong growth in West region',
            xy=(west_middle[7], (west_bottom[7] + west_top[7])/2),
            xytext=(west_middle[9], west_bottom[9] + 10),
            arrowprops=dict(facecolor='black', shrink=0.05, width=1.5),
            fontsize=12,
            fontweight='bold')
ax.set_title('Regional Sales Distribution', fontsize=16)
ax.set_ylabel('Sales (Units)', fontsize=12)
ax.set_xlabel('Month', fontsize=12)
plt.tight_layout()
plt.show()
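The bottom and top edges of the highlighted layer were computed by naming the columns below it. A more general approach is sketched here with a hypothetical helper, layer_bounds (not a pandas function), which derives the edges of any column from a running total. It assumes non-negative values and that the columns are stacked in DataFrame order. The data is the first three months of the regional sample.

```python
import pandas as pd

def layer_bounds(df, column):
    """Return (bottom, top) edges of `column` in a stacked plot of `df`.

    Hypothetical helper: assumes non-negative values and that layers
    are stacked in the DataFrame's column order.
    """
    top = df.cumsum(axis=1)[column]   # running total up to and including column
    bottom = top - df[column]         # everything stacked below it
    return bottom, top

# First three months of the regional sample data
regions = pd.DataFrame({
    'North': [10, 13, 14],
    'South': [8, 7, 9],
    'East': [12, 11, 10],
    'West': [9, 10, 12],
    'Central': [11, 12, 13]
})

west_bottom, west_top = layer_bounds(regions, 'West')
print(west_bottom.tolist())
print(west_top.tolist())
```

With this helper, highlighting a different region only requires changing the column name.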
Summary
In this tutorial, we've explored Pandas Area Plots and their various applications:
- Basic area plots for showing cumulative values
- Unstacked area plots for comparing individual series
- Normalized area plots for showing percentages
- Customization options including colors, transparency, and annotations
- Real-world applications in climate data and financial analysis
Area plots are excellent for visualizing:
- Part-to-whole relationships over time
- Cumulative values
- Changes in composition
- Ranges of values
Exercises
To practice your skills with Pandas area plots, try these exercises:
- Create an area plot showing the distribution of time spent on different activities over a week
- Visualize a company's revenue breakdown by product category over several quarters
- Create a normalized area plot showing browser market share changes over time
- Build a climate visualization showing precipitation types (rain, snow, sleet) by month