Pandas Bar Plots

Bar plots are one of the most common and effective ways to visualize categorical data. They're especially useful for comparing values across different categories. In this tutorial, we'll explore how to create bar plots using Pandas' built-in plotting capabilities.

Introduction to Bar Plots in Pandas

Pandas provides simple yet powerful functionality for creating bar plots directly from DataFrames and Series. Under the hood, Pandas leverages Matplotlib to generate these visualizations, but offers a more streamlined API that's perfect for quick data exploration.

Bar plots are ideal for:

  • Comparing values across categories
  • Displaying frequencies or counts
  • Showing the distribution of categorical data
  • Visualizing before/after scenarios

Let's dive in and learn how to create these useful visualizations!
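As a first taste, the "frequencies or counts" case can be sketched with a Series alone. The survey answers below are made up for illustration; the point is only that value_counts() produces a Series whose index becomes the bar labels and whose values become the bar heights.

```python
import pandas as pd

# Hypothetical survey answers; any categorical Series works the same way
answers = pd.Series(['Yes', 'No', 'Yes', 'Maybe', 'Yes', 'No'])

# value_counts() tallies each category, most frequent first
counts = answers.value_counts()
print(counts)

# A Series plots directly: index -> bar labels, values -> bar heights
ax = counts.plot(kind='bar', title='Survey answers')
ax.set_ylabel('Count')
# Call plt.show() (from matplotlib.pyplot) when running this as a script
```

Because the counts are already sorted, the bars appear from the most to the least frequent answer without any extra work.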

Basic Bar Plot with Pandas

To get started, we'll need to import the necessary libraries and create some sample data:

python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

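# Set style for better aesthetics
# ('seaborn-v0_8' needs Matplotlib 3.6+; older releases call it 'seaborn')
plt.style.use('seaborn-v0_8')

# For Jupyter notebooks, use this to show plots inline
# (this is an IPython magic; remove it in a plain .py script)
%matplotlib inline

# Create sample data
data = {'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
        'Sales': [300, 400, 150, 80, 120]}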

df = pd.DataFrame(data)
print(df)

This will output:

    Product  Sales
0    Laptop    300
1     Phone    400
2   Monitor    150
3  Keyboard     80
4     Mouse    120

Now, let's create a simple bar plot showing the sales of each product:

python
# Create a basic bar plot
df.plot(kind='bar', x='Product', y='Sales', figsize=(10, 6))
plt.title('Product Sales Comparison')
plt.ylabel('Sales (units)')
plt.xlabel('Products')
plt.show()

Basic Bar Plot

In this example:

  • kind='bar' specifies that we want a bar plot
  • x='Product' sets the x-axis labels
  • y='Sales' determines the height of each bar
  • figsize=(10, 6) sets the figure dimensions
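The parameters above can also be passed through the df.plot.bar accessor, which is another spelling of kind='bar'. The sketch below rebuilds the same sample data and shows that the call returns a Matplotlib Axes, so the result can be styled or inspected directly. The variable name heights is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
                   'Sales': [300, 400, 150, 80, 120]})

# df.plot.bar(...) is the accessor spelling of df.plot(kind='bar', ...)
ax = df.plot.bar(x='Product', y='Sales', figsize=(10, 6))

# The return value is a Matplotlib Axes, so it can be styled directly
ax.set_title('Product Sales Comparison')
ax.set_ylabel('Sales (units)')

# One rectangle (patch) is drawn per row; its height is the Sales value
heights = [bar.get_height() for bar in ax.patches]
print(heights)
```

Working with the returned ax instead of the global plt functions becomes helpful once a figure holds more than one subplot.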

Horizontal Bar Plots

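Horizontal bar plots are often easier to read, especially when category names are long. Pass kind='barh' to create one. Note that x still names the category column, even though the categories are now drawn along the vertical axis: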

python
# Create a horizontal bar plot
df.plot(kind='barh', x='Product', y='Sales', figsize=(10, 6))
plt.title('Product Sales Comparison')
plt.xlabel('Sales (units)')
plt.ylabel('Products')
plt.show()

Horizontal Bar Plot
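Horizontal bars are often paired with sorting so the chart reads as a ranking. The following is a minimal sketch of that idea, not part of the example above: it sorts the same sample data in ascending order, because barh draws the first row at the bottom, which puts the largest value on top.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
                   'Sales': [300, 400, 150, 80, 120]})

# Ascending sort: the first row is drawn at the bottom, the last row on top
ordered = df.sort_values('Sales')
ax = ordered.plot(kind='barh', x='Product', y='Sales', legend=False)
ax.set_xlabel('Sales (units)')
ax.set_ylabel('Products')

# For horizontal bars the data value is the bar's width, not its height
widths = [bar.get_width() for bar in ax.patches]
print(ordered['Product'].tolist())
```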

Customizing Bar Plots

Let's enhance our bar plot with customizations:

python
# Create a customized bar plot
ax = df.plot(kind='bar', x='Product', y='Sales', figsize=(12, 7),
             color='skyblue', edgecolor='black', width=0.7)

# Add title and labels with custom font sizes
plt.title('Product Sales Comparison', fontsize=16)
plt.ylabel('Sales (units)', fontsize=14)
plt.xlabel('Products', fontsize=14)

# Add data values on top of each bar (x = bar position, y = just above the bar)
for i, v in enumerate(df['Sales']):
    ax.text(i, v + 5, str(v), ha='center', fontsize=12)

# Customize grid
plt.grid(axis='y', linestyle='--', alpha=0.7)

# Customize ticks
plt.xticks(rotation=45)

plt.tight_layout()
plt.show()

Customized Bar Plot
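The value labels above are placed by hand with ax.text. As an alternative sketch, assuming Matplotlib 3.4 or newer is installed, Axes.bar_label can position the labels automatically from the bars that Pandas drew. The padding value is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
                   'Sales': [300, 400, 150, 80, 120]})

ax = df.plot(kind='bar', x='Product', y='Sales',
             color='skyblue', edgecolor='black', legend=False)

# ax.containers[0] groups the five bars drawn for the Sales column;
# bar_label reads each bar's value and writes it just above the bar
labels = ax.bar_label(ax.containers[0], padding=3, fontsize=12)
print([label.get_text() for label in labels])
```

This avoids hard-coding the vertical offset (the v + 5 above), which would otherwise need adjusting whenever the data scale changes.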

Grouped Bar Plots

Grouped bar plots allow you to compare multiple variables across categories:

python
# Create sample data for grouped bar plot
data = {'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
        'Sales_2021': [300, 400, 150, 80, 120],
        'Sales_2022': [350, 450, 200, 70, 140]}

df_grouped = pd.DataFrame(data)
print(df_grouped)

# Create grouped bar plot
df_grouped.plot(kind='bar', x='Product', y=['Sales_2021', 'Sales_2022'],
                figsize=(12, 7), width=0.7)

plt.title('Product Sales Comparison: 2021 vs 2022', fontsize=16)
plt.ylabel('Sales (units)', fontsize=14)
plt.xlabel('Products', fontsize=14)
plt.legend(['2021', '2022'])
plt.grid(axis='y', linestyle='--', alpha=0.7)

plt.tight_layout()
plt.show()

Output:

    Product  Sales_2021  Sales_2022
0    Laptop         300         350
1     Phone         400         450
2   Monitor         150         200
3  Keyboard          80          70
4     Mouse         120         140

Grouped Bar Plot
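The grouping itself comes from the DataFrame layout: each plotted column becomes one series of bars. The sketch below makes that visible by moving Product into the index, so that x and y no longer need to be passed, and then counting the bar containers on the returned Axes.

```python
import pandas as pd

df_grouped = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
    'Sales_2021': [300, 400, 150, 80, 120],
    'Sales_2022': [350, 450, 200, 70, 140],
})

# With Product as the index, every remaining numeric column becomes one bar series
ax = df_grouped.set_index('Product').plot(kind='bar', figsize=(12, 7), width=0.7)
ax.legend(['2021', '2022'])

# One container per column, one bar per product inside each container
series_sizes = [len(container) for container in ax.containers]
print(len(ax.containers), series_sizes)
```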

Stacked Bar Plots

Stacked bar plots show how each category's total is composed. The series are drawn on top of one another, so the full height of a bar is the sum across series:

python
# Create stacked bar plot
df_grouped.plot(kind='bar', x='Product', y=['Sales_2021', 'Sales_2022'],
                figsize=(12, 7), stacked=True, width=0.7)

plt.title('Product Sales Stacked: 2021 vs 2022', fontsize=16)
plt.ylabel('Total Sales (units)', fontsize=14)
plt.xlabel('Products', fontsize=14)
plt.legend(['2021', '2022'])
plt.grid(axis='y', linestyle='--', alpha=0.7)

plt.tight_layout()
plt.show()

Stacked Bar Plot
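To see what stacked=True actually does, the following sketch inspects the drawn bars. It rebuilds the same data and checks that each 2022 bar starts where the matching 2021 bar ends. The names bottoms and tops are only for illustration.

```python
import pandas as pd

df_grouped = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Monitor', 'Keyboard', 'Mouse'],
    'Sales_2021': [300, 400, 150, 80, 120],
    'Sales_2022': [350, 450, 200, 70, 140],
})

ax = df_grouped.plot(kind='bar', x='Product',
                     y=['Sales_2021', 'Sales_2022'], stacked=True)

# The first container holds the 2021 bars, the second the 2022 bars
first, second = ax.containers

# Each 2022 bar begins at the top of the 2021 bar below it...
bottoms = [bar.get_y() for bar in second]
# ...and ends at the two-year total for that product
tops = [bar.get_y() + bar.get_height() for bar in second]
print(bottoms, tops)
```

Only the bottom series shares a common baseline, which is why stacked plots are better for reading totals than for comparing the upper series across categories.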

Real-world Example: Sales Data Analysis

Let's work through a more comprehensive example analyzing monthly sales data:

python
# Create more realistic sample data
np.random.seed(42)
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

data = {
    'Month': months,
    'Electronics': np.random.randint(50000, 100000, 12),
    'Clothing': np.random.randint(30000, 70000, 12),
    'Home & Kitchen': np.random.randint(20000, 50000, 12),
    'Books': np.random.randint(10000, 30000, 12)
}

sales_df = pd.DataFrame(data)
print(sales_df.head())

# Calculate total sales
sales_df['Total'] = sales_df[['Electronics', 'Clothing', 'Home & Kitchen', 'Books']].sum(axis=1)

# Find top 3 months by total sales
top_months = sales_df.sort_values('Total', ascending=False).head(3)['Month'].values
print(f"Top 3 months by sales: {', '.join(top_months)}")

Output (illustrative; the exact numbers depend on the random draws in your environment):

  Month  Electronics  Clothing  Home & Kitchen  Books
0   Jan        51658     39771           25506  17269
1   Feb        92959     42415           25663  28287
2   Mar        56762     50366           47834  15367
3   Apr        69021     49401           22234  23236
4   May        56307     53194           30846  29437

Top 3 months by sales: Feb, Jul, Oct
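The sort-then-head step used to find the top months can also be written with DataFrame.nlargest. The sketch below uses a small hand-made table (hypothetical totals, not the random data above) so the result is easy to check by eye, and confirms that both spellings pick the same months.

```python
import pandas as pd

# Small hypothetical totals, chosen by hand so the ranking is obvious
totals = pd.DataFrame({'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
                       'Total': [120, 310, 95, 280, 150]})

# Approach used in the tutorial: sort descending, keep the first three rows
via_sort = totals.sort_values('Total', ascending=False).head(3)['Month'].tolist()

# Equivalent shortcut: nlargest returns the rows with the three largest totals
top3 = totals.nlargest(3, 'Total')
via_nlargest = top3['Month'].tolist()
print(via_sort, via_nlargest)

# The selected rows can go straight into a bar plot
ax = top3.plot(kind='bar', x='Month', y='Total', legend=False, rot=0)
```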

Now, let's visualize the result. This example uses seaborn's barplot for the grouped bars, so seaborn must be installed and imported:

python
import seaborn as sns

# Melt the dataframe into long format: one row per (Month, Category) pair
plot_df = sales_df.melt(id_vars=['Month'],
                        value_vars=['Electronics', 'Clothing', 'Home & Kitchen', 'Books'],
                        var_name='Category', value_name='Sales')

# Create a grouped bar plot
plt.figure(figsize=(14, 8))
chart = sns.barplot(data=plot_df, x='Month', y='Sales', hue='Category')

# Highlight top months (categories sit at integer positions 0, 1, 2, ...)
for month in top_months:
    idx = months.index(month)
    plt.axvspan(idx - 0.4, idx + 0.4, alpha=0.1, color='red')

# Add title and labels
plt.title('Monthly Sales by Product Category', fontsize=16)
plt.ylabel('Sales ($)', fontsize=14)
plt.xlabel('Month', fontsize=14)
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.7)

# Add a text annotation for top months
plt.text(0.5, 0.95, f"Top months highlighted: {', '.join(top_months)}",
         transform=plt.gca().transAxes, ha='center',
         bbox=dict(facecolor='white', alpha=0.5))

plt.tight_layout()
plt.show()
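The melt step is what turns the wide table into the long format that seaborn's hue parameter expects. Here is a minimal sketch on a two-month toy table (made-up numbers) that shows the reshaping, followed by the Pandas-only route, which plots the wide table directly and needs no melt.

```python
import pandas as pd

# Toy wide table: one column per category (values are made up)
wide = pd.DataFrame({'Month': ['Jan', 'Feb'],
                     'Electronics': [100, 120],
                     'Books': [30, 25]})

# Long format: one row per (Month, Category) pair
long = wide.melt(id_vars=['Month'], value_vars=['Electronics', 'Books'],
                 var_name='Category', value_name='Sales')
print(long)

# Pandas-only alternative: the wide columns become the grouped bars
ax = wide.set_index('Month').plot(kind='bar', rot=0)
ax.set_ylabel('Sales ($)')
```

Which route to take is mostly a matter of which shape the data already has: long data suits seaborn, wide data suits DataFrame.plot.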

Percentage Bar Plots

Sometimes, you want to show the relative proportions rather than absolute values:

python
from matplotlib.ticker import PercentFormatter

# Calculate percentage contribution for each category
category_cols = ['Electronics', 'Clothing', 'Home & Kitchen', 'Books']
for col in category_cols:
    sales_df[f'{col}_pct'] = sales_df[col] / sales_df['Total'] * 100

# Create percentage stacked bar plot
pct_cols = [f'{col}_pct' for col in category_cols]
sales_df.plot(kind='bar', x='Month', y=pct_cols,
              figsize=(14, 8), stacked=True, width=0.8)

plt.title('Monthly Sales Composition by Category (%)', fontsize=16)
plt.ylabel('Percentage of Total Sales', fontsize=14)
plt.xlabel('Month', fontsize=14)
plt.xticks(rotation=45)
plt.legend(labels=category_cols)
plt.grid(axis='y', linestyle='--', alpha=0.7)

# Add percentage signs to y-axis
plt.gca().yaxis.set_major_formatter(PercentFormatter())

plt.tight_layout()
plt.show()
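The per-column loop above can also be expressed as a single vectorized division. This sketch uses a two-row toy table (made-up numbers that sum to 100 per month, so the percentages are easy to verify) and divides every row by its own total with DataFrame.div.

```python
import pandas as pd
from matplotlib.ticker import PercentFormatter

# Toy data with Month as the index (values are made up)
sales = pd.DataFrame({'Month': ['Jan', 'Feb'],
                      'Electronics': [60, 50],
                      'Clothing': [30, 30],
                      'Books': [10, 20]}).set_index('Month')

# Divide each row by its row total; axis=0 aligns the totals with the rows
pct = sales.div(sales.sum(axis=1), axis=0) * 100
print(pct)

# Every stacked bar now reaches 100
ax = pct.plot(kind='bar', stacked=True, rot=0)
ax.set_ylabel('Percentage of Total Sales')
ax.yaxis.set_major_formatter(PercentFormatter())
```

Keeping the percentages in a separate frame also leaves the original table untouched, rather than adding _pct columns to it.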

Summary

In this tutorial, we've explored the versatility of Pandas for creating bar plots. We covered:

  • Basic vertical and horizontal bar plots
  • Customizing bar plots with colors, labels, and annotations
  • Creating grouped and stacked bar plots
  • Working through real-world examples with sales data
  • Visualizing percentage contributions

Bar plots are one of the most effective ways to compare values across categories, and Pandas makes it remarkably easy to create them from your data.

Additional Resources and Exercises

Practice Exercises

  1. Basic Exercise: Create a bar plot showing the population of the top 10 most populous countries.

  2. Intermediate Exercise: Create a grouped bar plot comparing the quarterly revenue and profit for a company over the last 3 years.

  3. Advanced Exercise: Create a stacked percentage bar plot showing the market share of different smartphone manufacturers over time.

  4. Challenge: Create a bar plot with error bars showing average temperature by month with standard deviation indicators.

Remember, the best way to master bar plots is through practice. Try to incorporate them into your data analysis projects to gain more experience!

Happy plotting!


