Pandas Period Frequency
Introduction to Periods in Pandas
Time series analysis often works with whole time periods rather than exact timestamps. Pandas provides a Period object that represents a time span (such as a day, month, or quarter) rather than a specific moment. Each period is defined by its frequency, which determines its length.
Understanding period frequencies is essential for:
- Working with financial data (fiscal quarters, years)
- Analyzing seasonal patterns
- Creating time-based groupings
- Resampling time series data into regular intervals
Period Basics
A pandas Period represents a span of time whose length is determined by its frequency. Unlike a timestamp, which marks a single moment, a period covers an entire interval.
import pandas as pd
# Creating a period with monthly frequency
monthly_period = pd.Period('2023-01', freq='M')
print(monthly_period)
# Creating a period with quarterly frequency
quarterly_period = pd.Period('2023Q1', freq='Q')
print(quarterly_period)
# Creating a period with yearly frequency ('Y'; the older alias 'A' is deprecated since pandas 2.2)
yearly_period = pd.Period('2023', freq='Y')
print(yearly_period)
Output:
2023-01
2023Q1
2023
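To make the "span versus moment" distinction concrete, here is a minimal sketch that checks whether a timestamp falls inside a period. The variable names (`january`, `moment`) are illustrative only.

```python
import pandas as pd

# A monthly period covers every instant from its start to its end
january = pd.Period('2023-01', freq='M')
moment = pd.Timestamp('2023-01-15 09:30')

# start_time and end_time are the timestamps that bound the span
inside = january.start_time <= moment <= january.end_time
print(inside)

# Going the other way: find the monthly period that contains the timestamp
containing = moment.to_period('M')
print(containing == january)
```

Both checks print True, because 15 January lies within the January 2023 period.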
Common Frequency Aliases
Pandas uses frequency aliases to define the length of periods. Here are the most commonly used ones:
Alias | Description | Example |
---|---|---|
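D | Calendar day | pd.Period('2023-01-01', freq='D') |
W | Weekly (ending on Sunday by default) | pd.Period('2023-01-01', freq='W') |
M | Monthly | pd.Period('2023-01', freq='M') |
Q | Quarterly (year ending in December by default) | pd.Period('2023Q1', freq='Q') |
Y (A is deprecated since pandas 2.2) | Yearly (ending in December by default) | pd.Period('2023', freq='Y') |
h (H is deprecated since pandas 2.2) | Hourly | pd.Period('2023-01-01 12', freq='h') |
min (T is deprecated since pandas 2.2) | Minute | pd.Period('2023-01-01 12:00', freq='min') |
s (S is deprecated since pandas 2.2) | Second | pd.Period('2023-01-01 12:00:00', freq='s') |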
Let's see them in action:
import pandas as pd
# Different frequency periods
periods = {
    'Daily': pd.Period('2023-01-15', freq='D'),
    'Weekly': pd.Period('2023-01-15', freq='W'),
    'Monthly': pd.Period('2023-01', freq='M'),
    'Quarterly': pd.Period('2023Q1', freq='Q'),
    'Yearly': pd.Period('2023', freq='Y'),
    'Hourly': pd.Period('2023-01-15 10', freq='h'),
}
# Print each period together with the timestamps that bound it
for name, period in periods.items():
    print(f"{name}: {period}, Start: {period.start_time}, End: {period.end_time}")
Output:
Daily: 2023-01-15, Start: 2023-01-15 00:00:00, End: 2023-01-15 23:59:59.999999999
Weekly: 2023-01-09/2023-01-15, Start: 2023-01-09 00:00:00, End: 2023-01-15 23:59:59.999999999
Monthly: 2023-01, Start: 2023-01-01 00:00:00, End: 2023-01-31 23:59:59.999999999
Quarterly: 2023Q1, Start: 2023-01-01 00:00:00, End: 2023-03-31 23:59:59.999999999
Yearly: 2023, Start: 2023-01-01 00:00:00, End: 2023-12-31 23:59:59.999999999
Hourly: 2023-01-15 10:00, Start: 2023-01-15 10:00:00, End: 2023-01-15 10:59:59.999999999
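Some aliases carry an implicit anchor. The sketch below inspects `freqstr` to show the defaults pandas applies to the plain 'W' and 'Q' aliases; it assumes nothing beyond the aliases in the table above.

```python
import pandas as pd

week = pd.Period('2023-01-15', freq='W')
quarter = pd.Period('2023Q1', freq='Q')

# The full frequency string includes the default anchor
print(week.freqstr)     # weekly periods end on Sunday by default
print(quarter.freqstr)  # quarters belong to a year ending in December by default

# 2023-01-15 is a Sunday, so it is the last day of its weekly period
print(week.start_time.day_name(), "to", week.end_time.day_name())
```

This is why a weekly period created from a Sunday date starts on the preceding Monday rather than on the date itself.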
Creating PeriodIndex
A PeriodIndex is a sequence of periods that all share the same frequency. It is useful for indexing pandas Series and DataFrame objects:
import pandas as pd
import numpy as np
# Creating a PeriodIndex for monthly data
monthly_index = pd.period_range(start='2023-01', end='2023-12', freq='M')
monthly_data = pd.Series(np.random.randn(12), index=monthly_index)
print(monthly_data.head())
# Creating a PeriodIndex for quarterly data
quarterly_index = pd.period_range(start='2020Q1', periods=12, freq='Q')
quarterly_data = pd.Series(np.random.randn(12), index=quarterly_index)
print(quarterly_data.head())
Output:
2023-01 0.120915
2023-02 -0.326235
2023-03 0.583584
2023-04 0.154336
2023-05 0.546041
Freq: M, dtype: float64
2020Q1 0.255367
2020Q2 0.212217
2020Q3 -1.391039
2020Q4 0.391371
2021Q1 -0.711879
Freq: Q-DEC, dtype: float64
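Once data is indexed by periods, you can select and group by period labels. The following is a small sketch with deterministic values (1 through 12) instead of random data, so the results are easy to verify by hand.

```python
import pandas as pd

index = pd.period_range(start='2023-01', end='2023-12', freq='M')
data = pd.Series(range(1, 13), index=index)

# Look up a single month by its period label
march = data.loc['2023-03']
print(march)

# Slicing by period labels includes both endpoints
second_quarter = data.loc['2023-04':'2023-06']
print(second_quarter)

# A PeriodIndex exposes calendar fields such as quarter for grouping
by_quarter = data.groupby(data.index.quarter).sum()
print(by_quarter)
```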
Period Arithmetic
Periods support arithmetic operations, allowing you to easily navigate through time:
import pandas as pd
period = pd.Period('2023-03', freq='M')
print(f"Current period: {period}")
print(f"Next period: {period + 1}")
print(f"Previous period: {period - 1}")
print(f"Three periods later: {period + 3}")
print(f"One year later: {period + 12}")
Output:
Current period: 2023-03
Next period: 2023-04
Previous period: 2023-02
Three periods later: 2023-06
One year later: 2024-03
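Arithmetic also works between two periods and on a whole PeriodIndex. This sketch assumes both periods share the same frequency; subtracting periods with different frequencies raises an error.

```python
import pandas as pd

start = pd.Period('2023-03', freq='M')
end = pd.Period('2024-01', freq='M')

# Subtracting two periods yields an offset object;
# its .n attribute is the number of periods between them
months_apart = (end - start).n
print(months_apart)

# Adding an integer to a PeriodIndex shifts every element
index = pd.period_range('2023-01', periods=3, freq='M')
shifted = index + 1
print(list(shifted.astype(str)))
```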
Frequency Conversion
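You can convert periods from one frequency to another using the asfreq() method. When converting to a finer frequency, the second argument ('start' or 'end') selects which sub-period to return: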
import pandas as pd
# Monthly period
month_period = pd.Period('2023-03', freq='M')
# Convert to different frequencies
print(f"Monthly: {month_period}")
print(f"Daily (month start): {month_period.asfreq('D', 'start')}")
print(f"Daily (month end): {month_period.asfreq('D', 'end')}")
print(f"Quarterly (containing this month): {month_period.asfreq('Q')}")
print(f"Yearly (containing this month): {month_period.asfreq('Y')}")
Output:
Monthly: 2023-03
Daily (month start): 2023-03-01
Daily (month end): 2023-03-31
Quarterly (containing this month): 2023Q1
Yearly (containing this month): 2023
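The 'start' and 'end' conversions can be combined, for example to count the days in a month, and asfreq() also works on a whole PeriodIndex. A minimal sketch:

```python
import pandas as pd

month = pd.Period('2023-02', freq='M')

# Convert to the first and last daily period of the month
first_day = month.asfreq('D', how='start')
last_day = month.asfreq('D', how='end')

# The difference in daily periods, plus one, is the length of the month
days = (last_day - first_day).n + 1
print(days)

# Convert quarterly labels to the last month of each quarter
quarters = pd.period_range('2023Q1', periods=2, freq='Q')
last_months = quarters.asfreq('M', how='end')
print(list(last_months.astype(str)))
```

For this specific calculation, the built-in `month.days_in_month` attribute gives the same number directly.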
Advanced Period Frequencies
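You can create more specific frequencies by adding an anchor suffix to an alias. Note that offset aliases such as MS, BM, QS, and AS describe timestamp offsets for functions like pd.date_range and are not valid period frequencies: a period already covers its whole span, so there is no separate "start" or "end" variant.
import pandas as pd
# Business day frequency (emits a deprecation warning for periods in pandas 2.2 and later)
business_daily = pd.Period('2023-01-02', freq='B')
print(f"Business day: {business_daily}")
# Week anchored to end on Saturday instead of the default Sunday
week_sat = pd.Period('2023-01-02', freq='W-SAT')
print(f"Week ending Saturday: {week_sat}")
# Fiscal quarter for a fiscal year that ends in March
fiscal_quarter = pd.Period('2023Q1', freq='Q-MAR')
print(f"Fiscal quarter: {fiscal_quarter}")
print(f"Fiscal quarter start: {fiscal_quarter.start_time}")
print(f"Fiscal quarter end: {fiscal_quarter.end_time}")
Output:
Business day: 2023-01-02
Week ending Saturday: 2023-01-01/2023-01-07
Fiscal quarter: 2023Q1
Fiscal quarter start: 2022-04-01 00:00:00
Fiscal quarter end: 2022-06-30 23:59:59.999999999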
Practical Example: Sales Analysis with Period Frequency
Let's analyze some monthly sales data using period frequency:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Sample monthly sales data for 3 years
np.random.seed(42)
monthly_index = pd.period_range(start='2020-01', end='2022-12', freq='M')
monthly_sales = pd.Series(
np.random.normal(loc=10000, scale=2000, size=len(monthly_index)),
index=monthly_index,
name='Sales'
)
# Create the DataFrame
sales_df = pd.DataFrame(monthly_sales)
print("Monthly sales data sample:")
print(sales_df.head())
# Resample to quarterly frequency
quarterly_sales = sales_df.resample('Q').sum()
print("\nQuarterly sales data:")
print(quarterly_sales.head())
# Calculate yearly totals
yearly_sales = sales_df.resample('Y').sum()
print("\nYearly sales data:")
print(yearly_sales)
# Plot the data
plt.figure(figsize=(12, 6))
quarterly_sales.plot(kind='bar')
plt.title('Quarterly Sales')
plt.ylabel('Sales Amount')
plt.xlabel('Quarter')
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
# In a real environment, you would use plt.show()
Output (illustrative values; your numbers may differ):
Monthly sales data sample:
Sales
2020-01 7529.950753
2020-02 9597.747461
2020-03 8632.066318
2020-04 9146.374370
2020-05 9340.896655
Quarterly sales data:
Sales
2020Q1 25759.8
2020Q2 28413.3
2020Q3 31015.7
2020Q4 29842.7
2021Q1 28514.9
Yearly sales data:
Sales
2020 115031.4
2021 119321.4
2022 120382.9
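The same aggregation can be expressed without resample() by converting the monthly labels to quarterly ones and grouping on them. This is an alternative sketch, not the approach used in the example above, and it uses small fixed numbers so the totals can be checked by hand.

```python
import pandas as pd

monthly_index = pd.period_range(start='2022-01', end='2022-12', freq='M')
sales = pd.Series(range(100, 1300, 100), index=monthly_index, name='Sales')

# Map each monthly label to the quarter that contains it
quarter_labels = sales.index.asfreq('Q')

# Group on the quarterly labels and sum the months in each quarter
quarterly_totals = sales.groupby(quarter_labels).sum()
print(quarterly_totals)
```

Grouping on converted labels keeps the result indexed by quarterly periods, just like the resampled version.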
Converting Between Timestamps and Periods
You can convert between timestamps and periods for different analyses:
import pandas as pd
# Creating a DatetimeIndex
date_index = pd.date_range(start='2023-01-01', periods=5, freq='D')
ts_series = pd.Series(range(5), index=date_index)
print("Original timestamp series:")
print(ts_series)
# Convert to PeriodIndex
period_series = ts_series.to_period('D')
print("\nConverted to daily periods:")
print(period_series)
# Convert back to timestamps (at the start of each period)
ts_series_again = period_series.to_timestamp()
print("\nConverted back to timestamps:")
print(ts_series_again)
Output:
Original timestamp series:
2023-01-01 0
2023-01-02 1
2023-01-03 2
2023-01-04 3
2023-01-05 4
Freq: D, dtype: int64
Converted to daily periods:
2023-01-01 0
2023-01-02 1
2023-01-03 2
2023-01-04 3
2023-01-05 4
Freq: D, dtype: int64
Converted back to timestamps:
2023-01-01 0
2023-01-02 1
2023-01-03 2
2023-01-04 3
2023-01-05 4
Freq: D, dtype: int64
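Conversion is most useful when the target frequency is coarser than the data. The sketch below, with made-up readings, maps irregular timestamps to monthly periods for aggregation, then converts the result back to timestamps at the end of each period.

```python
import pandas as pd

stamps = pd.to_datetime(['2023-01-05 14:30', '2023-01-20 09:00', '2023-02-03 18:45'])
readings = pd.Series([10.0, 20.0, 40.0], index=stamps)

# Each timestamp is mapped to the monthly period that contains it
monthly_mean = readings.groupby(readings.index.to_period('M')).mean()
print(monthly_mean)

# how='end' places each label at the last instant of its period
month_ends = monthly_mean.to_timestamp(how='end')
print(month_ends.index[0])
```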
Custom Business Day Frequencies
For financial analysis, pandas provides custom business day offsets that skip weekends and holidays. Period frequencies do not carry a holiday calendar, so this example builds a DatetimeIndex with pd.date_range rather than a PeriodIndex:
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from pandas.tseries.holiday import USFederalHolidayCalendar
# Define a custom business day frequency excluding US holidays
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
# Create a range of business days (timestamps, not periods)
business_days = pd.date_range(
    start='2023-12-20',
    end='2024-01-10',
    freq=us_bd
)
# Create a Series with these business days
business_series = pd.Series(range(len(business_days)), index=business_days)
print(business_series)
Output:
2023-12-20 0
2023-12-21 1
2023-12-22 2
2023-12-26 3
2023-12-27 4
2023-12-28 5
2023-12-29 6
2024-01-02 7
2024-01-03 8
2024-01-04 9
2024-01-05 10
2024-01-08 11
2024-01-09 12
2024-01-10 13
Freq: C, dtype: int64
Notice that weekends and holidays (like Christmas and New Year's Day) are excluded from the index.
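Business-day timestamps can still be summarized by period. As a sketch, the following builds the business days as a DatetimeIndex with pd.date_range and counts how many fall in each monthly period.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Business days that skip weekends and US federal holidays
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
trading_days = pd.date_range(start='2023-12-20', end='2024-01-10', freq=us_bd)

# Map each business day to its monthly period and count the days per month
ones = pd.Series(1, index=trading_days)
days_per_month = ones.groupby(trading_days.to_period('M')).sum()
print(days_per_month)
```

Christmas Day and New Year's Day are absent from `trading_days`, so neither contributes to the monthly counts.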
Summary
In this lesson, we explored pandas Period Frequency, a powerful feature for working with time spans in time series data:
- Periods represent time spans rather than specific moments
- Frequency aliases define the length of periods (D, W, M, Q, Y, etc.)
- PeriodIndex provides a way to index data with periods
- Period arithmetic enables navigating through time easily
- Frequency conversion allows switching between different time spans
- Advanced features include custom business day frequencies and holiday calendars
Periods are particularly useful when your analysis focuses on entire intervals of time rather than specific timestamps, making them ideal for financial reporting, seasonal analysis, and time-based aggregations.
Additional Resources and Exercises
Exercises
- Create a PeriodIndex of business quarters for the years 2022-2024 and construct a DataFrame with random data.
- Convert a DatetimeIndex of daily timestamps to monthly periods, then calculate the monthly mean.
- Create a custom visualization that shows data aggregated by both monthly and quarterly periods.
- Implement a function that converts daily data to the appropriate business month-end periods, considering holidays.
- Create a period-based time series of temperature data and resample it from daily to monthly averages.