Python Task Scheduling
Task scheduling is a crucial aspect of many Python applications, allowing you to execute code at specified times or intervals. Whether you need to perform regular data backups, send automated reports, or update information periodically, understanding how to schedule tasks in Python will make your applications more powerful and autonomous.
Introduction to Task Scheduling
Task scheduling involves setting up your code to run automatically at predetermined times without manual intervention. Python offers several approaches to implement task scheduling:
- Built-in modules like
time
andthreading
- Specialized libraries like
schedule
andAPScheduler
- System-level integration with
cron
(Unix/Linux) or Task Scheduler (Windows)
In this tutorial, we'll explore these different methods and learn when to use each one.
Basic Scheduling with time
and threading
For simple scheduling needs, Python's built-in modules can be sufficient.
Using time.sleep()
The simplest form of scheduling is using time.sleep()
to pause execution for a specified number of seconds:
import time
def task():
print(f"Task executed at {time.strftime('%H:%M:%S')}")
# Run the task every 5 seconds
while True:
task()
time.sleep(5) # Sleep for 5 seconds
Output:
Task executed at 14:30:05
Task executed at 14:30:10
Task executed at 14:30:15
...
This approach is straightforward but has limitations:
- It blocks the main thread during sleep
- It doesn't account for task execution time
- It's difficult to manage multiple schedules
Using threading.Timer
For non-blocking scheduled tasks, you can use threading.Timer
:
import threading
import time
def scheduled_task():
print(f"Task executed at {time.strftime('%H:%M:%S')}")
# Schedule the next run
threading.Timer(5.0, scheduled_task).start()
# Start the first instance
scheduled_task()
Output:
Task executed at 14:35:00
Task executed at 14:35:05
Task executed at 14:35:10
...
This approach runs the task in a separate thread, keeping your main program responsive.
Using the schedule
Library
The schedule
library provides a clean, intuitive interface for scheduling tasks:
# First install: pip install schedule
import schedule
import time
def job():
print(f"Scheduled job running at {time.strftime('%H:%M:%S')}")
# Schedule job to run every 10 minutes
schedule.every(10).minutes.do(job)
# Schedule job to run every hour
schedule.every().hour.do(job)
# Schedule job to run at specific time every day
schedule.every().day.at("10:30").do(job)
# Schedule job to run every Monday
schedule.every().monday.do(job)
print("Scheduler started. Press Ctrl+C to exit.")
# Keep the program running to execute scheduled tasks
while True:
schedule.run_pending()
time.sleep(1)
Output:
Scheduler started. Press Ctrl+C to exit.
Scheduled job running at 14:40:00
Scheduled job running at 14:50:00
Scheduled job running at 15:00:00
...
The schedule
library is perfect for beginners due to its readable syntax and straightforward implementation.
Advanced Scheduling with APScheduler
For more complex scheduling needs, the Advanced Python Scheduler (APScheduler) library offers robust features:
# First install: pip install apscheduler
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.interval import IntervalTrigger
from apscheduler.triggers.cron import CronTrigger
import datetime
def timed_job():
print(f"Timed job executed at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
def interval_job():
print(f"Interval job executed at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
def cron_job():
print(f"Cron job executed at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
# Create a scheduler and start it
scheduler = BackgroundScheduler()
# Schedule a job to run every 2 minutes
scheduler.add_job(interval_job, IntervalTrigger(minutes=2))
# Schedule a job to run at specific times using cron expression
# This runs at 8:30 AM every weekday (Monday through Friday)
scheduler.add_job(
cron_job,
CronTrigger(hour=8, minute=30, day_of_week='mon-fri'),
)
# Schedule a one-time job
scheduler.add_job(
timed_job,
'date',
run_date=datetime.datetime.now() + datetime.timedelta(seconds=10)
)
scheduler.start()
print("Scheduler started. Press Ctrl+C to exit.")
try:
# Keep the main thread alive
while True:
import time
time.sleep(2)
except (KeyboardInterrupt, SystemExit):
scheduler.shutdown()
APScheduler provides several scheduler types:
BackgroundScheduler
: Runs in the background within your Python processBlockingScheduler
: Runs in the foreground, blocking executionAsyncIOScheduler
: Integrates with asyncio applications- Others for integration with specific frameworks
Real-World Examples
Let's explore some practical applications of task scheduling in Python.
Example 1: Automated Data Backup
import schedule
import time
import os
import datetime
import shutil
def backup_database():
# Create a timestamp for the backup file
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
source_dir = "/path/to/database"
backup_dir = f"/path/to/backups/backup_{timestamp}"
print(f"Starting backup at {timestamp}...")
# Create backup using shutil
try:
shutil.copytree(source_dir, backup_dir)
print(f"Backup completed successfully: {backup_dir}")
except Exception as e:
print(f"Backup failed: {e}")
# Schedule backups daily at 2 AM
schedule.every().day.at("02:00").do(backup_database)
print("Backup scheduler started")
while True:
schedule.run_pending()
time.sleep(60) # Check every minute
Example 2: Periodic API Data Collection
from apscheduler.schedulers.background import BackgroundScheduler
import requests
import json
import datetime
import os
def fetch_weather_data():
api_key = "your_api_key" # Replace with your actual API key
city = "New York"
url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}&units=metric"
try:
response = requests.get(url)
data = response.json()
# Format the data
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
weather = {
"timestamp": timestamp,
"temperature": data["main"]["temp"],
"humidity": data["main"]["humidity"],
"description": data["weather"][0]["description"]
}
# Append to a log file
with open("weather_log.json", "a") as f:
f.write(json.dumps(weather) + "\n")
print(f"Weather data collected at {timestamp}: {weather['temperature']}°C, {weather['description']}")
except Exception as e:
print(f"Failed to collect weather data: {e}")
# Initialize the scheduler
scheduler = BackgroundScheduler()
# Collect data every hour
scheduler.add_job(fetch_weather_data, 'interval', hours=1)
# Start the scheduler
scheduler.start()
print("Weather data collector started. Press Ctrl+C to exit.")
try:
# Keep the main thread alive
while True:
import time
time.sleep(60)
except (KeyboardInterrupt, SystemExit):
scheduler.shutdown()
Example 3: Email Report Scheduler
import schedule
import time
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import datetime
import pandas as pd
def generate_report():
# This is where you'd normally query a database or analyze data
# For this example, we'll just create some dummy data
data = {
'date': datetime.datetime.now().strftime("%Y-%m-%d"),
'sales': 1250,
'customers': 47,
'avg_sale': 26.60
}
return data
def send_email_report():
report_data = generate_report()
# Email configuration
sender_email = "[email protected]"
receiver_email = "[email protected]"
password = "your_password" # Consider using environment variables for security
# Create message
message = MIMEMultipart()
message["From"] = sender_email
message["To"] = receiver_email
message["Subject"] = f"Daily Sales Report - {report_data['date']}"
# Email body
body = f"""
<html>
<body>
<h2>Daily Sales Report</h2>
<p>Date: {report_data['date']}</p>
<table border="1">
<tr><th>Metric</th><th>Value</th></tr>
<tr><td>Total Sales</td><td>${report_data['sales']}</td></tr>
<tr><td>Customer Count</td><td>{report_data['customers']}</td></tr>
<tr><td>Average Sale</td><td>${report_data['avg_sale']:.2f}</td></tr>
</table>
</body>
</html>
"""
message.attach(MIMEText(body, "html"))
try:
# Connect to the server
server = smtplib.SMTP("smtp.gmail.com", 587)
server.starttls()
server.login(sender_email, password)
# Send email
text = message.as_string()
server.sendmail(sender_email, receiver_email, text)
server.quit()
print(f"Email report sent successfully at {datetime.datetime.now().strftime('%H:%M:%S')}")
except Exception as e:
print(f"Failed to send email: {e}")
# Schedule the report to be sent every weekday at 8:00 AM
schedule.every().monday.at("08:00").do(send_email_report)
schedule.every().tuesday.at("08:00").do(send_email_report)
schedule.every().wednesday.at("08:00").do(send_email_report)
schedule.every().thursday.at("08:00").do(send_email_report)
schedule.every().friday.at("08:00").do(send_email_report)
print("Email report scheduler started")
while True:
schedule.run_pending()
time.sleep(60) # Check every minute
System-Level Scheduling
For production environments, it's often better to use system-level schedulers.
Using cron (Unix/Linux)
Create a Python script:
# scheduled_task.py
import datetime
import logging
logging.basicConfig(
filename='scheduled_task.log',
level=logging.INFO,
format='%(asctime)s - %(message)s'
)
def main():
now = datetime.datetime.now()
logging.info(f"Task executed successfully at {now}")
print(f"Task executed at {now}")
if __name__ == "__main__":
main()
Then set up a cron job by running crontab -e
and adding:
# Run every day at 3 AM
0 3 * * * /usr/bin/python3 /path/to/scheduled_task.py
Using Task Scheduler (Windows)
On Windows, you can create a batch file to run your Python script:
@echo off
REM scheduled_task.bat
python C:\path\to\scheduled_task.py
Then use Windows Task Scheduler to run this batch file at your desired schedule.
Best Practices for Task Scheduling
-
Error Handling: Always wrap your scheduled functions in try-except blocks to prevent crashes.
-
Logging: Implement comprehensive logging to track execution and troubleshoot issues.
-
Avoid Overlapping Executions: For long-running tasks, ensure one execution finishes before the next starts.
-
Handle Time Zones Carefully: Be explicit about time zones, especially for global applications.
-
Resource Management: Monitor memory usage and connections, especially for long-running schedulers.
-
Persistence: For critical schedules, consider using a persistent job store with APScheduler.
Summary
Python task scheduling enables you to automate repetitive tasks and run code at specific times. We've covered multiple approaches:
- Basic scheduling with
time.sleep()
andthreading.Timer
- The user-friendly
schedule
library for simple needs - Advanced Python Scheduler (APScheduler) for complex requirements
- System-level scheduling using cron and Task Scheduler
Choose your scheduling method based on your application's requirements:
- Use
schedule
for simple, readable scheduling within Python applications - Use APScheduler for complex scheduling patterns or when you need persistence
- Use system-level schedulers for production environments and critical tasks
Exercises
- Create a script that checks a website's status every 5 minutes and logs the result.
- Build a daily file cleanup utility that removes files older than 7 days from a directory.
- Create an application that collects data from a free API at different intervals and stores the results.
- Implement a reminder system that sends notifications at scheduled times.
- Create a task scheduler that adjusts its timing based on system load or other conditions.
Additional Resources
- APScheduler Documentation
- Schedule Library GitHub
- Python's datetime module
- Crontab Guru - A helpful cron expression editor
- Windows Task Scheduler Documentation
By mastering Python task scheduling, you've added a powerful tool to your programming toolkit that enables you to create more autonomous, efficient applications.
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)