Skip to main content

Python Task Scheduling

Task scheduling is a crucial aspect of many Python applications, allowing you to execute code at specified times or intervals. Whether you need to perform regular data backups, send automated reports, or update information periodically, understanding how to schedule tasks in Python will make your applications more powerful and autonomous.

Introduction to Task Scheduling

Task scheduling involves setting up your code to run automatically at predetermined times without manual intervention. Python offers several approaches to implement task scheduling:

  1. Built-in modules like time and threading
  2. Specialized libraries like schedule and APScheduler
  3. System-level integration with cron (Unix/Linux) or Task Scheduler (Windows)

In this tutorial, we'll explore these different methods and learn when to use each one.

Basic Scheduling with time and threading

For simple scheduling needs, Python's built-in modules can be sufficient.

Using time.sleep()

The simplest form of scheduling is using time.sleep() to pause execution for a specified number of seconds:

python
import time

def task():
print(f"Task executed at {time.strftime('%H:%M:%S')}")

# Run the task every 5 seconds
while True:
task()
time.sleep(5) # Sleep for 5 seconds

Output:

Task executed at 14:30:05
Task executed at 14:30:10
Task executed at 14:30:15
...

This approach is straightforward but has limitations:

  • It blocks the main thread during sleep
  • It doesn't account for task execution time
  • It's difficult to manage multiple schedules

Using threading.Timer

For non-blocking scheduled tasks, you can use threading.Timer:

python
import threading
import time

def scheduled_task():
print(f"Task executed at {time.strftime('%H:%M:%S')}")
# Schedule the next run
threading.Timer(5.0, scheduled_task).start()

# Start the first instance
scheduled_task()

Output:

Task executed at 14:35:00
Task executed at 14:35:05
Task executed at 14:35:10
...

This approach runs the task in a separate thread, keeping your main program responsive.

Using the schedule Library

The schedule library provides a clean, intuitive interface for scheduling tasks:

python
# First install: pip install schedule
import schedule
import time

def job():
print(f"Scheduled job running at {time.strftime('%H:%M:%S')}")

# Schedule job to run every 10 minutes
schedule.every(10).minutes.do(job)

# Schedule job to run every hour
schedule.every().hour.do(job)

# Schedule job to run at specific time every day
schedule.every().day.at("10:30").do(job)

# Schedule job to run every Monday
schedule.every().monday.do(job)

print("Scheduler started. Press Ctrl+C to exit.")

# Keep the program running to execute scheduled tasks
while True:
schedule.run_pending()
time.sleep(1)

Output:

Scheduler started. Press Ctrl+C to exit.
Scheduled job running at 14:40:00
Scheduled job running at 14:50:00
Scheduled job running at 15:00:00
...

The schedule library is perfect for beginners due to its readable syntax and straightforward implementation.

Advanced Scheduling with APScheduler

For more complex scheduling needs, the Advanced Python Scheduler (APScheduler) library offers robust features:

python
# First install: pip install apscheduler
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.interval import IntervalTrigger
from apscheduler.triggers.cron import CronTrigger
import datetime

def timed_job():
print(f"Timed job executed at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

def interval_job():
print(f"Interval job executed at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

def cron_job():
print(f"Cron job executed at {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

# Create a scheduler and start it
scheduler = BackgroundScheduler()

# Schedule a job to run every 2 minutes
scheduler.add_job(interval_job, IntervalTrigger(minutes=2))

# Schedule a job to run at specific times using cron expression
# This runs at 8:30 AM every weekday (Monday through Friday)
scheduler.add_job(
cron_job,
CronTrigger(hour=8, minute=30, day_of_week='mon-fri'),
)

# Schedule a one-time job
scheduler.add_job(
timed_job,
'date',
run_date=datetime.datetime.now() + datetime.timedelta(seconds=10)
)

scheduler.start()

print("Scheduler started. Press Ctrl+C to exit.")
try:
# Keep the main thread alive
while True:
import time
time.sleep(2)
except (KeyboardInterrupt, SystemExit):
scheduler.shutdown()

APScheduler provides several scheduler types:

  • BackgroundScheduler: Runs in the background within your Python process
  • BlockingScheduler: Runs in the foreground, blocking execution
  • AsyncIOScheduler: Integrates with asyncio applications
  • Others for integration with specific frameworks

Real-World Examples

Let's explore some practical applications of task scheduling in Python.

Example 1: Automated Data Backup

python
import schedule
import time
import os
import datetime
import shutil

def backup_database():
# Create a timestamp for the backup file
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
source_dir = "/path/to/database"
backup_dir = f"/path/to/backups/backup_{timestamp}"

print(f"Starting backup at {timestamp}...")

# Create backup using shutil
try:
shutil.copytree(source_dir, backup_dir)
print(f"Backup completed successfully: {backup_dir}")
except Exception as e:
print(f"Backup failed: {e}")

# Schedule backups daily at 2 AM
schedule.every().day.at("02:00").do(backup_database)

print("Backup scheduler started")
while True:
schedule.run_pending()
time.sleep(60) # Check every minute

Example 2: Periodic API Data Collection

python
from apscheduler.schedulers.background import BackgroundScheduler
import requests
import json
import datetime
import os

def fetch_weather_data():
api_key = "your_api_key" # Replace with your actual API key
city = "New York"
url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}&units=metric"

try:
response = requests.get(url)
data = response.json()

# Format the data
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
weather = {
"timestamp": timestamp,
"temperature": data["main"]["temp"],
"humidity": data["main"]["humidity"],
"description": data["weather"][0]["description"]
}

# Append to a log file
with open("weather_log.json", "a") as f:
f.write(json.dumps(weather) + "\n")

print(f"Weather data collected at {timestamp}: {weather['temperature']}°C, {weather['description']}")
except Exception as e:
print(f"Failed to collect weather data: {e}")

# Initialize the scheduler
scheduler = BackgroundScheduler()

# Collect data every hour
scheduler.add_job(fetch_weather_data, 'interval', hours=1)

# Start the scheduler
scheduler.start()

print("Weather data collector started. Press Ctrl+C to exit.")
try:
# Keep the main thread alive
while True:
import time
time.sleep(60)
except (KeyboardInterrupt, SystemExit):
scheduler.shutdown()

Example 3: Email Report Scheduler

python
import schedule
import time
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import datetime
import pandas as pd

def generate_report():
# This is where you'd normally query a database or analyze data
# For this example, we'll just create some dummy data
data = {
'date': datetime.datetime.now().strftime("%Y-%m-%d"),
'sales': 1250,
'customers': 47,
'avg_sale': 26.60
}
return data

def send_email_report():
report_data = generate_report()

# Email configuration
sender_email = "[email protected]"
receiver_email = "[email protected]"
password = "your_password" # Consider using environment variables for security

# Create message
message = MIMEMultipart()
message["From"] = sender_email
message["To"] = receiver_email
message["Subject"] = f"Daily Sales Report - {report_data['date']}"

# Email body
body = f"""
<html>
<body>
<h2>Daily Sales Report</h2>
<p>Date: {report_data['date']}</p>
<table border="1">
<tr><th>Metric</th><th>Value</th></tr>
<tr><td>Total Sales</td><td>${report_data['sales']}</td></tr>
<tr><td>Customer Count</td><td>{report_data['customers']}</td></tr>
<tr><td>Average Sale</td><td>${report_data['avg_sale']:.2f}</td></tr>
</table>
</body>
</html>
"""

message.attach(MIMEText(body, "html"))

try:
# Connect to the server
server = smtplib.SMTP("smtp.gmail.com", 587)
server.starttls()
server.login(sender_email, password)

# Send email
text = message.as_string()
server.sendmail(sender_email, receiver_email, text)
server.quit()
print(f"Email report sent successfully at {datetime.datetime.now().strftime('%H:%M:%S')}")
except Exception as e:
print(f"Failed to send email: {e}")

# Schedule the report to be sent every weekday at 8:00 AM
schedule.every().monday.at("08:00").do(send_email_report)
schedule.every().tuesday.at("08:00").do(send_email_report)
schedule.every().wednesday.at("08:00").do(send_email_report)
schedule.every().thursday.at("08:00").do(send_email_report)
schedule.every().friday.at("08:00").do(send_email_report)

print("Email report scheduler started")
while True:
schedule.run_pending()
time.sleep(60) # Check every minute

System-Level Scheduling

For production environments, it's often better to use system-level schedulers.

Using cron (Unix/Linux)

Create a Python script:

python
# scheduled_task.py
import datetime
import logging

logging.basicConfig(
filename='scheduled_task.log',
level=logging.INFO,
format='%(asctime)s - %(message)s'
)

def main():
now = datetime.datetime.now()
logging.info(f"Task executed successfully at {now}")
print(f"Task executed at {now}")

if __name__ == "__main__":
main()

Then set up a cron job by running crontab -e and adding:

# Run every day at 3 AM
0 3 * * * /usr/bin/python3 /path/to/scheduled_task.py

Using Task Scheduler (Windows)

On Windows, you can create a batch file to run your Python script:

batch
@echo off
REM scheduled_task.bat
python C:\path\to\scheduled_task.py

Then use Windows Task Scheduler to run this batch file at your desired schedule.

Best Practices for Task Scheduling

  1. Error Handling: Always wrap your scheduled functions in try-except blocks to prevent crashes.

  2. Logging: Implement comprehensive logging to track execution and troubleshoot issues.

  3. Avoid Overlapping Executions: For long-running tasks, ensure one execution finishes before the next starts.

  4. Handle Time Zones Carefully: Be explicit about time zones, especially for global applications.

  5. Resource Management: Monitor memory usage and connections, especially for long-running schedulers.

  6. Persistence: For critical schedules, consider using a persistent job store with APScheduler.

Summary

Python task scheduling enables you to automate repetitive tasks and run code at specific times. We've covered multiple approaches:

  • Basic scheduling with time.sleep() and threading.Timer
  • The user-friendly schedule library for simple needs
  • Advanced Python Scheduler (APScheduler) for complex requirements
  • System-level scheduling using cron and Task Scheduler

Choose your scheduling method based on your application's requirements:

  • Use schedule for simple, readable scheduling within Python applications
  • Use APScheduler for complex scheduling patterns or when you need persistence
  • Use system-level schedulers for production environments and critical tasks

Exercises

  1. Create a script that checks a website's status every 5 minutes and logs the result.
  2. Build a daily file cleanup utility that removes files older than 7 days from a directory.
  3. Create an application that collects data from a free API at different intervals and stores the results.
  4. Implement a reminder system that sends notifications at scheduled times.
  5. Create a task scheduler that adjusts its timing based on system load or other conditions.

Additional Resources

By mastering Python task scheduling, you've added a powerful tool to your programming toolkit that enables you to create more autonomous, efficient applications.



If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)