Python Error Handling

Introduction

When you're writing Python code, especially for complex applications like those using PyTorch, things won't always go according to plan. Your program might encounter unexpected situations like missing files, network issues, invalid inputs, or memory limitations. Without proper error handling, these issues can cause your program to crash abruptly, leaving users confused and frustrated.

Error handling is a programming technique that anticipates potential problems and deals with them gracefully. In Python, this is primarily done through a mechanism called "exceptions." Rather than letting errors terminate your program, you can catch these exceptions and decide how to respond—whether that's displaying a helpful message, trying an alternative approach, or safely shutting down.

In this tutorial, we'll explore Python's exception handling system and learn how to make our PyTorch applications more robust and user-friendly.

Understanding Exceptions in Python

What are Exceptions?

In Python, exceptions are events that disrupt the normal flow of a program's instructions. When an error occurs during execution, Python creates an exception object. If this exception isn't handled, the program terminates and displays an error message.

Here's a simple example of an exception:

python
# This will cause a ZeroDivisionError
result = 10 / 0
print("This line won't be reached")

Output:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

Common Python Exceptions

Before diving into handling exceptions, let's look at some common exceptions you might encounter:

Exception          | Description
-------------------|--------------------------------------------------------------------------
SyntaxError        | Raised when the parser encounters a syntax error
NameError          | Raised when a local or global name is not found
TypeError          | Raised when an operation is applied to an object of inappropriate type
ValueError         | Raised when a function gets an argument of the correct type but an inappropriate value
IndexError         | Raised when an index is out of range
KeyError           | Raised when a dictionary key is not found
FileNotFoundError  | Raised when a file or directory is requested but doesn't exist
ImportError        | Raised when an import statement fails
ZeroDivisionError  | Raised when division or modulo by zero is encountered

PyTorch also defines its own exceptions, such as torch.cuda.OutOfMemoryError, which is raised when your GPU runs out of memory.
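If you're curious how this exception fits into Python's built-in hierarchy, a quick check is shown below. This is a minimal sketch assuming a recent PyTorch release, where torch.cuda.OutOfMemoryError is a subclass of RuntimeError, so a generic RuntimeError handler also catches GPU out-of-memory failures:

python
import torch

# In recent PyTorch releases, torch.cuda.OutOfMemoryError subclasses RuntimeError,
# so an except RuntimeError handler also catches GPU out-of-memory failures.
print(issubclass(torch.cuda.OutOfMemoryError, RuntimeError))  # Expected: True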

Basic Exception Handling

The try-except Block

The fundamental construct for handling exceptions in Python is the try-except block:

python
try:
    # Code that might cause an exception
    result = 10 / 0
except ZeroDivisionError:
    # Code to handle the specific exception
    print("Cannot divide by zero!")

Output:

Cannot divide by zero!

The program continues running instead of crashing, which is much more user-friendly.

Handling Multiple Exceptions

You can handle different exceptions in different ways:

python
try:
    # This could cause different types of exceptions
    number = int(input("Enter a number: "))
    result = 10 / number
    print(f"Result: {result}")
except ZeroDivisionError:
    print("Cannot divide by zero!")
except ValueError:
    print("You must enter a valid number!")

If the user enters "0":

Enter a number: 0
Cannot divide by zero!

If the user enters "hello":

Enter a number: hello
You must enter a valid number!

Catching Multiple Exceptions with One Handler

You can also handle multiple exceptions with the same code:

python
try:
    # Code that might raise exceptions
    file = open("nonexistent_file.txt", "r")
    content = file.read()
    print(content)
except (FileNotFoundError, PermissionError):
    print("There was a problem accessing the file.")

Output:

There was a problem accessing the file.

The Catch-All Exception Handler

While it's generally better to catch specific exceptions, sometimes you might want to catch any potential exception:

python
try:
    # Some risky code
    x = 1 / 0
except Exception as e:
    print(f"An error occurred: {e}")

Output:

An error occurred: division by zero
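When you catch a broad Exception like this, the message alone is often not enough to debug the problem. Here is a minimal sketch using the standard library's traceback module to keep the full stack trace while still handling the error gracefully:

python
import traceback

try:
    x = 1 / 0
except Exception as e:
    # Show a short message to the user, but keep the full stack trace for debugging
    print(f"An error occurred: {e}")
    traceback.print_exc()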

Advanced Exception Handling

The else Clause

The else clause runs if no exceptions were raised in the try block:

python
try:
    number = int(input("Enter a positive number: "))
    if number <= 0:
        raise ValueError("That's not a positive number!")
except ValueError as err:
    print(f"Error: {err}")
else:
    print(f"You entered {number}, which is a valid positive number.")

If the user enters "5":

Enter a positive number: 5
You entered 5, which is a valid positive number.

If the user enters "-2":

Enter a positive number: -2
Error: That's not a positive number!

The finally Clause

The finally clause runs regardless of whether an exception occurred or not, making it perfect for cleanup operations:

python
try:
    file = open("sample_data.txt", "r")
    content = file.read()
    # Process content...
except FileNotFoundError:
    print("The file was not found.")
finally:
    # This code always runs
    try:
        file.close()
        print("File closed successfully.")
    except NameError:
        # open() failed, so the name 'file' was never assigned
        print("No file to close.")

Raising Exceptions

Sometimes you might want to trigger exceptions manually using the raise statement:

python
def validate_age(age):
    if age < 0:
        raise ValueError("Age cannot be negative")
    if age > 120:
        raise ValueError("Age is too high")
    return True

try:
    validate_age(150)
except ValueError as e:
    print(f"Validation error: {e}")

Output:

Validation error: Age is too high
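You can also re-raise an exception after inspecting it, or wrap it in a more descriptive one. The sketch below uses a hypothetical load_config helper purely for illustration: raise ... from e attaches the original exception as the cause, while a bare raise inside an except block (used in the PyTorch fallback example later) simply re-raises the current exception unchanged.

python
def load_config(path):
    try:
        with open(path, "r") as f:
            return f.read()
    except FileNotFoundError as e:
        # Wrap the low-level error in a more descriptive one and
        # keep the original exception attached as the cause
        raise RuntimeError(f"Could not load configuration from {path}") from e

try:
    load_config("missing_config.yaml")
except RuntimeError as e:
    print(f"{e} (caused by: {e.__cause__})")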

Creating Custom Exceptions

For specialized error handling, you can create your own exception classes:

python
class ModelError(Exception):
    """Exception raised for errors in the ML model."""
    pass

class DataShapeError(ModelError):
    """Exception raised when data doesn't match expected shape."""
    def __init__(self, expected_shape, actual_shape):
        self.expected_shape = expected_shape
        self.actual_shape = actual_shape
        self.message = f"Expected data shape {expected_shape}, got {actual_shape}"
        super().__init__(self.message)

# Using our custom exception
try:
    expected = (3, 224, 224)
    actual = (1, 128, 128)
    if expected != actual:
        raise DataShapeError(expected, actual)
except DataShapeError as e:
    print(f"Model input error: {e}")

Output:

Model input error: Expected data shape (3, 224, 224), got (1, 128, 128)
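Because DataShapeError inherits from ModelError, a handler for the base class also catches the subclass. Here is a minimal sketch reusing the two classes defined above to show how one except clause can cover a whole family of model-related errors:

python
try:
    raise DataShapeError((3, 224, 224), (1, 128, 128))
except ModelError as e:
    # DataShapeError is a ModelError, so this handler catches it too
    print(f"Model error: {e}")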

Error Handling in PyTorch

PyTorch operations can raise various exceptions, especially when you're manipulating tensors, running GPU operations, or training models. Let's look at some examples specific to PyTorch:

Handling CUDA Errors

When working with GPUs, you might encounter out-of-memory errors:

python
import torch

try:
    # Allocate an extremely large tensor directly on the GPU, so the failure
    # is a CUDA out-of-memory error rather than a CPU allocation error
    huge_tensor = torch.ones(1000000, 1000000, device="cuda")
except torch.cuda.OutOfMemoryError:
    print("Not enough GPU memory for this operation!")
    # Maybe try with a smaller tensor or use the CPU instead
    small_tensor = torch.ones(1000, 1000).cpu()
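A common recovery pattern, sketched below with a hypothetical allocate_with_fallback helper, is to release the memory cached by PyTorch's CUDA allocator and fall back to the CPU. This assumes a CUDA-capable GPU is present; on a machine with no GPU at all, the allocation fails with a plain RuntimeError instead.

python
import torch

def allocate_with_fallback(rows, cols):
    """Try to allocate on the GPU; fall back to the CPU if GPU memory runs out."""
    try:
        return torch.ones(rows, cols, device="cuda")
    except torch.cuda.OutOfMemoryError:
        # Release cached blocks held by PyTorch's CUDA allocator, then use the CPU
        torch.cuda.empty_cache()
        return torch.ones(rows, cols)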

Shape Mismatch Errors

A common issue in PyTorch involves tensor shape mismatches:

python
import torch

try:
    # Create tensors with incompatible shapes
    tensor_a = torch.randn(10, 20)
    tensor_b = torch.randn(30, 40)

    # This will raise a RuntimeError due to the shape mismatch
    result = torch.matmul(tensor_a, tensor_b)
except RuntimeError as e:
    print(f"Matrix operation error: {e}")
    print(f"Shape of tensor_a: {tensor_a.shape}")
    print(f"Shape of tensor_b: {tensor_b.shape}")
    print("For matrix multiplication, the inner dimensions must match.")

Output:

Matrix operation error: mat1 and mat2 shapes cannot be multiplied (10x20 and 30x40)
Shape of tensor_a: torch.Size([10, 20])
Shape of tensor_b: torch.Size([30, 40])
For matrix multiplication, the inner dimensions must match.

Graceful Fallbacks

A robust PyTorch application might include fallback mechanisms:

python
import torch

def train_model(use_gpu=True):
    try:
        if use_gpu and torch.cuda.is_available():
            device = torch.device("cuda")
            print("Using GPU for training")
        else:
            device = torch.device("cpu")
            print("Using CPU for training")

        # Create a simple model and move it to the selected device
        model = torch.nn.Linear(10, 1).to(device)

        # Generate some dummy data
        inputs = torch.randn(100, 10).to(device)
        targets = torch.randn(100, 1).to(device)

        # Define optimizer
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

        # Training loop
        for epoch in range(5):
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = torch.nn.functional.mse_loss(outputs, targets)
            loss.backward()
            optimizer.step()
            print(f"Epoch {epoch+1}/5, Loss: {loss.item():.4f}")

        return model

    except RuntimeError as e:
        if "CUDA" in str(e):
            print(f"GPU error: {e}")
            print("Falling back to CPU...")
            return train_model(use_gpu=False)
        else:
            raise  # Re-raise if it's not a CUDA error

model = train_model()

Best Practices for Error Handling

Here are some guidelines to make your error handling more effective:

  1. Be specific: Catch specific exceptions rather than using bare except clauses
  2. Don't silence exceptions: Avoid empty except blocks that hide errors
  3. Log errors: In production code, log exceptions with context for debugging
  4. Clean up resources: Use try-finally or context managers (e.g., with statements)
  5. Provide helpful error messages: Make error messages informative for users
  6. Fail early: Validate inputs at the beginning of functions
  7. Don't use exceptions for flow control: Exceptions are for exceptional situations

Example of Good Error Handling

Here's a more complete example demonstrating good error handling practices:

python
import torch
import logging
from typing import Optional

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

def load_model(model_path: str) -> Optional[torch.nn.Module]:
    """
    Load a PyTorch model from a file with robust error handling.

    Args:
        model_path: Path to the saved model

    Returns:
        The loaded model or None if loading failed
    """
    try:
        # Check if the file exists
        import os
        if not os.path.exists(model_path):
            raise FileNotFoundError(f"Model file not found at {model_path}")

        # Determine device
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        logger.info(f"Using device: {device}")

        # Load model
        model = torch.load(model_path, map_location=device)
        logger.info(f"Model successfully loaded from {model_path}")

        return model

    except FileNotFoundError as e:
        logger.error(f"File error: {e}")
        return None
    except RuntimeError as e:
        if "CUDA" in str(e):
            logger.warning(f"CUDA error: {e}")
            logger.info("Attempting to load on CPU instead...")
            return load_model_cpu_only(model_path)
        logger.error(f"Error loading model: {e}")
        return None
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        logger.exception("Stack trace:")
        return None

def load_model_cpu_only(model_path: str) -> Optional[torch.nn.Module]:
    """Fallback function to load model on CPU only."""
    try:
        model = torch.load(model_path, map_location="cpu")
        logger.info("Model loaded successfully on CPU")
        return model
    except Exception as e:
        logger.error(f"Failed to load model on CPU: {e}")
        return None

# Usage example
model = load_model("path/to/model.pth")
if model is not None:
    print("Model loaded successfully, ready for inference")
else:
    print("Failed to load model, please check the logs for details")

Summary

Error handling is a crucial aspect of writing robust Python applications, especially when working with PyTorch for machine learning tasks. In this tutorial, we've covered:

  • The basics of Python exceptions and how they work
  • Using try-except blocks to catch and handle errors
  • Advanced features like else and finally clauses
  • Creating and raising custom exceptions
  • Specific error handling scenarios in PyTorch
  • Best practices for effective error handling

By implementing proper error handling in your PyTorch projects, you can create applications that gracefully handle unexpected situations, provide helpful feedback to users, and ensure resources are properly managed.

Exercises

To practice your error handling skills, try these exercises:

  1. Write a function that loads a dataset and uses error handling to deal with missing files or corrupted data.
  2. Create a custom exception class for a specific error that might occur in your PyTorch model training.
  3. Modify an existing PyTorch training loop to include proper error handling for out-of-memory errors.
  4. Write a function that validates tensor shapes before performing operations and raises appropriate exceptions.
  5. Implement a context manager using __enter__ and __exit__ for resource management in a PyTorch application.
