Python Documentation
Introduction
Good documentation is a crucial aspect of writing high-quality Python code. Well-documented code is easier to maintain, share with others, and return to after time away. In this guide, we'll explore Python's documentation best practices, covering everything from inline comments to formal documentation using tools like Sphinx.
Documentation serves multiple purposes:
- Helping others understand your code
- Reminding your future self how your code works
- Making collaboration easier
- Enabling automatic generation of reference materials
- Supporting maintainability over time
Let's dive into how you can document your Python code effectively!
Basic Documentation Tools in Python
Comments
The most basic form of documentation in Python is the comment. Comments start with the # symbol and are ignored by the Python interpreter.
# This is a simple comment
x = 10  # This comment is at the end of a line
# Multiple line comments
# can be written
# like this
Best practices for comments:
- Use comments to explain "why" rather than "what"
- Keep comments up-to-date with code changes
- Don't state the obvious
- Use clear and concise language
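To make the "why" versus "what" distinction concrete, here is a small sketch. The function name apply_discount and the checkout scenario are hypothetical, chosen only for illustration.

```python
def apply_discount(price, rate):
    """Return the price after applying a fractional discount rate."""
    # Weak comment (restates the code): multiply price by one minus rate.
    discounted = price * (1 - rate)

    # Stronger comment (explains the reason): round to cents here so the
    # total shown at checkout matches the amount printed on the invoice.
    return round(discounted, 2)


print(apply_discount(100, 0.15))
```

The first comment can be deleted without losing anything, while the second records a decision that the code alone cannot express.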
Docstrings
Docstrings are string literals that appear as the first statement in a function, class, or module. Unlike comments, docstrings are retained at runtime and accessible via the __doc__ attribute.
Here's a basic function with a docstring:
def calculate_area(length, width):
    """Calculate the area of a rectangle.

    Args:
        length (float): The length of the rectangle.
        width (float): The width of the rectangle.

    Returns:
        float: The area of the rectangle.
    """
    return length * width

# Accessing the docstring
print(calculate_area.__doc__)
Output:
Calculate the area of a rectangle.

Args:
    length (float): The length of the rectangle.
    width (float): The width of the rectangle.

Returns:
    float: The area of the rectangle.
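The raw __doc__ attribute is not always the most convenient form to work with: before Python 3.13 it kept the indentation of the source file, and it can carry a trailing blank line. As a minimal sketch, the standard library's inspect.getdoc returns a cleaned-up version regardless of interpreter version. The function below repeats the calculate_area example so the snippet runs on its own.

```python
import inspect


def calculate_area(length, width):
    """Calculate the area of a rectangle.

    Args:
        length (float): The length of the rectangle.
        width (float): The width of the rectangle.

    Returns:
        float: The area of the rectangle.
    """
    return length * width


raw = calculate_area.__doc__                 # exactly what the interpreter stored
cleaned = inspect.getdoc(calculate_area)     # common indentation and outer blank lines removed

print(cleaned)
```

Tools that display documentation, such as help(), rely on this kind of cleaning, which is why docstrings can be indented naturally alongside the code.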
Docstring Formats
There are several conventional styles for formatting docstrings in Python:
Google Style
def connect_to_database(host, user, password):
    """Connects to the specified database.

    Args:
        host (str): The database host address.
        user (str): The username for authentication.
        password (str): The password for authentication.

    Returns:
        Connection: A database connection object.

    Raises:
        ConnectionError: If connection to the database fails.

    Examples:
        >>> conn = connect_to_database('localhost', 'user', 'pass123')
        >>> conn.is_connected()
        True
    """
    # Function implementation here
    pass
reStructuredText (reST) Style
def divide(a, b):
    """Divide two numbers.

    :param a: The dividend
    :type a: int or float
    :param b: The divisor
    :type b: int or float, nonzero
    :returns: The quotient of a and b
    :rtype: float
    :raises ZeroDivisionError: If b is zero

    .. note:: This function handles integers and floats only.
    """
    return a / b
NumPy Style
def calculate_statistics(data):
    """
    Calculate basic statistics for a dataset.

    Parameters
    ----------
    data : array_like
        Input data, should be 1-D array of numeric values.

    Returns
    -------
    dict
        Dictionary containing:

        'mean' : float
            The arithmetic mean.
        'median' : float
            The median value.
        'std' : float
            Standard deviation.

    Examples
    --------
    >>> stats = calculate_statistics([1, 2, 3, 4, 5])
    >>> stats['mean']
    3.0
    """
    # Function implementation here
    pass
Documenting Modules and Packages
Module-level documentation belongs in a docstring at the top of the file. For a package, put it at the top of the package's __init__.py:
"""
Data Processing Module

This module provides utilities for processing CSV data files,
including reading, filtering, and transforming operations.

Examples:
    >>> from data_processing import read_csv, filter_by_date
    >>> data = read_csv('input.csv')
    >>> filtered_data = filter_by_date(data, '2023-01-01')
"""
import pandas as pd

def read_csv(filename):
    """Read a CSV file into a pandas DataFrame."""
    return pd.read_csv(filename)

# More functions below...
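Module docstrings can be read back the same way as function docstrings. The sketch below uses the standard library's json module purely as a stand-in for any importable module; the variable names are illustrative.

```python
import inspect
import json  # stand-in: any importable module with a docstring works

# The docstring at the top of a module's file becomes its __doc__ attribute.
module_doc = inspect.getdoc(json)

# By convention the first line is a short summary of the whole module.
first_line = module_doc.splitlines()[0]
print(first_line)
```

Inside the module itself, the same text is available through the global name __doc__.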
Documentation Tools
Help Function
Python's built-in help() function can display docstrings for modules, functions, classes, and methods:
def greet(name):
    """Return a greeting message for the given name."""
    return f"Hello, {name}!"

# Use help() to see the docstring
help(greet)
Output:
Help on function greet in module __main__:

greet(name)
    Return a greeting message for the given name.
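help() prints its output (usually through a pager), which is awkward when the text is needed as a string, for example in a script or a test. One possible approach, sketched here, is the pydoc module that help() is built on. The greet function is repeated so the snippet is self-contained.

```python
import pydoc


def greet(name):
    """Return a greeting message for the given name."""
    return f"Hello, {name}!"


# render_doc builds the same kind of text that help() shows, but returns it
# as a string. The plaintext renderer avoids terminal bold-face sequences.
help_text = pydoc.render_doc(greet, renderer=pydoc.plaintext)
print(help_text)
```

The title line produced by render_doc differs slightly from the one help() prints, but the signature and docstring sections are the same.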
Sphinx Documentation Generator
Sphinx is a powerful tool for creating formal documentation from Python docstrings. It can generate HTML, PDF, and other formats.
To get started with Sphinx:
- Install Sphinx:
    pip install sphinx sphinx-rtd-theme
- Create a docs directory and initialize Sphinx:
    mkdir docs
    cd docs
    sphinx-quickstart
- Configure conf.py to include autodoc:
    extensions = [
        'sphinx.ext.autodoc',
        'sphinx.ext.viewcode',
        'sphinx.ext.napoleon',  # For Google-style docstrings
    ]
- Create documentation from your docstrings:
    sphinx-apidoc -o source/ ../your_package/
    make html
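For orientation, a fuller conf.py might look like the sketch below. It assumes conf.py sits directly in docs/ with the package one directory up; if sphinx-quickstart was asked to separate source and build directories, the path needs adjusting. The project name is a placeholder, and the Napoleon switches reflect a project that uses Google-style docstrings only.

```python
# docs/conf.py (illustrative sketch, not a complete configuration)
import os
import sys

# autodoc imports your code, so the package must be importable.
# Assumption: the package directory lives one level above docs/.
sys.path.insert(0, os.path.abspath(".."))

project = "your_package"  # placeholder project name

extensions = [
    "sphinx.ext.autodoc",    # pull documentation from docstrings
    "sphinx.ext.viewcode",   # add links to highlighted source code
    "sphinx.ext.napoleon",   # parse Google- and NumPy-style docstrings
]

# Napoleon settings: enable only the style the project actually uses.
napoleon_google_docstring = True
napoleon_numpy_docstring = False

# Theme installed by the sphinx-rtd-theme package.
html_theme = "sphinx_rtd_theme"
```

Without the sys.path line (or an installed package), autodoc typically fails with import errors and produces empty pages.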
Real-World Documentation Example
Let's look at a complete example of a well-documented Python module:
"""
User Management Module
This module handles user-related operations including registration,
authentication, and profile management.
It hashes passwords before storing them and integrates with the database
module for persistent storage.
"""
import hashlib
import re
from datetime import datetime
from typing import Dict, Optional, Union
class User:
    """
    Represents a user in the system.

    Attributes:
        username (str): Unique identifier for the user
        email (str): User's email address
        _password_hash (str): Hashed version of user's password
        created_at (datetime): When the user account was created
        last_login (datetime, optional): When the user last logged in
    """

    def __init__(self, username: str, email: str, password: str):
        """
        Initialize a new user.

        Args:
            username: Unique username for the user
            email: User's email address
            password: Plain text password (will be hashed)

        Raises:
            ValueError: If username or email format is invalid
        """
        if not self._validate_username(username):
            raise ValueError("Username must be 3-20 characters, alphanumeric with underscores only")
        if not self._validate_email(email):
            raise ValueError("Invalid email format")
        self.username = username
        self.email = email
        self._password_hash = self._hash_password(password)
        self.created_at = datetime.now()
        self.last_login = None
    @staticmethod
    def _validate_username(username: str) -> bool:
        """
        Validate username format.

        Args:
            username: Username to validate

        Returns:
            True if valid, False otherwise
        """
        pattern = re.compile(r'^[a-zA-Z0-9_]{3,20}$')
        return bool(pattern.match(username))

    @staticmethod
    def _validate_email(email: str) -> bool:
        """
        Validate email format using a simple regex pattern.

        Args:
            email: Email to validate

        Returns:
            True if valid, False otherwise
        """
        pattern = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')
        return bool(pattern.match(email))

    @staticmethod
    def _hash_password(password: str) -> str:
        """
        Hash a password using SHA-256.

        Args:
            password: Plain text password

        Returns:
            Hashed password
        """
        # Note: unsalted SHA-256 keeps this example short. Production code
        # should use a salted, deliberately slow function such as
        # hashlib.scrypt or hashlib.pbkdf2_hmac.
        return hashlib.sha256(password.encode()).hexdigest()
    def verify_password(self, password: str) -> bool:
        """
        Check if the provided password matches the stored hash.

        Args:
            password: Plain text password to verify

        Returns:
            True if password matches, False otherwise

        Examples:
            >>> user = User("johndoe", "johndoe@example.com", "password123")
            >>> user.verify_password("password123")
            True
            >>> user.verify_password("wrong_password")
            False
        """
        return self._hash_password(password) == self._password_hash
    def update_email(self, new_email: str) -> bool:
        """
        Update the user's email address.

        Args:
            new_email: New email address

        Returns:
            True once the email has been updated

        Raises:
            ValueError: If new email format is invalid
        """
        if not self._validate_email(new_email):
            raise ValueError("Invalid email format")
        self.email = new_email
        return True

    def __str__(self) -> str:
        """Return string representation of user."""
        return f"User(username='{self.username}', email='{self.email}')"
Best Practices for Python Documentation
- Be consistent - Choose a docstring style (Google, NumPy, or reST) and stick with it
- Document as you code - Write documentation while you're writing the code, not after
- Focus on clarity - Write for humans, not machines
- Document exceptions - Clearly state what exceptions can be raised and why
- Include examples - Practical examples make documentation more useful
- Keep it updated - Outdated documentation is often worse than no documentation
- Use type hints - Combine docstrings with type hints for clearer interfaces
- Document public APIs thoroughly - Private methods can have simpler documentation
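Several of these practices can be combined in a single function. The sketch below uses a hypothetical parse_port helper: type hints carry the types, so the Google-style docstring can concentrate on meaning, exceptions, and an example.

```python
def parse_port(value: str) -> int:
    """Convert a port number given as text into an integer.

    Args:
        value: Port number as a decimal string, for example "8080".

    Returns:
        The port as an integer between 1 and 65535.

    Raises:
        ValueError: If value is not a decimal number or is out of range.

    Examples:
        >>> parse_port("8080")
        8080
    """
    port = int(value)  # int() itself raises ValueError for non-numeric text
    if not 1 <= port <= 65535:
        raise ValueError(f"Port out of range: {port}")
    return port
```

Because the signature already states str and int, repeating the types in the Args and Returns sections is optional, which leaves fewer places to go out of date.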
Integrating Documentation into Development Workflow
Documentation in Code Reviews
When reviewing code, consider the following documentation checklist:
- Are all public functions, classes, and modules documented?
- Do complex algorithms have explanations?
- Are edge cases and exceptions documented?
- Are examples provided where appropriate?
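The first item on this checklist can be partly automated. As a rough sketch, the standard library's ast module can list public functions and classes that have no docstring. The helper name find_undocumented and the sample source are hypothetical, and treating a leading underscore as "private" is an assumption of this example.

```python
import ast


def find_undocumented(source: str) -> list[str]:
    """Return names of public functions and classes that lack a docstring."""
    tree = ast.parse(source)
    missing = []
    if ast.get_docstring(tree) is None:
        missing.append("<module>")
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name.startswith("_"):
                continue  # assumption: leading underscore means private
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


sample = '''
"""Example module."""

def documented():
    """Do something."""

def undocumented():
    pass

def _private_helper():
    pass
'''

print(find_undocumented(sample))
```

A check like this only confirms that a docstring exists; whether it is accurate and useful still needs a human reviewer.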
Documentation Testing
Use doctest to ensure examples in your docstrings actually work:
def add(a, b):
    """
    Add two numbers together.

    Args:
        a: First number
        b: Second number

    Returns:
        Sum of a and b

    Examples:
        >>> add(1, 2)
        3
        >>> add(-1, 1)
        0
    """
    return a + b

if __name__ == "__main__":
    import doctest
    doctest.testmod()
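Running this file executes the examples in the docstrings and reports any whose output doesn't match. Without flags, a passing run prints nothing; the -v flag prints a detailed log of every example, including the ones that pass: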
python your_file.py -v
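The same check can also be run from code, which is handy inside a larger test suite. The sketch below uses doctest's finder and runner classes on a single function; the helper name run_doctests is hypothetical, and add is repeated so the snippet runs on its own.

```python
import doctest


def add(a, b):
    """Add two numbers together.

    Examples:
        >>> add(1, 2)
        3
        >>> add(-1, 1)
        0
    """
    return a + b


def run_doctests(func):
    """Run the docstring examples of one function and return the totals."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # globs supplies the names the examples may use; here only the function itself.
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(add)
print(results)
```

The returned object exposes failed and attempted counts, so a test can simply assert that results.failed is zero.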
Summary
Good documentation is an essential part of Python development:
- Use comments for explaining "why" in your code
- Write docstrings for functions, classes, and modules
- Choose a consistent docstring style (Google, NumPy, or reST)
- Use tools like Sphinx to generate formal documentation
- Include examples and keep documentation updated
- Integrate documentation into your development workflow
By following these best practices, you'll create code that's more maintainable, easier to collaborate on, and more professional.
Additional Resources
- PEP 257 - Docstring Conventions
- Google Python Style Guide
- Sphinx Documentation
- NumPy Docstring Guide
Exercises
- Take a previously written Python function and add a proper docstring following Google style conventions.
- Set up Sphinx documentation for a small Python project you've been working on.
- Write a function with examples in the docstring and use doctest to verify them.
- Review an open-source Python project and observe how they structure their documentation.
- Create a cheat sheet of docstring formatting for your preferred style (Google, NumPy, or reST).