PyTorch Kornia
Introduction
Kornia is an open-source computer vision library built on top of PyTorch that provides differentiable implementations of computer vision algorithms. Its operators take and return PyTorch tensors, so you can use them as preprocessing steps or as layers inside your deep learning models. Because the operators are differentiable, gradients can flow through them during backpropagation, which makes end-to-end training possible, and the same code runs on batched inputs on either CPU or GPU.
In this guide, we'll explore the basics of Kornia, its key features, and how to use it for various computer vision tasks.
Getting Started with Kornia
Installation
Before we dive into using Kornia, let's install it:
pip install kornia
Basic Imports
To start working with Kornia, import it along with PyTorch and a few helper libraries for loading and displaying images:
import torch
import kornia
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
Core Features of Kornia
Kornia provides several modules that can be used for different computer vision tasks:
- Image Processing: Basic operations like color conversion, filtering, and geometric transformations
- Feature Detection and Description: Edge detection, corners, and feature descriptors
- Augmentation: Data augmentation for training deep learning models
- Depth Estimation: Tools for depth estimation and 3D reconstruction
- Optical Flow: Algorithms for computing optical flow
Let's explore some of these features in more detail.
Image Processing with Kornia
Loading and Converting Images
Let's start by loading an image and converting it to the format expected by Kornia:
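import requests
from io import BytesIO
import numpy as np
import torch
import kornia as K
import matplotlib.pyplot as plt
from PIL import Image
# Download a sample image
url = "https://raw.githubusercontent.com/kornia/kornia/master/docs/source/_static/img/lena.png"
response = requests.get(url)
response.raise_for_status()
# Force three channels in case the file carries an alpha channel or is a palette image
img_pil = Image.open(BytesIO(response.content)).convert("RGB")
# Convert PIL image to tensor: HWC -> BCHW (keepdim=False adds the batch dimension)
img_tensor = K.image_to_tensor(np.array(img_pil), keepdim=False)
# image_to_tensor does not rescale values, so map [0, 255] to [0, 1] explicitly
img_tensor = img_tensor.float() / 255.0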
print(f"Image tensor shape: {img_tensor.shape}")
# Visualize the image
plt.imshow(K.tensor_to_image(img_tensor))
plt.axis('off')
plt.show()
Output:
Image tensor shape: torch.Size([1, 3, 512, 512])
Basic Image Transformations
Kornia provides many basic image transformations. Let's see some examples:
Grayscale Conversion
# Convert to grayscale
gray_tensor = K.color.rgb_to_grayscale(img_tensor)
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original Image")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(K.tensor_to_image(gray_tensor), cmap='gray')
plt.title("Grayscale Image")
plt.axis('off')
plt.show()
print(f"Grayscale tensor shape: {gray_tensor.shape}")
Output:
Grayscale tensor shape: torch.Size([1, 1, 512, 512])
Rotation and Flipping
# Rotate image by 45 degrees
rotated = K.geometry.rotate(img_tensor, torch.tensor([45.0]))
# Flip image horizontally
flipped = K.geometry.hflip(img_tensor)
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original")
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(K.tensor_to_image(rotated))
plt.title("Rotated 45°")
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(K.tensor_to_image(flipped))
plt.title("Horizontal Flip")
plt.axis('off')
plt.show()
Image Filtering and Edge Detection
Kornia makes it easy to apply various filters to your images and perform edge detection:
Blur and Sharpening
# Apply Gaussian blur
blurred = K.filters.gaussian_blur2d(img_tensor, (15, 15), (7.5, 7.5))
# Create a 3x3 sharpening kernel with shape (1, kH, kW), which is what filter2d expects
sharpen_kernel = torch.tensor([
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0]
]).float().unsqueeze(0)
# filter2d applies the kernel to every channel independently, so no per-channel copy is needed
# Clamp the result because sharpening can push values outside [0, 1]
sharpened = K.filters.filter2d(img_tensor, sharpen_kernel, border_type='replicate').clamp(0.0, 1.0)
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original")
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(K.tensor_to_image(blurred))
plt.title("Gaussian Blur")
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(K.tensor_to_image(sharpened))
plt.title("Sharpened")
plt.axis('off')
plt.show()
Edge Detection
# Convert to grayscale for edge detection
gray_tensor = K.color.rgb_to_grayscale(img_tensor)
# Apply Sobel filter
sobel_xy = K.filters.spatial_gradient(gray_tensor, mode='sobel', order=1)
sobel_mag = torch.sqrt(sobel_xy[:, :, 0] ** 2 + sobel_xy[:, :, 1] ** 2)
# Apply Canny edge detector; it returns (magnitude, edges), and index 1 is the thresholded edge map
canny_edges = K.filters.canny(gray_tensor)[1]
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original")
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(K.tensor_to_image(sobel_mag), cmap='viridis')
plt.title("Sobel Edges")
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(K.tensor_to_image(canny_edges), cmap='gray')
plt.title("Canny Edges")
plt.axis('off')
plt.show()
Data Augmentation with Kornia
One of the most useful features of Kornia is differentiable data augmentation. The augmentations operate directly on batched tensors, sample separate random parameters for each image in the batch, and can run on the same device as the model, so you can include them in your training pipeline:
# Use a separate alias so that K keeps referring to the top-level kornia package
import kornia.augmentation as KA
# Create a batch of images by duplicating our image 4 times
batch_tensor = img_tensor.repeat(4, 1, 1, 1)
print(f"Batch tensor shape: {batch_tensor.shape}")
# Define a set of augmentations
augmentations = KA.AugmentationSequential(
    KA.RandomHorizontalFlip(p=0.5),
    KA.RandomRotation(degrees=30.0, p=0.5),
    KA.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=0.5),
    KA.RandomPerspective(distortion_scale=0.2, p=0.5),
    data_keys=["input"]
)
# Apply augmentations (each image in the batch gets its own random parameters)
augmented_batch = augmentations(batch_tensor)
# Display augmented images
plt.figure(figsize=(15, 8))
for i in range(4):
    plt.subplot(2, 2, i+1)
    plt.imshow(K.tensor_to_image(augmented_batch[i]))
    plt.title(f"Augmented Image {i+1}")
    plt.axis('off')
plt.tight_layout()
plt.show()
Output:
Batch tensor shape: torch.Size([4, 3, 512, 512])
Practical Example: Image Classification with Kornia Augmentations
Let's see how to use Kornia augmentations in a PyTorch classification pipeline:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.models as models
# Use a separate alias so that K keeps referring to the top-level kornia package
import kornia.augmentation as KA
# Define a simple classification model
# (the weights argument replaces pretrained=True, which is deprecated in recent torchvision)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
# Define augmentation pipeline with Kornia
train_aug = KA.AugmentationSequential(
    KA.RandomHorizontalFlip(p=0.5),
    KA.RandomRotation(degrees=10.0),
    KA.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
    KA.RandomErasing(p=0.2),
    data_keys=["input"]
)
# Example training step (simplified)
def train_batch(model, images, labels):
    model.train()
    # Apply augmentations to a float batch of shape (B, C, H, W)
    images_aug = train_aug(images)
    # Forward pass
    outputs = model(images_aug)
    loss = criterion(outputs, labels)
    # Backward pass and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
# Example usage (would be part of a larger training loop)
# train_batch(model, batch_of_images, batch_of_labels)
Homography and Perspective Transformations
A homography is a 3×3 matrix that maps points from one view of a plane to another. Kornia can compute one from four point correspondences and use it to warp an image:
# Define source points (4 corners of the image)
h, w = img_tensor.shape[-2:]
points_src = torch.tensor([[
[0, 0],
[w-1, 0],
[w-1, h-1],
[0, h-1]
]], dtype=torch.float32)
# Define destination points (move corners to create perspective effect)
points_dst = torch.tensor([[
[w*0.1, h*0.1], # Top-left
[w*0.9, h*0.2], # Top-right
[w*0.8, h*0.8], # Bottom-right
[w*0.2, h*0.9] # Bottom-left
]], dtype=torch.float32)
# Compute homography
H = K.geometry.transform.get_perspective_transform(points_src, points_dst)
# Apply perspective transformation
warped_img = K.geometry.transform.warp_perspective(img_tensor, H, (h, w))
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original Image")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(K.tensor_to_image(warped_img))
plt.title("Perspective Transform")
plt.axis('off')
plt.show()
Summary
In this guide, we've explored Kornia, a powerful computer vision library built on top of PyTorch. We've seen how it provides differentiable implementations of various image processing and computer vision algorithms, making it ideal for deep learning workflows.
Key features we've covered include:
- Basic image processing (color conversion, geometric transformations)
- Image filtering and edge detection
- Data augmentation for deep learning
- Homography and perspective transformations
The main advantage of Kornia is that all operations are differentiable, which means they can be part of your neural network's computation graph and allow for end-to-end training.
Exercises
- Load your own image and apply different Kornia transformations such as rotations, color changes, and edge detection.
- Create a custom data augmentation pipeline using Kornia and apply it to a dataset.
- Implement a simple image classification model with PyTorch that uses Kornia for data augmentation.
- Explore Kornia's feature detection modules and try to match keypoints between two similar images.
- Try to create a simple image stylization pipeline using Kornia's filtering capabilities.