
PyTorch Skorch

Introduction

Skorch is a high-level library that provides a scikit-learn compatible neural network wrapper around PyTorch. If you're already familiar with scikit-learn's API and want to leverage PyTorch's capabilities, Skorch offers the perfect bridge between these two powerful libraries.

The main benefits of Skorch include:

  • Scikit-learn compatibility (fit/predict API, grid search, pipelines)
  • Training and validation loops are handled for you
  • Callbacks for customization during the training process
  • Support for many types of data (numpy arrays, PyTorch tensors, etc.)
  • Works with multiple GPUs and CPUs

By the end of this tutorial, you'll understand how to use Skorch to build, train, and evaluate neural networks using a familiar scikit-learn interface.

Installation

Before we begin, let's install Skorch:

bash
pip install skorch

Make sure you already have PyTorch and scikit-learn installed:

bash
pip install torch scikit-learn
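
To verify the installation, you can print the installed versions (a quick sanity check; the exact version numbers will depend on your environment):

python
import torch
import sklearn
import skorch

print(f"skorch: {skorch.__version__}")
print(f"torch: {torch.__version__}")
print(f"scikit-learn: {sklearn.__version__}")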

Basic Usage

Creating a Simple Neural Network

Let's start by creating a simple neural network for classification using Skorch:

python
import numpy as np
import torch
from torch import nn
import torch.nn.functional as F
from skorch import NeuralNetClassifier

# Create a simple neural network module
class SimpleClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        # NeuralNetClassifier's default criterion (NLLLoss) expects the module
        # to output class probabilities, so apply a softmax over the logits
        return F.softmax(x, dim=-1)

# Create a Skorch neural network classifier
net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=3,
    max_epochs=10,
    lr=0.1,
)

# Generate some dummy data
X = np.random.rand(100, 20).astype(np.float32)
y = np.random.randint(0, 3, size=100)

# Train the model
net.fit(X, y)

# Make predictions
y_pred = net.predict(X)
print(f"Predictions shape: {y_pred.shape}")
print(f"First 5 predictions: {y_pred[:5]}")

# Get prediction probabilities
y_proba = net.predict_proba(X)
print(f"Prediction probabilities shape: {y_proba.shape}")
print(f"First sample probabilities: {y_proba[0]}")

Output:

Epoch 1/10: train_loss=1.092: 100%|██████████| 4/4 [00:00<00:00, 19.67it/s]
Epoch 2/10: train_loss=1.047: 100%|██████████| 4/4 [00:00<00:00, 20.20it/s]
...
Epoch 10/10: train_loss=0.998: 100%|██████████| 4/4 [00:00<00:00, 20.05it/s]
Predictions shape: (100,)
First 5 predictions: [1 2 0 0 0]
Prediction probabilities shape: (100, 3)
First sample probabilities: [0.31 0.36 0.33]
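
Once a net is fitted, its learned weights can be persisted and restored with save_params and load_params. Here is a minimal sketch (the file name simple-classifier.pt is arbitrary):

python
# Save only the module's learned weights
net.save_params(f_params='simple-classifier.pt')

# To restore them, create a fresh net with the same architecture,
# initialize it, then load the saved weights
new_net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=3,
)
new_net.initialize()  # the net must be initialized before loading parameters
new_net.load_params(f_params='simple-classifier.pt')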

Key Features of Skorch

1. Scikit-learn Compatibility

One of the most powerful features of Skorch is its compatibility with scikit-learn functionality like grid search, pipelines, and cross-validation:

python
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

# Generate classification data
X, y = make_classification(1000, 20, n_informative=10, random_state=42)
X = X.astype(np.float32)

# Define the neural network
net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=2,
    max_epochs=20,
    lr=0.1,
)

# Define parameters for grid search
params = {
    'lr': [0.01, 0.1],
    'max_epochs': [10, 20],
    'module__hidden_size': [10, 30, 50],
}

# Perform grid search
gs = GridSearchCV(net, params, cv=3, scoring='accuracy', verbose=2)
gs.fit(X, y)

# Print best parameters and score
print(f"Best parameters: {gs.best_params_}")
print(f"Best score: {gs.best_score_:.3f}")

Output:

Fitting 3 folds for each of 12 candidates, totalling 36 fits
...
Best parameters: {'lr': 0.1, 'max_epochs': 20, 'module__hidden_size': 50}
Best score: 0.847
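
Because the net behaves like any other scikit-learn estimator, the rest of the model-selection toolbox works the same way. For example, a quick cross-validated accuracy estimate (a small sketch reusing the net defined above):

python
from sklearn.model_selection import cross_val_score

scores = cross_val_score(net, X, y, cv=3, scoring='accuracy')
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")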

2. Using Callbacks

Skorch provides a variety of callbacks to customize the training process:

python
from skorch.callbacks import EarlyStopping, LRScheduler, Checkpoint
from skorch.dataset import Dataset
from skorch.helper import predefined_split
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Hold out a validation set to monitor progress
X_train, X_valid = X[:800], X[800:]
y_train, y_valid = y[:800], y[800:]

net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=2,
    max_epochs=100,  # we'll use early stopping, so set a high number
    lr=0.1,
    # Validate on our held-out set instead of skorch's internal split
    train_split=predefined_split(Dataset(X_valid, y_valid)),
    callbacks=[
        # Stop training when validation loss doesn't improve for 5 epochs
        EarlyStopping(patience=5),

        # Reduce the learning rate when the training loss plateaus
        LRScheduler(policy=ReduceLROnPlateau, mode='min', factor=0.1, patience=3),

        # Save the parameters of the best model (lowest validation loss)
        Checkpoint(monitor='valid_loss_best', f_params='best-model.pt'),
    ],
)

net.fit(X_train, y_train)

Output:

Epoch 1/100: train_loss=0.693, valid_loss=0.690: 100%|██████████| 32/32 [00:00<00:00, 122.50it/s]
Epoch 2/100: train_loss=0.674, valid_loss=0.682: 100%|██████████| 32/32 [00:00<00:00, 124.32it/s]
...
Epoch 20/100: train_loss=0.519, valid_loss=0.538: 100%|██████████| 32/32 [00:00<00:00, 123.85it/s]
Stopping since valid_loss did not improve in the last 5 epochs.
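
The Checkpoint callback above wrote the best parameters to best-model.pt, so after training you can restore them onto the net. A minimal sketch:

python
# Load the weights saved at the epoch with the lowest validation loss
net.load_params(f_params='best-model.pt')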

3. Creating a Pipeline

Skorch integrates perfectly with scikit-learn pipelines, allowing you to preprocess your data before feeding it to your neural network:

python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Create a pipeline that standardizes the data before feeding it to the neural network
pipeline = Pipeline([
    ('scale', StandardScaler()),
    ('net', NeuralNetClassifier(
        SimpleClassifier,
        module__input_size=20,
        module__hidden_size=30,
        module__output_size=2,
        max_epochs=20,
        lr=0.1,
    ))
])

# Fit the pipeline
pipeline.fit(X, y)

# Make predictions
y_pred = pipeline.predict(X)
print(f"Accuracy: {(y_pred == y).mean():.3f}")

Output:

Epoch 1/20: train_loss=0.683: 100%|██████████| 32/32 [00:00<00:00, 119.28it/s]
...
Epoch 20/20: train_loss=0.486: 100%|██████████| 32/32 [00:00<00:00, 120.56it/s]
Accuracy: 0.872
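
Pipelines and grid search also compose: parameters of the wrapped net are addressed through the pipeline step name, so lr becomes net__lr and module parameters become net__module__<name>. A small sketch (the hyperparameter values are just examples):

python
from sklearn.model_selection import GridSearchCV

pipe_params = {
    'net__lr': [0.01, 0.1],
    'net__module__hidden_size': [10, 30],
}

gs = GridSearchCV(pipeline, pipe_params, cv=3, scoring='accuracy')
gs.fit(X, y)
print(gs.best_params_)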

Real-World Example: Image Classification with Skorch

Let's demonstrate how to use Skorch for a more practical example - image classification using the Fashion MNIST dataset:

python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from skorch import NeuralNetClassifier

# Define a CNN model for image classification
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(7*7*64, 128)
        self.fc2 = nn.Linear(128, 10)
        self.pool = nn.MaxPool2d(2, 2)

    def forward(self, x):
        # x is of shape [batch_size, 1, 28, 28]
        x = self.pool(F.relu(self.conv1(x)))  # [batch_size, 32, 14, 14]
        x = self.pool(F.relu(self.conv2(x)))  # [batch_size, 64, 7, 7]
        x = x.view(-1, 7*7*64)                # [batch_size, 7*7*64]
        x = F.relu(self.fc1(x))               # [batch_size, 128]
        x = self.fc2(x)                       # [batch_size, 10]
        return x

# Download and prepare the Fashion MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = datasets.FashionMNIST('./data', train=True, download=True, transform=transform)
test_dataset = datasets.FashionMNIST('./data', train=False, download=True, transform=transform)

# Note: we don't need to build DataLoaders ourselves; skorch constructs its
# own iterators (torch DataLoaders) from the data passed to fit()

# Define the neural network classifier
net = NeuralNetClassifier(
    CNN,
    max_epochs=5,
    lr=0.001,
    optimizer=torch.optim.Adam,
    criterion=nn.CrossEntropyLoss,
    device='cuda' if torch.cuda.is_available() else 'cpu',
    iterator_train=DataLoader,
    iterator_train__shuffle=True,
    iterator_train__num_workers=4,
    iterator_valid=DataLoader,
    iterator_valid__num_workers=4,
    train_split=None,  # we're using our own train/test split
)

# Convert data to the format expected by Skorch
X_train = [x[0].numpy() for x in train_dataset]
y_train = [y for _, y in train_dataset]
X_train = np.array(X_train)
y_train = np.array(y_train)

# Train the model
net.fit(X_train, y_train)

# Evaluate on test set
X_test = [x[0].numpy() for x in test_dataset]
y_test = [y for _, y in test_dataset]
X_test = np.array(X_test)
y_test = np.array(y_test)

y_pred = net.predict(X_test)
accuracy = (y_pred == y_test).mean()
print(f"Test accuracy: {accuracy:.4f}")

Output:

Epoch 1/5: train_loss=0.742: 100%|██████████| 938/938 [00:45<00:00, 20.62it/s]
Epoch 2/5: train_loss=0.452: 100%|██████████| 938/938 [00:45<00:00, 20.58it/s]
...
Epoch 5/5: train_loss=0.342: 100%|██████████| 938/938 [00:45<00:00, 20.60it/s]
Test accuracy: 0.8936
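
Since the predictions are plain NumPy arrays, any scikit-learn metric can be applied directly, for example a per-class report:

python
from sklearn.metrics import classification_report

print(classification_report(y_test, y_pred, digits=3))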

Advanced Features

Custom Callbacks

You can create your own callbacks by extending the skorch.callbacks.Callback class:

python
from skorch.callbacks import Callback

class PrintBatchInfo(Callback):
    def on_epoch_begin(self, net, **kwargs):
        # skorch doesn't pass a batch index, so keep our own counter
        self.batch_idx_ = 0

    def on_batch_end(self, net, training=False, **kwargs):
        if not training:
            return
        if self.batch_idx_ % 100 == 0:  # print every 100 batches
            loss = net.history[-1, 'batches', -1, 'train_loss']
            print(f"Batch {self.batch_idx_}: loss = {loss:.4f}")
        self.batch_idx_ += 1

net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=2,
    max_epochs=2,
    callbacks=[PrintBatchInfo()],
)

net.fit(X, y)

Output:

Batch 0: loss = 0.6931
Epoch 1/2: train_loss=0.670: 100%|██████████| 32/32 [00:00<00:00, 118.60it/s]
Batch 0: loss = 0.6563
Epoch 2/2: train_loss=0.648: 100%|██████████| 32/32 [00:00<00:00, 119.25it/s]
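
The net.history object that the callback reads from remains available after training, so logged values can be pulled out directly. For example, the per-epoch training losses:

python
# History supports (epoch, key) style indexing
train_losses = net.history[:, 'train_loss']
print(train_losses)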

Custom Scoring

Skorch allows you to define custom scoring functions that take the net, the input data, and the targets. Such a function can be called directly on a fitted net, or plugged into a scoring callback, as shown below:

python
from sklearn.metrics import f1_score

# A scoring function with the (net, X, y) signature that skorch's scoring
# callbacks expect; it can also be called directly on a fitted net
def f1(net, X, y):
    y_pred = net.predict(X)
    return f1_score(y, y_pred, average='macro')

net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=2,
    max_epochs=10,
    lr=0.1,
    train_split=None,  # don't hold out a validation split for this example
)

net.fit(X, y)
print(f"F1 Score: {f1(net, X, y):.4f}")

Output:

Epoch 1/10: train_loss=0.693: 100%|██████████| 32/32 [00:00<00:00, 118.32it/s]
...
Epoch 10/10: train_loss=0.523: 100%|██████████| 32/32 [00:00<00:00, 119.45it/s]
F1 Score: 0.8547
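
If you want the metric logged after every epoch instead of only at the end, the same function can be plugged into skorch's EpochScoring callback. A minimal sketch (scored on the training data here, since the validation split is disabled):

python
from skorch.callbacks import EpochScoring

net = NeuralNetClassifier(
    SimpleClassifier,
    module__input_size=20,
    module__hidden_size=30,
    module__output_size=2,
    max_epochs=10,
    lr=0.1,
    train_split=None,
    callbacks=[
        # Compute macro F1 on the training data at the end of each epoch
        EpochScoring(f1, name='train_f1', on_train=True, lower_is_better=False),
    ],
)

net.fit(X, y)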

Summary

Skorch provides an excellent bridge between PyTorch and scikit-learn, allowing you to leverage the best of both worlds:

  • Neural networks with PyTorch's flexibility
  • scikit-learn's familiar API and extensive functionality

Key features we've covered:

  • Creating neural networks with the scikit-learn interface
  • Using grid search for hyperparameter tuning
  • Working with callbacks to customize the training process
  • Integrating Skorch in scikit-learn pipelines
  • Building a CNN for image classification
  • Creating custom callbacks and scoring functions

With Skorch, you can seamlessly incorporate deep learning models into your scikit-learn workflows, making the transition to neural networks much smoother.

Additional Resources

  • Skorch documentation: https://skorch.readthedocs.io
  • Skorch source code and examples: https://github.com/skorch-dev/skorch

Exercises

  1. Modify the Fashion MNIST example to achieve at least 90% test accuracy by adjusting the model architecture or hyperparameters.

  2. Create a regression model using NeuralNetRegressor from Skorch to predict house prices on the California Housing dataset (scikit-learn's Boston Housing loader was removed in version 1.2).

  3. Implement a custom callback that saves the model only when the validation F1 score improves.

  4. Use Skorch with a pre-trained PyTorch model (like ResNet or VGG) to perform transfer learning on a small image dataset.

  5. Build a text classification model using Skorch that uses a recurrent neural network (RNN) or LSTM architecture.



If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)