TensorFlow Pretrained Models
Introduction
Pretrained models are one of the most powerful tools in a deep learning practitioner's toolkit. Rather than training complex neural networks from scratch, which requires massive datasets and significant computational resources, you can leverage models that have already been trained on millions of images by research teams at organizations like Google and Microsoft.
In this tutorial, we'll explore how to use TensorFlow's pretrained models for various tasks, particularly focusing on Convolutional Neural Networks (CNNs) for image-related applications. You'll learn how to:
- Access and use models from TensorFlow Hub and Keras Applications
- Implement transfer learning with pretrained models
- Fine-tune pretrained models for your specific tasks
- Adapt these powerful models for real-world applications
Why Use Pretrained Models?
Before diving into the code, let's understand the advantages of using pretrained models:
- Save time and resources: Training deep neural networks from scratch can take days or weeks on powerful hardware.
- Better performance: Models pretrained on large datasets often generalize better than those trained on smaller datasets.
- Transfer learning: You can transfer knowledge learned from one task to another related task.
- Less data required: Fine-tuning a pretrained model typically requires less training data than building a model from scratch.
Accessing Pretrained Models in TensorFlow
TensorFlow offers two primary ways to access pretrained models:
- Keras Applications: Built directly into TensorFlow's high-level API
- TensorFlow Hub: A repository of reusable machine learning models
Let's explore both approaches.
Using Keras Applications
Keras Applications provides several widely used deep learning architectures with weights pretrained on ImageNet. Let's start with a simple example using VGG16:
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np
# Load the VGG16 model pre-trained on ImageNet data
model = VGG16(weights='imagenet')
# Summary of the model architecture
model.summary()
# Load and preprocess an image
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
# Make a prediction
predictions = model.predict(x)
decoded_predictions = decode_predictions(predictions, top=5)[0]
# Print the results
for i, (imagenet_id, label, score) in enumerate(decoded_predictions):
    print(f"{i+1}: {label} ({score:.2f})")
Output:
1: African_elephant (0.92)
2: tusker (0.07)
3: Indian_elephant (0.01)
4: warthog (0.00)
5: hippopotamus (0.00)
In this example, we:
- Loaded the VGG16 model with pretrained weights
- Preprocessed an image to match the model's expected input format
- Made a prediction to identify what's in the image
- Decoded the output to human-readable labels
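To make the preprocessing step less of a black box, here is a minimal NumPy sketch of what `preprocess_input` does for VGG16, assuming the "caffe"-style convention Keras uses for this model: reorder channels from RGB to BGR, then subtract per-channel ImageNet means, with no rescaling. The function name `vgg_style_preprocess` is hypothetical and exists only for illustration; in real code, keep calling the library's own `preprocess_input`.

```python
import numpy as np

# ImageNet per-channel means in BGR order (assumed "caffe"-style convention).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg_style_preprocess(batch_rgb):
    """Illustrative stand-in for VGG16 preprocessing.

    Expects a batch of RGB images with pixel values in [0, 255]
    and shape (N, H, W, 3).
    """
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]            # RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR    # zero-center each channel, no scaling


# A pure-red image: R=255, G=0, B=0 for every pixel.
red_batch = np.zeros((1, 224, 224, 3), dtype=np.float32)
red_batch[..., 0] = 255.0

processed = vgg_style_preprocess(red_batch)
print(processed.shape)      # same shape as the input batch
print(processed[0, 0, 0])   # one pixel, now in BGR order and mean-centered
```

Note that the values are not squeezed into [0, 1]. Each model family has its own convention, which is why the matching `preprocess_input` should always be imported from the same module as the model.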
Using TensorFlow Hub
TensorFlow Hub offers a broader range of models and makes it easy to reuse them in your applications:
import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
from PIL import Image
# Load a MobileNet model from TensorFlow Hub
model_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4"
model = tf.keras.Sequential([
    hub.KerasLayer(model_url)
])
# Prepare the image (convert to RGB so grayscale or RGBA files still yield 3 channels)
img_path = 'elephant.jpg'
img = Image.open(img_path).convert('RGB').resize((224, 224))
img_array = np.array(img, dtype=np.float32) / 255.0  # Normalize to [0, 1]
img_array = np.expand_dims(img_array, axis=0)  # Add batch dimension
# Make a prediction
predictions = model.predict(img_array)
predicted_class = np.argmax(predictions[0])
# Load ImageNet labels
labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
with open(labels_path) as f:
    labels = f.read().splitlines()  # One label per line, without trailing newlines
# Print the top prediction
print(f"Prediction: {labels[predicted_class]}")
Output:
Prediction: African elephant, Loxodonta africana
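The example above reports only the single best class. Image classifiers on TensorFlow Hub commonly return unnormalized logits rather than probabilities, so a softmax is needed before the scores can be read as confidences. The sketch below shows one way to get the top-k labels with NumPy; `top_k_from_logits` and the toy labels are hypothetical stand-ins, not part of TensorFlow Hub.

```python
import numpy as np


def top_k_from_logits(logits, labels, k=5):
    """Return the k most likely (label, probability) pairs from a 1-D logit vector."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max()          # subtract the max for numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum()                  # softmax
    top_indices = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in top_indices]


# Toy stand-ins for predictions[0] and the ImageNet label list.
toy_labels = ["background", "cat", "dog", "elephant"]
toy_logits = [0.1, 1.0, 2.0, 5.0]

for name, prob in top_k_from_logits(toy_logits, toy_labels, k=2):
    print(f"{name}: {prob:.3f}")
```

With the real model you would pass `predictions[0]` and the `labels` list instead of the toy values.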
Transfer Learning with Pretrained Models
One of the most powerful applications of pretrained models is transfer learning. Instead of using the model as-is, we can adapt it to our specific task.
Basic Transfer Learning Pattern
Here's a general pattern for transfer learning:
- Load a pretrained model (without the classification head)
- Freeze the base model's layers so they don't get updated during training
- Add your own classification head
- Train only the new classification head
- (Optional) Fine-tune some of the upper layers of the base model
Let's implement this pattern to classify images of cats and dogs:
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Define image dimensions and other parameters
IMG_SIZE = (224, 224)
BATCH_SIZE = 32
EPOCHS = 5
# Create an image data generator with augmentation for training.
# preprocess_input scales pixels to [-1, 1], the range MobileNetV2's ImageNet weights expect.
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2
)
# Validation images get the same preprocessing and split, but no augmentation
val_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    validation_split=0.2
)
# Load training data
train_generator = train_datagen.flow_from_directory(
    'cats_and_dogs/train',
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',
    subset='training'
)
# Load validation data
validation_generator = val_datagen.flow_from_directory(
    'cats_and_dogs/train',
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',
    subset='validation'
)
# Load the pretrained model without the classification head
base_model = MobileNetV2(
    weights='imagenet',
    include_top=False,
    input_shape=IMG_SIZE + (3,)  # (224, 224, 3)
)
# Freeze the base model
base_model.trainable = False
# Create a new model on top
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1024, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')  # Binary classification
])
# Compile the model
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss='binary_crossentropy',
    metrics=['accuracy']
)
# Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE
)
# Evaluate the model on held-out test images (preprocessing only, no augmentation)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_generator = test_datagen.flow_from_directory(
    'cats_and_dogs/test',
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',
    shuffle=False
)
test_loss, test_accuracy = model.evaluate(test_generator)
print(f"Test accuracy: {test_accuracy:.4f}")
Output:
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
Epoch 1/5
500/500 [==============================] - 89s 178ms/step - loss: 0.3012 - accuracy: 0.8702 - val_loss: 0.1793 - val_accuracy: 0.9298
Epoch 2/5
500/500 [==============================] - 88s 177ms/step - loss: 0.1912 - accuracy: 0.9225 - val_loss: 0.1403 - val_accuracy: 0.9486
...
Test accuracy: 0.9511
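To get a feel for how little is actually trained while the base is frozen, the following back-of-the-envelope sketch counts the parameters of the new classification head. It assumes that MobileNetV2 at its default width ends in 1280 feature channels, so the pooled vector has 1280 entries; `dense_params` is a hypothetical helper written for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# Assumption: MobileNetV2 (default width) outputs 1280 channels, so
# GlobalAveragePooling2D yields a 1280-dimensional vector per image.
BASE_FEATURES = 1280

hidden = dense_params(BASE_FEATURES, 1024)   # Dense(1024, activation='relu')
output = dense_params(1024, 1)               # Dense(1, activation='sigmoid')
head_total = hidden + output                 # pooling and dropout add no weights

print(f"Dense(1024): {hidden:,}")                     # 1,311,744
print(f"Dense(1):    {output:,}")                     # 1,025
print(f"Trainable head parameters: {head_total:,}")   # 1,312,769
```

If the assumption holds, the "Trainable params" line of `model.summary()` should report the same total while `base_model.trainable` is `False`. The millions of weights in the base model are listed as non-trainable.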
Fine-tuning the Pretrained Model
After initial training, we can unfreeze some of the upper layers of the base model and train them with a lower learning rate for better performance:
# Unfreeze the base model so its layers can be updated
base_model.trainable = True
# Re-freeze every layer except the last 10
for layer in base_model.layers[:-10]:
    layer.trainable = False
# Recompile the model with a lower learning rate
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001),  # Lower learning rate
    loss='binary_crossentropy',
    metrics=['accuracy']
)
# Continue training
history_fine = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    epochs=5,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE
)
# Evaluate again
test_loss, test_accuracy = model.evaluate(test_generator)
print(f"Test accuracy after fine-tuning: {test_accuracy:.4f}")
Output:
Epoch 1/5
500/500 [==============================] - 119s 238ms/step - loss: 0.1120 - accuracy: 0.9559 - val_loss: 0.0996 - val_accuracy: 0.9652
...
Test accuracy after fine-tuning: 0.9702
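The slice `base_model.layers[:-10]` is easy to misread, so here is a small sketch of the same freezing logic on stand-in objects rather than real Keras layers. `StandInLayer` and `freeze_all_but_last` are hypothetical names, and the layer count of 20 is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class StandInLayer:
    """Minimal stand-in for a Keras layer: just a name and a trainable flag."""
    name: str
    trainable: bool = True


def freeze_all_but_last(layers, n_trainable):
    """Freeze every layer except the last n_trainable; return the trainable names."""
    # An explicit cutoff avoids the layers[:-0] pitfall, where the slice is
    # empty and nothing would be frozen.
    cutoff = max(len(layers) - n_trainable, 0)
    for layer in layers[:cutoff]:
        layer.trainable = False
    return [layer.name for layer in layers if layer.trainable]


layers = [StandInLayer(f"layer_{i}") for i in range(20)]
still_trainable = freeze_all_but_last(layers, 10)

print(len(still_trainable))                      # 10
print(still_trainable[0], still_trainable[-1])   # layer_10 layer_19
```

On a real model, a comparable check is to count the layers whose `trainable` attribute is `True` after the loop, before calling `compile` again.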
Real-world Applications
Now let's look at some practical applications of pretrained models.
Image Feature Extraction
Pretrained models can be used as powerful feature extractors:
import tensorflow as tf
import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input
import matplotlib.pyplot as plt
# Load the ResNet50 model
model = ResNet50(weights='imagenet', include_top=False)
# Load and preprocess an image
img_path = 'sample_image.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
# Extract features
features = model.predict(x)
# Visualize the first feature map
plt.figure(figsize=(10, 10))
plt.imshow(features[0, :, :, 0], cmap='viridis')
plt.title('First Feature Map')
plt.colorbar()
plt.show()
print(f"Feature shape: {features.shape}") # (1, 7, 7, 2048) for ResNet50
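Feature maps of shape (1, 7, 7, 2048) are usually collapsed into a single vector per image before they are compared or fed to a classifier. The sketch below shows one common option, global average pooling followed by cosine similarity, using random arrays in place of real ResNet50 outputs. The helper names are hypothetical, and the random data only demonstrates shapes, not meaningful similarities.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Average over the two spatial axes: (N, H, W, C) -> (N, C)."""
    return np.asarray(feature_maps).mean(axis=(1, 2))


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Random stand-ins for model.predict(x) on two different images.
rng = np.random.default_rng(0)
features_a = rng.random((1, 7, 7, 2048), dtype=np.float32)
features_b = rng.random((1, 7, 7, 2048), dtype=np.float32)

vec_a = global_average_pool(features_a)
vec_b = global_average_pool(features_b)

print(vec_a.shape)                                           # (1, 2048)
print(f"self-similarity: {cosine_similarity(vec_a, vec_a):.3f}")
print(f"a vs b:          {cosine_similarity(vec_a, vec_b):.3f}")
```

With real features, images of similar content tend to score closer to 1 than unrelated images, which is the basis for simple image search or clustering.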
Object Detection with TensorFlow Hub Models
Let's see how to use a pretrained object detection model from TensorFlow Hub:
import tensorflow as tf
import tensorflow_hub as hub
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load a pre-trained object detection model
detector = hub.load("https://tfhub.dev/tensorflow/faster_rcnn/resnet101_v1_640x640/1")
def detect_objects(image_path):
    # Read image (OpenCV loads BGR, so convert to RGB)
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Convert to a uint8 tensor with a batch dimension and run detection
    img_tensor = tf.convert_to_tensor(img)[tf.newaxis, ...]
    results = detector(img_tensor)
    # Process results
    boxes = results["detection_boxes"][0].numpy()
    scores = results["detection_scores"][0].numpy()
    classes = results["detection_classes"][0].numpy().astype(np.int32)
    # Filter out low-confidence detections
    threshold = 0.5
    filtered_indices = scores > threshold
    boxes = boxes[filtered_indices]
    scores = scores[filtered_indices]
    classes = classes[filtered_indices]
    # Get class names. This assumes coco_labels.txt has one name per line,
    # ordered so that line k (1-based) corresponds to class ID k.
    with open("coco_labels.txt") as f:
        class_names = f.read().splitlines()
    # Visualize the detections
    plt.figure(figsize=(12, 12))
    plt.imshow(img)
    height, width, _ = img.shape
    for i in range(len(boxes)):
        box = boxes[i]
        ymin, xmin, ymax, xmax = box
        # Convert normalized coordinates to pixel values
        xmin = int(xmin * width)
        xmax = int(xmax * width)
        ymin = int(ymin * height)
        ymax = int(ymax * height)
        # Draw bounding box
        rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                             fill=False, edgecolor='red', linewidth=2)
        plt.gca().add_patch(rect)
        # Add label (class IDs start at 1, list indices at 0)
        class_name = class_names[classes[i] - 1]
        plt.text(xmin, ymin - 10, f"{class_name}: {scores[i]:.2f}",
                 color='red', fontsize=12, backgroundcolor='white')
    plt.axis('off')
    plt.show()
    return boxes, scores, classes
# Run detection on a sample image
detect_objects("street_scene.jpg")
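The coordinate handling is the part of this function that is easiest to get wrong, because the detector returns boxes as normalized `[ymin, xmin, ymax, xmax]` rather than pixel `[x, y, w, h]`. The sketch below isolates that step with NumPy so it can be checked without loading the model. `boxes_to_pixels` is a hypothetical helper, and it rounds to the nearest pixel instead of truncating with `int()`.

```python
import numpy as np


def boxes_to_pixels(boxes, scores, image_height, image_width, threshold=0.5):
    """Keep boxes scoring above threshold and convert them to pixel coordinates.

    boxes: array-like of shape (N, 4) in normalized [ymin, xmin, ymax, xmax] order.
    Returns (pixel_boxes, kept_scores), with pixel_boxes in the same order.
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float64)
    keep = scores > threshold
    # y values scale with the image height, x values with the width.
    scale = np.array([image_height, image_width, image_height, image_width])
    pixel_boxes = np.round(boxes[keep] * scale).astype(int)
    return pixel_boxes, scores[keep]


# Two toy detections on an image 100 pixels high and 200 pixels wide.
toy_boxes = [[0.25, 0.125, 0.75, 0.5],   # confident detection
             [0.0, 0.0, 1.0, 1.0]]       # low-confidence detection
toy_scores = [0.9, 0.3]

pixel_boxes, kept_scores = boxes_to_pixels(toy_boxes, toy_scores, 100, 200)
print(pixel_boxes)    # one box left: ymin=25, xmin=25, ymax=75, xmax=100
print(kept_scores)
```

Separating this logic from the plotting code also makes it straightforward to reuse when drawing on video frames.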
Style Transfer with Pretrained Models
Another exciting application is neural style transfer, which uses pretrained models to apply the style of one image to the content of another:
import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
# Load the style transfer model from TensorFlow Hub
model = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
def load_image(img_path):
    img = tf.io.read_file(img_path)
    img = tf.image.decode_image(img, channels=3, dtype=tf.float32)
    # Add batch dimension and ensure the values are in [0, 1]
    img = tf.expand_dims(img, axis=0)
    return img
def show_images(images, titles=None):
    plt.figure(figsize=(15, 5))
    for i, img in enumerate(images):
        img = np.squeeze(img)
        plt.subplot(1, len(images), i+1)
        plt.imshow(img)
        if titles:
            plt.title(titles[i])
        plt.axis('off')
    plt.show()
# Load content and style images
content_image = load_image('content_image.jpg')
style_image = load_image('style_image.jpg')
# Generate the stylized image
stylized_image = model(content_image, style_image)[0]
# Display all images
show_images([content_image, style_image, stylized_image],
            ['Content Image', 'Style Image', 'Stylized Image'])
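The stylized result is a float tensor with a batch dimension and values nominally in [0, 1], which is convenient for plotting but not for saving to disk. As a minimal sketch, assuming the output has shape (1, H, W, 3), the hypothetical helper below converts such an array into a standard 8-bit image array with NumPy.

```python
import numpy as np


def to_uint8_image(batch):
    """Convert a (1, H, W, 3) float array in [0, 1] to an (H, W, 3) uint8 array."""
    img = np.asarray(batch, dtype=np.float32)[0]   # drop the batch dimension
    img = np.clip(img, 0.0, 1.0)                   # guard against slight overshoot
    return np.round(img * 255.0).astype(np.uint8)


# Toy stand-in for the model output, with one pixel deliberately out of range.
fake_output = np.zeros((1, 2, 2, 3), dtype=np.float32)
fake_output[0, 0, 0] = [-0.1, 0.5, 1.2]

pixels = to_uint8_image(fake_output)
print(pixels.shape, pixels.dtype)   # (2, 2, 3) uint8
print(pixels[0, 0])                 # [  0 128 255]
```

With the real output, something like `Image.fromarray(to_uint8_image(stylized_image.numpy())).save('stylized.png')` should write the file, where the file name is only an example.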
Summary
In this tutorial, we've explored how to use pretrained models in TensorFlow for various tasks:
- We started with simple image classification using models from Keras Applications and TensorFlow Hub.
- We implemented transfer learning by taking a pretrained model and adapting it to a new task.
- We fine-tuned a model for better performance.
- We explored real-world applications including feature extraction, object detection, and style transfer.
Pretrained models are powerful tools that can dramatically reduce development time and improve the performance of your deep learning applications. By leveraging the knowledge already encoded in these models, you can build sophisticated applications even with limited data and computational resources.
Additional Resources
- TensorFlow Hub - Browse and download pretrained models
- Keras Applications Documentation - Official documentation for pretrained models in Keras
- Transfer Learning Guide - TensorFlow's official guide on transfer learning
Exercises
- Try using different pretrained models (like ResNet50, InceptionV3, or EfficientNet) for the transfer learning example and compare their performance.
- Implement transfer learning on a dataset of your choice (e.g., flowers, food, or landmark images).
- Use a pretrained model as a feature extractor and build a classifier on top of the extracted features.
- Experiment with style transfer using different content and style images.
- Implement object detection on a video stream using a pretrained model from TensorFlow Hub.