TensorFlow SavedModel

Introduction

The SavedModel format is TensorFlow's recommended way to save and export models. It's a language-neutral, recoverable serialization format that enables higher-level systems and tools to produce, consume, and transform TensorFlow models. Unlike weights-only checkpoints, SavedModel captures the complete TensorFlow program, including variables, operations, and the signatures that specify inputs and outputs for inference, which makes it ideal for deployment.

In this tutorial, we'll explore:

  • What SavedModel is and why it's important
  • How to save models in the SavedModel format
  • How to load and use saved models
  • Best practices for working with SavedModel
  • Real-world deployment scenarios

What is SavedModel?

SavedModel is a complete serialization format that saves the TensorFlow program, including weights and computation. It contains a TensorFlow program (a MetaGraphDef) with variables, assets, and signatures that specify the inputs and outputs of the graph. Because the format is not tied to Keras, even a plain tf.Module can be exported (a minimal example follows the list below).

Key Benefits of SavedModel:

  1. Complete Model Preservation: Saves both the model architecture and trained weights
  2. Cross-Platform: Can be used across different platforms and languages
  3. Deployment-Ready: Compatible with TensorFlow Serving, TensorFlow Lite, and TensorFlow.js
  4. Versioning Support: Maintains multiple versions of the same model
  5. Signatures: Clearly defines inputs and outputs for inference
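
As noted above, any trackable TensorFlow program can be exported, not just a Keras model. A minimal sketch (the module name and export path here are purely illustrative):

python
import tensorflow as tf

# A bare tf.Module: one variable plus a traced computation
class Scaler(tf.Module):
    def __init__(self):
        super().__init__()
        self.scale = tf.Variable(2.0)

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def __call__(self, x):
        return x * self.scale

module = Scaler()
tf.saved_model.save(module, "./saved_models/scaler/1")

# Reload and call the traced function
reloaded = tf.saved_model.load("./saved_models/scaler/1")
print(reloaded(tf.constant([1.0, 2.0])))  # tf.Tensor([2. 4.], shape=(2,), dtype=float32)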

Saving a Model in SavedModel Format

Let's start with a simple example of training and saving a model using the SavedModel format:

python
import tensorflow as tf
import numpy as np

# Create a simple model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)
])

# Compile the model
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Generate some sample data
x_train = np.random.random((100, 10))
y_train = np.random.random((100, 1))

# Train the model
model.fit(x_train, y_train, epochs=3)

# Save the model in SavedModel format
export_path = "./saved_models/simple_model/1"
tf.keras.models.save_model(
    model,
    export_path,
    overwrite=True,
    include_optimizer=True,
    save_format="tf",
    signatures=None,
    options=None
)

print(f"Model saved to {export_path}")

Output:

Epoch 1/3
4/4 [==============================] - 1s 2ms/step - loss: 0.2487 - mae: 0.4019
Epoch 2/3
4/4 [==============================] - 0s 2ms/step - loss: 0.2388 - mae: 0.3935
Epoch 3/3
4/4 [==============================] - 0s 2ms/step - loss: 0.2304 - mae: 0.3885
Model saved to ./saved_models/simple_model/1

Understanding the SavedModel Directory Structure

After saving your model, let's explore the SavedModel directory structure:

python
import os
saved_model_dir = "./saved_models/simple_model/1"
print("SavedModel Directory Contents:")
for item in os.listdir(saved_model_dir):
    print(f"- {item}")

Output:

SavedModel Directory Contents:
- saved_model.pb
- variables
- assets

The SavedModel directory contains:

  • saved_model.pb: The serialized TensorFlow program (MetaGraphDef)
  • variables: Directory containing variable values (sharded checkpoints)
  • assets: Optional directory for additional files like vocabulary files
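
You can also verify an export programmatically. The sketch below uses tf.saved_model.contains_saved_model to confirm the directory holds a valid SavedModel and lists the sharded checkpoint files under variables/; TensorFlow also ships a saved_model_cli command-line tool for deeper inspection:

python
import os
import tensorflow as tf

saved_model_dir = "./saved_models/simple_model/1"

# True if the directory contains a parseable saved_model.pb
print(tf.saved_model.contains_saved_model(saved_model_dir))

# The variables directory holds an index file plus sharded data files
print(os.listdir(os.path.join(saved_model_dir, "variables")))
# e.g. ['variables.index', 'variables.data-00000-of-00001']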

Loading a SavedModel

Now that we've saved our model, let's load it and use it for inference:

python
# Load the SavedModel
loaded_model = tf.keras.models.load_model(export_path)

# Check model summary
print("Loaded Model Summary:")
loaded_model.summary()

# Generate test data and perform inference
test_input = np.random.random((5, 10))
predictions = loaded_model.predict(test_input)
print("Prediction shape:", predictions.shape)
print("Sample predictions:", predictions[:2])

Output:

Loaded Model Summary:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 128)               1408
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65
=================================================================
Total params: 9,729
Trainable params: 9,729
Non-trainable params: 0
_________________________________________________________________
Prediction shape: (5, 1)
Sample predictions: [[0.54189306]
[0.48721954]]

Saving Models with Signatures

Signatures define the inputs and outputs of your model for specific purposes like serving or inference. They're critical for deployment because they formalize the contract of your model.

Let's save a model with custom signatures:

python
# Create a new model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer='adam', loss='mse')
model.fit(x_train, y_train, epochs=1, verbose=0)

# Define a function that will be traced to create a signature
@tf.function(input_signature=[tf.TensorSpec(shape=(None, 10), dtype=tf.float32)])
def serving_fn(input_data):
    return {'predictions': model(input_data)}

# Save the model with the signature
export_path_with_sig = "./saved_models/model_with_signature/1"
tf.saved_model.save(
    model,
    export_path_with_sig,
    signatures={'serving_default': serving_fn}
)

print(f"Model with signature saved to {export_path_with_sig}")

Output:

Model with signature saved to ./saved_models/model_with_signature/1

Inspecting Signatures

After saving a model with signatures, you can inspect them:

python
# Load the saved model
loaded_with_sig = tf.saved_model.load(export_path_with_sig)

# Print the available signatures
print("Available signatures:", list(loaded_with_sig.signatures.keys()))

# Get the serving signature
serving_signature = loaded_with_sig.signatures["serving_default"]
print("Serving signature input:", serving_signature.structured_input_signature)
print("Serving signature outputs:", serving_signature.structured_outputs)

Output:

Available signatures: ['serving_default']
Serving signature input: ((TensorSpec(shape=(None, 10), dtype=tf.float32, name='input_data'),), {})
Serving signature outputs: {'predictions': TensorSpec(shape=(None, 1), dtype=tf.float32, name='predictions')}

Using SavedModel for Inference

Now let's demonstrate how to use a loaded SavedModel for inference. Keep in mind that loaded_model and loaded_with_sig hold two separately trained models, so their predictions will differ:

python
# Create test data
test_data = tf.constant(np.random.random((3, 10)).astype(np.float32))

# Method 1: call the loaded Keras model directly (the simple model from earlier)
predictions1 = loaded_model(test_data)

# Method 2: call the serving signature of the model saved with a custom signature
serving_fn = loaded_with_sig.signatures["serving_default"]
predictions2 = serving_fn(test_data)["predictions"]

print("Predictions using direct model call:", predictions1[:2].numpy())
print("Predictions using signature:", predictions2[:2].numpy())

Output:

Predictions using direct model call: [[0.52012664]
[0.48124105]]
Predictions using signature: [[0.31382862]
[0.2874561 ]]

Converting SavedModel to Different Formats

One of the strengths of SavedModel is its ability to be converted to other formats for deployment across different platforms:

Converting to TensorFlow Lite

python
# Convert SavedModel to TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_saved_model(export_path)
tflite_model = converter.convert()

# Save the TFLite model
with open('model.tflite', 'wb') as f:
f.write(tflite_model)

print("Model converted to TensorFlow Lite")

Converting to TensorFlow.js

bash
# Requires the tensorflowjs package
pip install tensorflowjs

# Convert the SavedModel to a TensorFlow.js web model
tensorflowjs_converter --input_format=tf_saved_model \
    ./saved_models/simple_model/1 \
    ./web_model

Real-World Application: Deploying a Sentiment Analysis Model

Let's create a practical example by training and deploying a sentiment analysis model:

python
import tensorflow as tf
import numpy as np

# Load IMDB dataset
(train_data, train_labels), (test_data, test_labels) = tf.keras.datasets.imdb.load_data(num_words=10000)

# Function to convert sequences to text
def sequences_to_texts(sequences, index):
    # Invert the (already shifted) word index: integer id -> word
    index_reverse = {v: k for k, v in index.items()}
    texts = []
    for seq in sequences:
        texts.append(' '.join([index_reverse.get(i, '?') for i in seq]))
    return texts

# Get the word index mapping
word_index = tf.keras.datasets.imdb.get_word_index()
word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2

# Convert to text (for demonstration)
train_texts = sequences_to_texts(train_data[:3], word_index)
print("Sample text:", train_texts[0][:50], "...")

# Preprocess the data
train_data = tf.keras.preprocessing.sequence.pad_sequences(
    train_data, maxlen=256, padding='post'
)
test_data = tf.keras.preprocessing.sequence.pad_sequences(
    test_data, maxlen=256, padding='post'
)

# Build the model with a trainable embedding layer
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(256,), dtype=tf.int32),
    tf.keras.layers.Embedding(10000, 16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Build a StringLookup layer whose ids line up with the IMDB encoding used
# during training (word rank + 3, with ids 0-3 reserved), capped at the
# model's vocabulary size of 10,000
inv_index = {v: k for k, v in word_index.items()}
lookup = tf.keras.layers.StringLookup(
    vocabulary=[inv_index[i] for i in range(4, 10000)],
    mask_token=None,
    num_oov_indices=4
)

# Define a serving function that preprocesses raw strings with graph ops,
# so the whole pipeline is captured inside the SavedModel
@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
def preprocess_and_predict(texts):
    # Lowercase, strip punctuation, and tokenize on whitespace
    cleaned = tf.strings.regex_replace(tf.strings.lower(texts), r"[^a-z' ]", " ")
    tokens = tf.strings.split(cleaned)

    # Map words to integer ids
    vectorized = lookup(tokens)

    # Pad (or truncate) every sequence to length 256
    padded = vectorized.to_tensor(default_value=0, shape=[None, 256])
    padded = tf.cast(padded, tf.int32)

    # Get predictions
    predictions = model(padded)
    return {
        "score": predictions,
        "sentiment": tf.where(predictions > 0.5, "positive", "negative")
    }

# Compile and train the model (with limited epochs for demonstration)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(
    train_data, train_labels,
    epochs=3,
    batch_size=512,
    validation_split=0.2,
    verbose=1
)

# Attach the lookup layer to the model so its vocabulary table is tracked,
# then save the model with the serving signature
model.lookup = lookup
export_path_sentiment = "./saved_models/sentiment_model/1"
tf.saved_model.save(
    model,
    export_path_sentiment,
    signatures={'serving_default': preprocess_and_predict}
)

print(f"Sentiment model saved to {export_path_sentiment}")

Output:

Sample text: <START> this film was just brilliant casting ...
Epoch 1/3
40/40 [==============================] - 3s 61ms/step - loss: 0.6490 - accuracy: 0.6205 - val_loss: 0.5493 - val_accuracy: 0.7286
Epoch 2/3
40/40 [==============================] - 2s 53ms/step - loss: 0.4415 - accuracy: 0.8051 - val_loss: 0.4221 - val_accuracy: 0.8172
Epoch 3/3
40/40 [==============================] - 2s 52ms/step - loss: 0.3285 - accuracy: 0.8642 - val_loss: 0.3900 - val_accuracy: 0.8320
Sentiment model saved to ./saved_models/sentiment_model/1

Using the Deployed Sentiment Model

python
# Load the model
loaded_sentiment = tf.saved_model.load(export_path_sentiment)
predict_fn = loaded_sentiment.signatures["serving_default"]

# Test with some sample reviews
sample_reviews = tf.constant([
    "This movie was excellent! The acting was incredible.",
    "I didn't like the plot. The characters were uninteresting."
])

results = predict_fn(sample_reviews)
for i, review in enumerate(sample_reviews.numpy()):
    print(f"Review: {review.decode('utf-8')}")
    print(f"Sentiment: {results['sentiment'][i, 0].numpy().decode('utf-8')}")
    print(f"Score: {results['score'][i, 0].numpy():.4f}")
    print("---")

Output:

Review: This movie was excellent! The acting was incredible.
Sentiment: positive
Score: 0.7852
---
Review: I didn't like the plot. The characters were uninteresting.
Sentiment: negative
Score: 0.3241
---

Best Practices for SavedModel

  1. Version Your Models: Use numbered subdirectories (1/, 2/, etc.) in your export path to maintain versions.

    python
    export_path = f"./saved_models/my_model/{model_version}"
  2. Include Metadata: Store model information alongside the export so you can track it later. SavedModel has no built-in metadata field, but TF Serving ignores an assets.extra/ subdirectory, which makes it a conventional place for such files.

    python
    # Write a metadata file into the conventional assets.extra/ directory
    import json, os
    extra_dir = os.path.join(export_path, 'assets.extra')
    os.makedirs(extra_dir, exist_ok=True)
    with open(os.path.join(extra_dir, 'metadata.json'), 'w') as f:
        json.dump({
            'name': 'my_classification_model',
            'version': '1.0.0',
            'accuracy': 0.95,
            'date_trained': '2023-04-15'
        }, f)
  3. Use Appropriate Signatures: Define custom signatures for different use cases.
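
    As a sketch, tf.saved_model.save accepts a dictionary mapping several signature names to traced functions; the second signature name below is purely illustrative:

    python
    @tf.function(input_signature=[tf.TensorSpec(shape=(None, 10), dtype=tf.float32)])
    def predict_scores(x):
        return {'predictions': model(x)}

    @tf.function(input_signature=[tf.TensorSpec(shape=(None, 10), dtype=tf.float32)])
    def predict_labels(x):
        # Illustrative second entry point: threshold the score into a label
        return {'labels': tf.cast(model(x) > 0.5, tf.int32)}

    tf.saved_model.save(model, export_path, signatures={
        'serving_default': predict_scores,
        'classify': predict_labels
    })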

  4. Test Your Saved Models: Always load and verify your models after saving.
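
    A minimal round-trip check along these lines (reusing the simple model and its reloaded copy from the start of the tutorial) confirms that saving preserved the weights:

    python
    import numpy as np

    # The reloaded model should reproduce the original's predictions
    np.testing.assert_allclose(
        model.predict(x_train[:5]),
        loaded_model.predict(x_train[:5]),
        rtol=1e-5
    )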

  5. Optimize for Deployment: Consider optimizing your model for inference.

    python
    # Example: post-training float16 quantization with TensorFlow Lite
    converter = tf.lite.TFLiteConverter.from_saved_model(export_path)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    optimized_model = converter.convert()

Common Issues and Solutions

  1. Signature Mismatch:

    • Problem: Input shape or dtype doesn't match the signature
    • Solution: Ensure inputs match the signature exactly
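    For example, if a signature expects float32 inputs, cast the batch before calling it (a small sketch reusing the signature saved earlier):
    python
    batch = tf.constant(np.random.random((3, 10)))  # float64 by default
    batch = tf.cast(batch, tf.float32)              # match the TensorSpec dtype
    outputs = loaded_with_sig.signatures["serving_default"](batch)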
  2. Serialization Errors:

    • Problem: Custom objects can't be serialized
    • Solution: Register custom objects with the Keras API
    python
    @tf.keras.utils.register_keras_serializable()
    class CustomLayer(tf.keras.layers.Layer):
        # Layer implementation
        pass
  3. Model Size Issues:

    • Problem: SavedModel file is too large
    • Solution: Quantize the model or prune unnecessary operations

Summary

TensorFlow SavedModel is a comprehensive format for saving and deploying models that:

  • Preserves both model architecture and weights
  • Enables cross-platform deployment
  • Supports versioning and signatures
  • Provides a consistent API for serving

In this tutorial, you've learned:

  • How to save models in the SavedModel format
  • How to load and use saved models
  • How to define and use signatures for model deployment
  • How to convert SavedModel to other formats for different deployment scenarios
  • Best practices for working with SavedModel

With these skills, you're now ready to deploy your TensorFlow models in production environments.

Exercises

  1. Train a simple image classification model on the CIFAR-10 dataset and save it using the SavedModel format.

  2. Create a SavedModel with multiple signatures for different types of inputs (e.g., both images and features).

  3. Optimize a SavedModel for deployment on a mobile device using TensorFlow Lite.

  4. Build a simple web application that loads a TensorFlow.js model converted from a SavedModel.

  5. Create a model versioning system that saves different versions of the same model as it improves.


