TensorFlow Sequential Model
Introduction
The Sequential model is the simplest way to build neural networks in TensorFlow. It lets you stack layers one after another in a linear fashion, which makes it a good starting point if you are new to deep learning. In this tutorial, we'll explore how to create, train, and evaluate neural networks using TensorFlow's Sequential API.
What is a Sequential Model?
The Sequential model is a linear stack of layers where you add one layer at a time. Think of it as building blocks stacked on top of each other, where data flows from the first layer through each subsequent layer until it reaches the output.
This type of model is perfect for:
- Feed-forward neural networks
- Simple models where data flows straight through from input to output
- Beginners who are learning the fundamentals of deep learning
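To make the "linear stack" idea concrete, here is a toy sketch in plain Python. The functions `double`, `add_one`, and `run_sequential` are hypothetical stand-ins for illustration only. They are not Keras layers, but the data flow mirrors what a Sequential model does: each layer's output becomes the next layer's input.

```python
# Toy illustration only: these "layers" are plain functions, not Keras layers.
def double(values):
    return [2 * v for v in values]


def add_one(values):
    return [v + 1 for v in values]


def run_sequential(layers, values):
    """Pass `values` through each layer in order, like a linear stack."""
    for layer in layers:
        values = layer(values)
    return values


stack = [double, add_one]
result = run_sequential(stack, [1, 2, 3])
print(result)  # [3, 5, 7]
```

Changing the order of the stack changes the result, which is why layer order matters in a Sequential model.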
Setting Up Your Environment
Before we dive into creating Sequential models, make sure you have TensorFlow installed:
# Install TensorFlow if you haven't already
# !pip install tensorflow
# Import the necessary libraries
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
Creating Your First Sequential Model
Let's build a simple neural network for recognizing handwritten digits using the MNIST dataset:
# Create a Sequential model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),    # Input layer (flattens 28x28 images into 784-length vector)
    keras.layers.Dense(128, activation='relu'),    # Hidden layer with 128 neurons
    keras.layers.Dense(10, activation='softmax')   # Output layer with 10 neurons (one for each digit)
])
Here's what each layer does:
- Flatten: Transforms the 2D image (28x28 pixels) into a 1D vector (784 values)
- Dense(128, activation='relu'): A fully connected layer with 128 neurons using ReLU activation
- Dense(10, activation='softmax'): Output layer with 10 neurons (for digits 0-9) using softmax activation to produce a probability distribution
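As a rough illustration of what flattening and softmax do to the data, here is a plain-Python sketch. The helpers `flatten` and `softmax` are written for this example and are not the Keras implementations. The scores passed to `softmax` are made up.

```python
import math


def flatten(image):
    """Turn a 2D grid (list of rows) into one flat list, row by row."""
    return [pixel for row in image for pixel in row]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


image = [[0.0] * 28 for _ in range(28)]  # a blank 28x28 "image"
flat = flatten(image)
print(len(flat))  # 784

probs = softmax([2.0, 1.0, 0.1])  # made-up scores for three classes
print(probs)
```

The largest score receives the largest probability, and the outputs always sum to 1, which is why softmax is a common choice for the final layer of a classifier.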
Alternative Ways to Build a Sequential Model
You can also build a Sequential model by starting with an empty model and adding layers step by step:
# Create an empty Sequential model
model = keras.Sequential()
# Add layers one by one
model.add(keras.layers.Flatten(input_shape=(28, 28)))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
This approach gives you more flexibility as you can conditionally add layers based on your requirements.
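A minimal sketch of conditional layer building could look like the following. The helper `build_model` and its parameters `use_dropout` and `hidden_units` are hypothetical names chosen for this example. It also uses `keras.Input` to declare the input shape instead of `input_shape=`, which is an alternative that recent Keras releases recommend.

```python
from tensorflow import keras


def build_model(use_dropout=False, hidden_units=(128,)):
    """Hypothetical helper: assemble a Sequential model from a few options."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(28, 28)))
    model.add(keras.layers.Flatten())
    for units in hidden_units:
        model.add(keras.layers.Dense(units, activation="relu"))
        if use_dropout:
            # Only added when the caller asks for regularization
            model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Dense(10, activation="softmax"))
    return model


plain = build_model()
regularized = build_model(use_dropout=True, hidden_units=(128, 64))
print(plain.count_params(), regularized.count_params())
```

Wrapping the construction in a function like this makes it easy to compare architectures by changing arguments rather than rewriting the layer list.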
Compiling Your Model
Before training, you need to compile the model by specifying:
- An optimizer: How the model updates itself based on the data and loss
- A loss function: How the model measures the gap between its predictions and the true labels (the quantity that training tries to minimize)
- Metrics: What results you want to track during training
model.compile(
    optimizer='adam',                          # Adam optimizer
    loss='sparse_categorical_crossentropy',    # Classification loss for integer labels (0-9)
    metrics=['accuracy']                       # Track accuracy during training
)
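To give a feel for what the loss measures, here is a plain-Python sketch of sparse categorical cross-entropy for a single example. The function below is a simplified illustration written for this tutorial, not the Keras implementation, and the probabilities are made up.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one example: negative log of the probability given to the true class."""
    return -math.log(probs[label])


# Made-up softmax outputs for a 3-class problem; the true class is index 2.
confident_right = [0.05, 0.05, 0.90]
confident_wrong = [0.90, 0.05, 0.05]

print(round(sparse_categorical_crossentropy(2, confident_right), 4))  # 0.1054
print(round(sparse_categorical_crossentropy(2, confident_wrong), 4))  # 2.9957
```

The "sparse" part means the label is a plain integer (such as 2) rather than a one-hot vector. A confident correct prediction gives a small loss, while a confident wrong one is penalized heavily.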
Model Summary
To see what your model looks like, use the summary() method:
model.summary()
Output:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
flatten (Flatten)            (None, 784)               0
_________________________________________________________________
dense (Dense)                (None, 128)               100480
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
The summary tells you:
- The layers in your model
- The output shape of each layer
- The number of parameters (weights and biases) in each layer
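The parameter counts in the summary can be checked by hand. A fully connected layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this arithmetic.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


hidden = dense_params(28 * 28, 128)  # 784 inputs feeding 128 neurons
output = dense_params(128, 10)       # 128 inputs feeding 10 neurons

print(hidden, output, hidden + output)  # 100480 1290 101770
```

The Flatten layer only reshapes its input, so it contributes no parameters.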
Training Your Model
Let's load the MNIST dataset and train our model:
# Load and prepare the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Normalize pixel values to be between 0 and 1
x_train = x_train / 255.0
x_test = x_test / 255.0
# Train the model
history = model.fit(
    x_train, y_train,
    epochs=5,
    batch_size=64,
    validation_split=0.2
)
Output:
Epoch 1/5
750/750 [==============================] - 3s 3ms/step - loss: 0.2609 - accuracy: 0.9252 - val_loss: 0.1419 - val_accuracy: 0.9573
Epoch 2/5
750/750 [==============================] - 2s 3ms/step - loss: 0.1137 - accuracy: 0.9658 - val_loss: 0.1089 - val_accuracy: 0.9677
Epoch 3/5
750/750 [==============================] - 2s 3ms/step - loss: 0.0782 - accuracy: 0.9761 - val_loss: 0.0909 - val_accuracy: 0.9724
Epoch 4/5
750/750 [==============================] - 2s 3ms/step - loss: 0.0576 - accuracy: 0.9824 - val_loss: 0.0879 - val_accuracy: 0.9747
Epoch 5/5
750/750 [==============================] - 2s 3ms/step - loss: 0.0443 - accuracy: 0.9862 - val_loss: 0.0771 - val_accuracy: 0.9773
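The `750/750` in the training log is the number of batches per epoch. It follows from the dataset size, the validation split, and the batch size, as this small sketch shows. The variable names are chosen for this example, and the rounding is a simplification of how Keras computes the split.

```python
import math

total_samples = 60000     # size of the MNIST training set
validation_split = 0.2    # fraction held out for validation
batch_size = 64

validation_samples = round(total_samples * validation_split)
train_samples = total_samples - validation_samples
steps_per_epoch = math.ceil(train_samples / batch_size)

print(train_samples, steps_per_epoch)  # 48000 750
```

With `validation_split=0.2`, only 48,000 of the 60,000 images are used to update the weights; the remaining 12,000 produce the `val_loss` and `val_accuracy` columns.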
Evaluating Your Model
After training, evaluate the model on the test dataset:
# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc:.4f}')
Output:
313/313 [==============================] - 1s 2ms/step - loss: 0.0754 - accuracy: 0.9780
Test accuracy: 0.9780
Making Predictions
Let's make predictions using our trained model:
# Make predictions
predictions = model.predict(x_test)
# The predictions are probabilities for each class (digit 0-9)
# Let's see the predicted class for the first test image
print(f"Predicted digit: {np.argmax(predictions[0])}")
print(f"Actual digit: {y_test[0]}")
# Visualize the first test image
plt.figure(figsize=(4, 4))
plt.imshow(x_test[0], cmap='gray')
plt.title(f"Predicted: {np.argmax(predictions[0])}, Actual: {y_test[0]}")
plt.axis('off')
plt.show()
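To turn a whole batch of predictions into class labels at once, `np.argmax` can be applied along the class axis. The sketch below uses a small made-up probability array (three images, four classes) in place of real model output, so it runs without a trained model.

```python
import numpy as np

# Made-up probabilities for 3 images over 4 classes (real MNIST output has 10 columns).
predictions = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.80, 0.10, 0.05, 0.05],
    [0.20, 0.20, 0.50, 0.10],
])
true_labels = np.array([1, 0, 3])

predicted_labels = np.argmax(predictions, axis=1)  # most likely class per row
accuracy = np.mean(predicted_labels == true_labels)

print(predicted_labels)  # [1 0 2]
print(accuracy)
```

Here two of the three predicted labels match the true labels, so the accuracy is two thirds.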
Real-world Example: Sentiment Analysis
Now let's look at a more practical example: sentiment analysis of movie reviews using the IMDB dataset.
# Load the IMDB dataset
(train_data, train_labels), (test_data, test_labels) = keras.datasets.imdb.load_data(num_words=10000)
# Prepare the data
def vectorize_sequences(sequences, dimension=10000):
    # Create an all-zero matrix of shape (len(sequences), dimension)
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.  # set specific indices of results[i] to 1s
    return results
# Vectorize data
x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
# Convert labels to numpy arrays
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')
# Define the model
sentiment_model = keras.Sequential([
    keras.layers.Dense(16, activation='relu', input_shape=(10000,)),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])
# Compile the model
sentiment_model.compile(
    optimizer='rmsprop',
    loss='binary_crossentropy',
    metrics=['accuracy']
)
# Train the model
sentiment_history = sentiment_model.fit(
    x_train, y_train,
    epochs=4,
    batch_size=512,
    validation_split=0.2
)
# Evaluate the model
results = sentiment_model.evaluate(x_test, y_test)
print(f"Test accuracy: {results[1]:.4f}")
This model can classify movie reviews as positive or negative with over 85% accuracy on the test set, although the exact figure varies from run to run.
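The `vectorize_sequences` helper above performs what is often called multi-hot encoding: each review becomes a fixed-length vector with a 1 at every word index that appears in it. The sketch below runs the same logic on two tiny made-up "reviews" with a vocabulary of 5 words. Using `dtype="float32"` is an assumption added here to halve memory use compared with NumPy's default of float64.

```python
import numpy as np


def vectorize_sequences(sequences, dimension):
    """Multi-hot encode lists of word indices into a (n_samples, dimension) matrix."""
    results = np.zeros((len(sequences), dimension), dtype="float32")
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.0  # fancy indexing sets all listed columns at once
    return results


toy_reviews = [[1, 3], [0, 3, 3, 4]]  # made-up word indices
encoded = vectorize_sequences(toy_reviews, dimension=5)
print(encoded)
```

Note that the repeated index 3 in the second review still produces a single 1, so this encoding records which words occur but not how often or in what order.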
Advanced Model Features
As you become more comfortable with Sequential models, you can add more advanced features:
Adding Dropout for Regularization
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.2),  # Add dropout to prevent overfitting
    keras.layers.Dense(10, activation='softmax')
])
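To show what a dropout layer does during training, here is a plain-Python sketch of "inverted" dropout: each value is zeroed with probability `rate`, and the survivors are scaled up by 1 / (1 - rate) so the expected sum stays the same. The `dropout` function is an illustration written for this tutorial, not the Keras implementation.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [v * keep_scale if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.2, rng=rng)
print(dropped)
```

With `rate=0.2`, every output is either 0.0 or 1.25. In Keras, dropout is only active during training; at prediction time the layer passes values through unchanged.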
Adding Batch Normalization
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128),
    keras.layers.BatchNormalization(),  # Normalize the Dense outputs before the activation
    keras.layers.Activation('relu'),
    keras.layers.Dense(10, activation='softmax')
])
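The core arithmetic of batch normalization can be sketched in plain Python for a single feature. The function below is a simplified illustration of the training-time computation, not the Keras implementation: it ignores the moving averages that Keras keeps for inference. `gamma` and `beta` stand for the learnable scale and shift, and `epsilon=1e-3` matches the Keras default.

```python
import math


def batch_normalize(values, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Normalize one feature across a batch, then apply scale (gamma) and shift (beta)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + epsilon) + beta for v in values]


batch = [2.0, 4.0, 6.0, 8.0]  # one feature's value for four samples (made up)
normalized = batch_normalize(batch)
print([round(v, 3) for v in normalized])  # [-1.342, -0.447, 0.447, 1.342]
```

After normalization the values have a mean of zero and a variance close to one, which tends to keep the inputs of the following layer in a stable range.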
When to Use the Sequential Model
The Sequential model is great for:
- Beginners learning deep learning concepts
- Simple models where layers are stacked linearly
- Quick prototyping and experimentation
However, it has limitations when you need:
- Models with multiple inputs or outputs
- Models with shared layers
- Models with complex architectures (like residual connections)
In those cases, you'll want to look into the Functional API or Model Subclassing in TensorFlow.
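For comparison, here is a minimal sketch of a model with a residual (skip) connection written with the Functional API. This architecture cannot be expressed as a Sequential model because the `Add` layer takes two inputs. The layer sizes and the name `residual_mlp` are arbitrary choices for this example.

```python
from tensorflow import keras

inputs = keras.Input(shape=(28, 28))
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(128, activation="relu")(x)

shortcut = x                                        # keep a reference for the skip path
x = keras.layers.Dense(128, activation="relu")(x)
x = keras.layers.Add()([x, shortcut])               # merge the two paths

outputs = keras.layers.Dense(10, activation="softmax")(x)
residual_model = keras.Model(inputs=inputs, outputs=outputs, name="residual_mlp")

print(residual_model.output_shape)
```

Once built, a Functional model is compiled, trained, and evaluated with the same `compile()`, `fit()`, and `evaluate()` calls used for Sequential models.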
Summary
In this tutorial, you've learned:
- What the Sequential model is and when to use it
- How to create and compile a Sequential model
- How to train and evaluate your model
- How to use your model for making predictions
- A practical example using sentiment analysis
- Advanced techniques like dropout and batch normalization
The Sequential API is a powerful starting point for your deep learning journey. As you become more comfortable with these concepts, you can explore more complex model architectures and techniques.
Additional Resources
- TensorFlow Documentation on Sequential Model
- Keras Sequential Model API
- Deep Learning with Python by François Chollet
Exercise
- Create a Sequential model for the Fashion MNIST dataset (a more challenging version of MNIST with clothing items instead of digits).
- Experiment with different architectures by adding more layers or changing the number of neurons.
- Try different activation functions like tanh or leaky_relu and observe how they affect performance.
- Implement early stopping to prevent overfitting.
- Visualize the learning curves of your model to identify overfitting or underfitting.
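As a hint for the early stopping exercise, the underlying rule can be sketched in plain Python: stop once the validation loss has failed to improve for `patience` consecutive epochs. The function `early_stopping_epoch` is a hypothetical helper for illustration, and the loss values are made up. In Keras, the `keras.callbacks.EarlyStopping` callback (with its `monitor` and `patience` arguments) applies this kind of rule during `fit()`.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None if it never triggers."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


made_up_losses = [0.30, 0.20, 0.18, 0.19, 0.21, 0.25]
print(early_stopping_epoch(made_up_losses, patience=2))  # 5
```

In this made-up run the best validation loss occurs at epoch 3, and training stops at epoch 5 after two epochs without improvement. A validation loss that keeps falling, as in the MNIST log earlier in this tutorial, never triggers the rule.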
With the Sequential API in your toolkit, you're well on your way to building effective neural networks!