Building a Simple RNN with PyTorch: A Step-by-Step Guide

Are you ready to dive into the world of Recurrent Neural Networks (RNNs) and explore their potential for sequential data analysis? Look no further! In this blog post, we will walk through an example of building a simple RNN using PyTorch, a popular deep learning framework.

Let’s start by importing the necessary libraries:

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms

We will be using the torch library for creating and training the neural network, as well as the torchvision library for loading the MNIST dataset.

Creating the RNN Architecture

In this example, let’s define our RNN model using the nn.RNN module. Note that we pass seq_len into the constructor, because the input size of the final fully connected layer depends on the number of time steps:

# Create the RNN model
class RNN(nn.Module):

    def __init__(self, input_size, hidden_size, num_layers, seq_len, num_classes=10):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # batch_first=True means inputs have shape (batch, seq_len, input_size)
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        # The hidden states of all time steps are concatenated and fed to the classifier
        self.fc = nn.Linear(hidden_size * seq_len, num_classes)

    def forward(self, x):
        # Initial hidden state: (num_layers, batch, hidden_size)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)             # out: (batch, seq_len, hidden_size)
        out = out.reshape(out.shape[0], -1)  # flatten to (batch, seq_len * hidden_size)
        out = self.fc(out)
        return out
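
Before wiring this into training, it can help to sanity-check the output shape. Here is a minimal sketch, using the same hyperparameter values we define in the next section:

# Quick shape check with a fake batch of 4 "images", each 28 rows of 28 pixels
model = RNN(input_size=28, hidden_size=256, num_layers=2, seq_len=28)
dummy = torch.randn(4, 28, 28)
print(model(dummy).shape)  # torch.Size([4, 10]) - one score per class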

Setting Up the Environment and Hyperparameters

Before we proceed, we need to set up some environment configurations. We will define the device on which the network will be trained, whether it is a GPU or CPU. We will also set some hyperparameters for training the network.

# Set device
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Hyperparameters
input_size = 28
seq_len = 28
num_layers = 2
hidden_size = 256
num_classes = 10
learning_rate = 0.001
batch_size = 64
num_epochs = 2

Here, we check for the availability of a GPU and assign the appropriate device. We also specify the hyperparameters. Since an RNN expects sequential input, we treat each 28×28 MNIST image as a sequence of 28 rows (seq_len = 28), each containing 28 pixel values (input_size = 28). The remaining hyperparameters set the number of stacked RNN layers, the hidden state size, the number of output classes, the learning rate, the batch size, and the number of epochs.

Loading the Dataset

Next, we will load the MNIST dataset. PyTorch provides a convenient API to download and load common datasets like MNIST.

# Load the dataset
train_dataset = datasets.MNIST(root='datasets/', train=True, transform=transforms.ToTensor(), download=True)
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)

test_dataset = datasets.MNIST(root='datasets/', train=False, transform=transforms.ToTensor(), download=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

We create two Dataset objects for training and testing, specifying the root directory for storing the dataset, whether it is the training or test split, and the transformation to apply to the data (converting images to tensors). We also create two DataLoader objects, which let us iterate over the dataset in batches. The batch_size parameter determines how many samples go into each batch, and shuffle=True reshuffles the training set before each epoch to ensure randomness in the training process; the test set does not need shuffling.
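
As a quick check, you can pull one batch from the loader to see the shapes we will be working with:

# Inspect a single batch
images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 28, 28]) - batch, channel, height, width
print(labels.shape)  # torch.Size([64])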

Initializing the Network, Loss Function, and Optimizer

Now, we will initialize our neural network model, define the loss function, and choose an optimizer for training the network.

# Initialize the network
model = RNN(input_size, hidden_size, num_layers, seq_len, num_classes).to(device)
# Loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

We create an instance of our RNN class and move it to the chosen device using the .to(device) method, so the computations run on a GPU when one is available. We use nn.CrossEntropyLoss() as our loss function since we are dealing with a classification task; it expects raw, unnormalized scores (logits), which is why the model applies no final softmax. The optim.Adam() function initializes the Adam optimizer, which will update the weights of our model during training.
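
To make the logits point concrete: nn.CrossEntropyLoss combines log-softmax and negative log-likelihood internally. A quick illustration, where the tensors are made-up stand-ins for real model outputs:

# CrossEntropyLoss takes raw logits and integer class labels
scores = torch.randn(4, 10)                    # fake logits: 4 samples, 10 classes
targets = torch.tensor([1, 0, 4, 9])           # ground-truth class indices
print(nn.CrossEntropyLoss()(scores, targets))  # a single scalar loss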

Training the Network

Now, let’s train our RNN by iterating over the dataset for the specified number of epochs, performing forward and backward passes, and updating the model’s weights.

# Train the network
for epoch in range(num_epochs):
    for batch_idx, (data, targets) in enumerate(train_loader):
        data = data.squeeze(1).to(device)  # (batch, 1, 28, 28) -> (batch, 28, 28)
        targets = targets.to(device)

        # Forward pass
        scores = model(data)
        loss = loss_fn(scores, targets)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Within each epoch, we iterate over the batches in the training data, moving the data and targets to the device and squeezing out the channel dimension so each image becomes a sequence of 28 rows. We compute the forward pass by passing the data through our RNN model and calculate the loss. We then clear the previously accumulated gradients with optimizer.zero_grad(), perform the backward pass to compute the gradients of the loss with respect to the model’s parameters, and finally update the parameters using the optimizer’s step() method.
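
The loop above trains silently. If you want to watch the loss fall, one simple optional variant accumulates loss.item() per batch and prints an average each epoch:

# Optional variant of the loop above that reports the average loss per epoch
for epoch in range(num_epochs):
    total_loss = 0.0
    for data, targets in train_loader:
        data = data.squeeze(1).to(device)
        targets = targets.to(device)
        loss = loss_fn(model(data), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch + 1}/{num_epochs} - average loss: {total_loss / len(train_loader):.4f}")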

Evaluating the Model

After training the network, we want to evaluate its performance on both the training and testing datasets.

# Check accuracy on both train and test sets

def check_accuracy(loader, model):
    if loader.dataset.train:
        print("Checking accuracy on train dataset")
    else:
        print("Checking accuracy on test dataset")

    num_correct = 0
    num_samples = 0
    model.eval()

    with torch.no_grad():
        for x, y in loader:
            x = x.to(device).squeeze(1)  # drop the channel dimension, as in training
            y = y.to(device)

            scores = model(x)
            _, pred = scores.max(1)
            num_correct += (pred == y).sum().item()
            num_samples += pred.size(0)
        
        accuracy = 100.0 * num_correct / num_samples
        print(f"Accuracy: {accuracy:.2f}%")

    model.train()

check_accuracy(train_loader, model)
check_accuracy(test_loader, model)

The check_accuracy function takes a data loader and a model as input and calculates the accuracy of the model’s predictions on the given dataset. We iterate over the dataset, move the data to the device, squeeze out the channel dimension, obtain the predicted scores from the model, count the correct predictions, and compute the overall accuracy. model.eval() switches layers such as dropout and batch normalization to evaluation behavior, while the torch.no_grad() context disables gradient tracking to save memory and speed up inference; model.train() restores training mode afterwards.

Conclusion

Congratulations! You have successfully built a simple RNN using PyTorch. In this blog post, we covered the step-by-step process of creating the RNN architecture, training the model, and evaluating its performance on the MNIST dataset. RNNs are powerful models for sequential data analysis, with applications in natural language processing, speech recognition, and time series forecasting.

Feel free to explore further by experimenting with different hyperparameters, adding more layers to the network, or applying the RNN model to other datasets. PyTorch provides a flexible and intuitive interface for deep learning, allowing you to unleash your creativity and build even more sophisticated models.
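
For example, a vanilla nn.RNN can struggle with long-range dependencies; nn.GRU is a drop-in replacement in the class above (nn.LSTM works too, though it also returns a cell state). A minimal sketch:

# nn.GRU shares nn.RNN's constructor and (output, hidden) interface
gru = nn.GRU(input_size=28, hidden_size=256, num_layers=2, batch_first=True)
x = torch.randn(4, 28, 28)   # (batch, seq_len, input_size)
h0 = torch.zeros(2, 4, 256)  # (num_layers, batch, hidden_size)
out, hn = gru(x, h0)
print(out.shape)             # torch.Size([4, 28, 256])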

Stay tuned for more exciting tutorials and examples on deep learning with PyTorch!

Happy coding!

References:
https://pytorch.org/
https://pytorch.org/tutorials/
https://www.youtube.com/watch?v=Jy4wM2X21u0&list=PLhhyoLH6IjfxeoooqP9rhU3HJIAVAJ3Vz&index=4&ab_channel=AladdinPersson
