PyTorch Intermediate (4): Bidirectional Recurrent Neural Network
Reference code
yunjey's PyTorch tutorial series
Learning resources for bidirectional RNNs
Original paper
Bidirectional recurrent neural networks
Original PDF
Other resources
Andrew Ng's Deeplearning.ai video lecture on bidirectional RNNs
RNN11. Bidirectional RNN
Blog post: bidirectional recurrent neural networks and a TensorFlow implementation
PyTorch implementation
We use a bidirectional recurrent neural network in a many-to-one configuration to solve the MNIST handwritten-digit classification problem.
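The idea behind the many-to-one setup: each 28×28 image is read row by row as a 28-step sequence of 28-dimensional vectors, and only the output at the final time step is used for classification. Before walking through the tutorial code, here is a small standalone sketch (my addition, not part of the original) showing what `bidirectional=True` does to the output shapes of `nn.LSTM`:

```python
# Sketch: with bidirectional=True, nn.LSTM returns outputs of size
# hidden_size * 2 at every time step (forward and backward concatenated).
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=28, hidden_size=128, num_layers=2,
               batch_first=True, bidirectional=True)
x = torch.randn(100, 28, 28)   # (batch, seq_len, input_size)
out, (hn, cn) = lstm(x)
print(out.shape)               # torch.Size([100, 28, 256]) = hidden_size * 2
print(hn.shape)                # torch.Size([4, 100, 128])  = num_layers * 2 directions
```

The doubled output dimension is why the classifier head below uses `hidden_size*2` input features.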
```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
```
```python
# Device configuration
torch.cuda.set_device(1)  # select GPU 1; remove or change this if you have fewer GPUs
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```
```python
# Hyper-parameters
sequence_length = 28   # each image is read as 28 rows
input_size = 28        # each row has 28 pixels
hidden_size = 128
num_layers = 2
num_classes = 10
batch_size = 100
num_epochs = 2
learning_rate = 0.003
```
MNIST dataset
```python
# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='../../../data/minist/',
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='../../../data/minist/',
                                          train=False,
                                          transform=transforms.ToTensor())

# Data loaders
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
```
Building the bidirectional recurrent neural network (many-to-one)
```python
class BiRNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(BiRNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(hidden_size*2, num_classes)  # *2 for the two directions

    def forward(self, x):
        # Initial hidden and cell states: num_layers*2 because of bidirection
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)

        # Forward propagate the LSTM
        # out: tensor of shape (batch_size, seq_length, hidden_size*2)
        out, _ = self.lstm(x, (h0, c0))

        # Decode the hidden state of the last time step
        out = self.fc(out[:, -1, :])
        return out
```
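One subtlety worth knowing: `out[:, -1, :]` concatenates the forward direction's state after it has seen the whole sequence with the backward direction's state after it has seen only the last row. This works fine in practice here, but a common alternative readout takes each direction's state after it has processed the entire sequence. A sketch of that variant (my addition, not what the tutorial uses):

```python
# Alternative many-to-one readout for a bidirectional LSTM (sketch only).
# With batch_first=True, out[:, t, :hidden_size] is the forward direction and
# out[:, t, hidden_size:] is the backward direction at time step t.
def bidirectional_last(out, hidden_size):
    forward_last = out[:, -1, :hidden_size]    # forward state after the full sequence
    backward_last = out[:, 0, hidden_size:]    # backward state after the full sequence
    return torch.cat([forward_last, backward_last], dim=1)  # (batch, hidden_size*2)
```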
```python
model = BiRNN(input_size, hidden_size, num_layers, num_classes).to(device)
```
```python
# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
```
Training the model
```python
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Reshape each image to a sequence of rows: (batch, 28, 28)
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, i+1, total_step, loss.item()))
```
```
Epoch [1/2], Step [100/600], Loss: 0.7892
Epoch [1/2], Step [200/600], Loss: 0.3596
Epoch [1/2], Step [300/600], Loss: 0.1456
Epoch [1/2], Step [400/600], Loss: 0.0966
Epoch [1/2], Step [500/600], Loss: 0.0878
Epoch [1/2], Step [600/600], Loss: 0.1667
Epoch [2/2], Step [100/600], Loss: 0.0199
Epoch [2/2], Step [200/600], Loss: 0.0555
Epoch [2/2], Step [300/600], Loss: 0.0203
Epoch [2/2], Step [400/600], Loss: 0.0550
Epoch [2/2], Step [500/600], Loss: 0.0468
Epoch [2/2], Step [600/600], Loss: 0.1018
```
Testing and saving the model
```python
# Test the model
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))
```
```
Test Accuracy of the model on the 10000 test images: 97.73 %
```
```python
# Save the model checkpoint
torch.save(model.state_dict(), 'model.ckpt')
```
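To reuse the saved weights later, rebuild the model with the same hyperparameters and load the state dict back in (a minimal sketch, assuming the `model.ckpt` file written above):

```python
# Reload the checkpoint for inference (file name assumed from above).
model = BiRNN(input_size, hidden_size, num_layers, num_classes).to(device)
model.load_state_dict(torch.load('model.ckpt'))
model.eval()  # switch layers such as dropout to evaluation mode
```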