PyTorch Intermediate Series (3): Recurrent Neural Networks (RNN)
Reference code
yunjey's pytorch-tutorial series
RNN learning resources
I had always assumed that recurrent neural networks were for speech processing and had little to do with my own field, computer vision, so I kept avoiding RNNs. But this tutorial includes an RNN that classifies the MNIST dataset, so it is a good excuse to finally learn them. Besides, RNNs combined with CNNs can be used for image captioning, which ties directly back to computer vision.
Intro video (no theory):
什么是循环神经网络 RNN (深度学习)? What is Recurrent Neural Networks (deep learning)?
Related articles:
(新手向)能否简单易懂的介绍一下RNN(循环神经网络)?
一文搞懂RNN(循环神经网络)基础篇
【译】理解 LSTM 网络
Of course, these resources cover only the basic form of the RNN; its real appeal is that the many variants (one-to-many, many-to-one, many-to-many, and so on) can be used to solve problems of very different shapes, as the toy sketch below illustrates.
[Image source] 循环神经网络RNN打开手册
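As a toy illustration (my own sketch, not part of the tutorial), the same recurrent output tensor supports both a many-to-one and a many-to-many readout; which time steps you keep determines the problem form:

```python
import torch
import torch.nn as nn

# A single vanilla RNN layer with batch_first tensors: (batch, seq_len, features).
rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(2, 5, 4)          # batch of 2 sequences, 5 steps, 4 features each

out, h_n = rnn(x)                 # out: (2, 5, 8); h_n: (1, 2, 8)

many_to_one = out[:, -1, :]       # keep only the last step -> (2, 8), e.g. classification
many_to_many = out                # keep every step -> (2, 5, 8), e.g. sequence tagging
```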
PyTorch implementation
Below, a many-to-one LSTM solves the handwritten-digit classification problem on the MNIST dataset: each 28x28 image is read as a sequence of 28 rows, and only the last hidden state is used for classification.
```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
```
```python
# Select GPU 1 (assumes a machine with at least two GPUs); fall back to CPU if CUDA is unavailable
torch.cuda.set_device(1)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```
```python
# Hyperparameters: each 28x28 image is treated as 28 time steps of 28 features
sequence_length = 28
input_size = 28
hidden_size = 128
num_layers = 2
num_classes = 10
batch_size = 100
num_epochs = 2
learning_rate = 0.01
```
MNIST dataset
```python
train_dataset = torchvision.datasets.MNIST(root='../../../data/minist/',
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='../../../data/minist/',
                                          train=False,
                                          transform=transforms.ToTensor())

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
```
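As a quick look at what the loaders yield (an illustrative check of my own, not part of the tutorial):

```python
# Each batch comes out as (batch, channels, height, width); it is reshaped to
# (batch, sequence_length, input_size) before entering the RNN.
images, labels = next(iter(train_loader))
print(images.shape)   # torch.Size([100, 1, 28, 28])
print(labels.shape)   # torch.Size([100])
```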
Building the recurrent network
```python
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # Initial hidden and cell states: (num_layers, batch, hidden_size)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)

        # out: (batch, seq_len, hidden_size), the hidden state at every time step
        out, _ = self.lstm(x, (h0, c0))

        # Many-to-one: decode only the hidden state of the last time step
        out = self.fc(out[:, -1, :])
        return out
```
```python
model = RNN(input_size, hidden_size, num_layers, num_classes).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
```
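Before training, a quick sanity check (my own addition, not from the tutorial) confirms the output shape: a dummy batch of 4 "images" should produce 4 rows of 10 class scores.

```python
# Hypothetical smoke test: forward a random batch through the untrained model.
dummy = torch.randn(4, sequence_length, input_size).to(device)
print(model(dummy).shape)   # torch.Size([4, 10])
```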
Training the model
```python
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Reshape each image from (1, 28, 28) to a sequence: 28 steps of 28 features
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, i+1, total_step, loss.item()))
```
Epoch [1/2], Step [100/600], Loss: 0.4569
Epoch [1/2], Step [200/600], Loss: 0.2823
Epoch [1/2], Step [300/600], Loss: 0.3512
Epoch [1/2], Step [400/600], Loss: 0.1702
Epoch [1/2], Step [500/600], Loss: 0.3181
Epoch [1/2], Step [600/600], Loss: 0.1821
Epoch [2/2], Step [100/600], Loss: 0.1540
Epoch [2/2], Step [200/600], Loss: 0.0848
Epoch [2/2], Step [300/600], Loss: 0.1985
Epoch [2/2], Step [400/600], Loss: 0.1537
Epoch [2/2], Step [500/600], Loss: 0.0988
Epoch [2/2], Step [600/600], Loss: 0.0315
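The loss above decreases smoothly, but if training were unstable, a common remedy for recurrent nets is gradient clipping between backward() and step(). A sketch of how it would slot into the loop (my own addition; the max_norm value is an arbitrary choice):

```python
optimizer.zero_grad()
loss.backward()
# Clip the global gradient norm to curb exploding gradients, a classic RNN failure mode.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```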
Testing and saving the model
```python
# Test the model: no gradients are needed for evaluation, which saves memory
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        # The predicted class is the index of the largest output score
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))
```
Test Accuracy of the model on the 10000 test images: 97.47 %
```python
# Save the model checkpoint
torch.save(model.state_dict(), 'model.ckpt')
```
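To reuse the checkpoint later, the weights can be restored into a freshly constructed model (a minimal sketch using the filename saved above):

```python
# Rebuild the same architecture, then load the saved parameters into it.
restored = RNN(input_size, hidden_size, num_layers, num_classes).to(device)
restored.load_state_dict(torch.load('model.ckpt'))
restored.eval()   # switch to evaluation mode before running inference
```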