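I have a one-layer LSTM in PyTorch, trained on MNIST data. I know that for a one-layer LSTM in PyTorch, the dropout option of the LSTM has no effect. So I added a dropout at the start of the second layer, which is a fully connected layer. However, I observe that without dropout I get 97.75% accuracy on the test data, while with dropout 0.5 I get 95.36%. Am I doing something wrong, or what is the reason for this behavior? I switched the model to eval mode for testing, and the accuracy rose to 96.44%, but that is still lower than without dropout. Thanks a lot.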
import torch
import torch.nn as nn


# RNN Model (Many-to-One)
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers,
                            batch_first=True, bidirectional=True)
        # Dropout is applied to the LSTM output before the fully connected layer
        self.fc = nn.Sequential(
            nn.Dropout(0.1),
            nn.Linear(hidden_size*2, num_classes),
            nn.Softmax(dim=1)
        )

    def init_hidden(self, x):
        # Zero initial hidden and cell states, created on the same device as x;
        # the factor 2 accounts for the two directions of the bidirectional LSTM
        shape = (self.num_layers*2, x.size(0), self.hidden_size)
        return (torch.zeros(shape, device=x.device),
                torch.zeros(shape, device=x.device))

    def forward(self, x):
        # Set initial states
        hidden = self.init_hidden(x)
        # Forward propagate RNN
        out, _ = self.lstm(x, hidden)
        # Decode hidden state of last time step
        out = self.fc(out[:, -1, :])
        return out
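
As a sanity check on the eval-mode switch mentioned in the question, here is a minimal sketch of how `nn.Dropout` behaves in train mode versus eval mode. It is not taken from the original training script: the helper name `dropout_behaviour`, the input size, and the seed are made up for illustration.

```python
import torch
import torch.nn as nn


def dropout_behaviour(p=0.5, seed=0):
    """Return (changed_in_train, identity_in_eval) for an nn.Dropout layer.

    Hypothetical helper for illustration only; not part of the model above.
    """
    torch.manual_seed(seed)
    drop = nn.Dropout(p)
    x = torch.ones(4, 256)   # arbitrary all-ones input

    drop.train()             # train mode: each element is zeroed with probability p
    y_train = drop(x)        # surviving elements are scaled by 1 / (1 - p)

    drop.eval()              # eval mode: dropout passes the input through unchanged
    y_eval = drop(x)

    return (not torch.equal(y_train, x), torch.equal(y_eval, x))


changed_in_train, identity_in_eval = dropout_behaviour()
print(changed_in_train, identity_in_eval)
```

Calling `model.eval()` on the whole model before the test loop puts every dropout layer inside it into this pass-through mode, and `model.train()` switches it back for training.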