我正在训练用于时序预测的LSTM模型,并且在每个时期,我的准确性都从0重新开始,就好像我是在第一次训练。
我在培训方法摘要的下方附上
:def train(model, loader, epoch, mini_batch_size, sequence_size):
model.train()
correct = 0
padded_size = 0
size_input = mini_batch_size * sequence_size
for batch_idx, (inputs, labels, agreement_score) in enumerate(loader):
if(inputs.size(0) == size_input):
inputs = inputs.clone().reshape(mini_batch_size, sequence_size, inputs.size(1))
labels = labels.clone().squeeze().reshape(mini_batch_size*sequence_size)
agreement_score = agreement_score.clone().squeeze().reshape(mini_batch_size*sequence_size)
else:
padded_size = size_input - inputs.size(0)
(inputs, labels, agreement_score) = padd_incomplete_sequences(inputs, labels, agreement_score, mini_batch_size, sequence_size)
inputs, labels, agreement_score = Variable(inputs.cuda()), Variable(labels.cuda()), Variable(agreement_score.cuda())
output = model(inputs)
loss = criterion(output, labels)
loss = loss * agreement_score
loss = loss.mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
pred = output.data.max(1, keepdim = True)[1]
correct += pred.eq(labels.data.view_as(pred)).cuda().sum()
accuracy = 100. * correct / (len(loader.dataset) + padded_size)
print("Train: Epoch: {}, [{}/{} ({:.0f}%)]\t loss: {:.6f}, Accuracy: {}/{} ({:.0f}%)".format(
epoch,
batch_idx * len(output),
(len(loader.dataset) + padded_size),
100. * batch_idx / (len(loader.dataset)+padded_size),
loss.item(),
correct,
(len(loader.dataset) + padded_size),
accuracy))
accuracy = 100. * correct / (len(loader.dataset) + padded_size)
train_accuracy.append(accuracy)
train_epochs.append(epoch)
train_loss.append(loss.item())
据此,我的循环如下:
for epoch in range(1, 10):
train(audio_lstm_model, train_rnn_audio_loader, epoch, MINI_BATCH_SIZE, SEQUENCE_SIZE_AUDIO)
evaluation(audio_lstm_model,validation_rnn_audio_loader, epoch, MINI_BATCH_SIZE, SEQUENCE_SIZE_AUDIO)
因此,我的准确性和损失会在每个时期重新开始:
Train: Epoch: 1, [0/1039079 (0%)] loss: 0.921637, Accuracy: 0/1039079 (0%)
...
Train: Epoch: 1, [10368/1039079 (0%)] loss: 0.523242, Accuracy: 206010/1039079 (19%)
Test set: loss: 151.4845, Accuracy: 88222/523315 (16%)
Train: Epoch: 2, [0/1039079 (0%)] loss: 0.921497, Accuracy: 0/1039079 (0%)
如果有任何线索,欢迎您的帮助! 祝你有美好的一天!
答案 0 :(得分:0)
问题出在下面的事实是,序列大小对于网络来说太小,以至于无法从网络中做出一些预测。 因此,在将序列长度增加了几个数量级之后,我能够在每个时期之后改进模型。