I'm using PyTorch for a classification task. For some reason, the accuracy dropped on the last iteration, and I would like to know why. Any answer is appreciated.
Here is my code:
import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.layers = nn.Sequential(nn.Linear(89, 128),
                                    nn.ReLU(),
                                    nn.Linear(128, 64),
                                    nn.ReLU(),
                                    nn.Linear(64, 2))

    def forward(self, x):
        return self.layers(x)

def train(train_dl, model, epochs):
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.1)
    for epoch in range(epochs):
        for (features, target) in train_dl:
            optimizer.zero_grad()
            features, target = features.to(device), target.to(device)
            output = model(features.float())
            target = target.view(-1)
            loss = loss_function(output, target)
            loss.backward()
            optimizer.step()
            output = torch.argmax(output, dim=1)
            correct = (output == target).float().sum()
            accuracy = correct / 512
            print(accuracy, loss)
        break

model = Classifier().to(device)
train(train_dl, model, 10)
And here is the last part of the output:
tensor(0.6465, device='cuda:0') tensor(0.6498, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6348, device='cuda:0') tensor(0.6574, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6582, device='cuda:0') tensor(0.6423, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6660, device='cuda:0') tensor(0.6375, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6719, device='cuda:0') tensor(0.6338, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6426, device='cuda:0') tensor(0.6523, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6699, device='cuda:0') tensor(0.6347, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6582, device='cuda:0') tensor(0.6422, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6543, device='cuda:0') tensor(0.6449, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6465, device='cuda:0') tensor(0.6502, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6992, device='cuda:0') tensor(0.6147, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6777, device='cuda:0') tensor(0.6289, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6836, device='cuda:0') tensor(0.6244, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6738, device='cuda:0') tensor(0.6315, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.1387, device='cuda:0') tensor(0.5749, device='cuda:0', grad_fn=<NllLossBackward>)
Answer 0 (score: 2)
This is probably because your last batch is smaller than 512. For example, if the final batch happened to contain only about 106 samples and roughly 71 of them were classified correctly (about the same ~67% hit rate as the other batches), then 71 / 512 ≈ 0.1387, which is exactly the drop you see. It would be better to change this line
accuracy = correct / 512
to:
accuracy = correct / features.shape[0]
Alternatively, if you don't want the last batch to have a different size, you can drop it when creating the DataLoader by setting drop_last=True, like this:
train_dl = DataLoader(..., drop_last=True)
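For reference, here is a minimal sketch of the training loop with that fix applied (it reuses the question's model, device, and train_dl, which are assumed to already exist):

def train(train_dl, model, epochs):
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.1)
    for epoch in range(epochs):
        for features, target in train_dl:
            optimizer.zero_grad()
            features, target = features.to(device), target.to(device)
            output = model(features.float())
            target = target.view(-1)
            loss = loss_function(output, target)
            loss.backward()
            optimizer.step()
            predictions = torch.argmax(output, dim=1)
            correct = (predictions == target).float().sum()
            # Divide by the actual batch size, not a hard-coded 512,
            # so a smaller final batch is measured correctly.
            accuracy = correct / features.shape[0]
            print(accuracy, loss)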
Answer 1 (score: 0)
I don't have the reputation to comment, but it could just be training instability. Does this always happen at epoch 10? Have you tried running for more than 10 epochs?
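If you want to check this, averaging accuracy over a whole epoch (instead of printing noisy per-batch numbers) and running well past 10 epochs should make any real instability easier to see. A minimal sketch, reusing the question's setup (model, train_dl, and device are assumed to exist; train_epoch_metrics is just an illustrative name):

def train_epoch_metrics(train_dl, model, epochs):
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.1)
    for epoch in range(epochs):
        correct, total = 0.0, 0
        for features, target in train_dl:
            optimizer.zero_grad()
            features, target = features.to(device), target.to(device)
            output = model(features.float())
            target = target.view(-1)
            loss = loss_function(output, target)
            loss.backward()
            optimizer.step()
            correct += (torch.argmax(output, dim=1) == target).float().sum().item()
            total += features.shape[0]
        # Epoch-level accuracy smooths out per-batch noise.
        print(f'epoch {epoch}: accuracy {correct / total:.4f}')

train_epoch_metrics(train_dl, model, 50)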