Why did the accuracy drop on the last batch?

Date: 2021-05-08 14:20:09

Tags: python pytorch

I'm using PyTorch for a classification task. For some reason, the accuracy drops on the last iteration, and I'd like to know why. Any answers are appreciated.

Here is my code:

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        # Simple MLP: 89 input features -> two output classes
        self.layers = nn.Sequential(nn.Linear(89, 128),
                                    nn.ReLU(),
                                    nn.Linear(128, 64),
                                    nn.ReLU(),
                                    nn.Linear(64, 2))

    def forward(self, x):
        return self.layers(x)

def train(train_dl, model, epochs):
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.1)
    for epoch in range(epochs):
        for (features, target) in train_dl:
            optimizer.zero_grad()
            features, target = features.to(device), target.to(device)
            output = model(features.float())
            target = target.view(-1)
            loss = loss_function(output, target)
            loss.backward()
            optimizer.step()
            # Per-batch accuracy, assuming every batch holds 512 samples
            output = torch.argmax(output, dim=1)
            correct = (output == target).float().sum()
            accuracy = correct / 512
            print(accuracy, loss)
        break  # note: this stops training after the first epoch

# train_dl is a DataLoader built elsewhere with batch_size=512
model = Classifier().to(device)
train(train_dl, model, 10)

And here is the last part of the output:

tensor(0.6465, device='cuda:0') tensor(0.6498, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6348, device='cuda:0') tensor(0.6574, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6582, device='cuda:0') tensor(0.6423, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6660, device='cuda:0') tensor(0.6375, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6719, device='cuda:0') tensor(0.6338, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6426, device='cuda:0') tensor(0.6523, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6699, device='cuda:0') tensor(0.6347, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6582, device='cuda:0') tensor(0.6422, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6543, device='cuda:0') tensor(0.6449, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6465, device='cuda:0') tensor(0.6502, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6992, device='cuda:0') tensor(0.6147, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6777, device='cuda:0') tensor(0.6289, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6836, device='cuda:0') tensor(0.6244, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.6738, device='cuda:0') tensor(0.6315, device='cuda:0', grad_fn=<NllLossBackward>)
tensor(0.1387, device='cuda:0') tensor(0.5749, device='cuda:0', grad_fn=<NllLossBackward>)

2 Answers:

Answer 0 (score: 2):

This is probably because your last batch has fewer than 512 samples. It's better to change this line:

accuracy = correct / 512

to:

accuracy = correct / features.shape[0]

Alternatively, if you don't want the last batch to have a different size, you can drop it when creating the DataLoader by setting drop_last=True, like this:

train_dl = DataLoader(..., drop_last=True)
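
To see the effect concretely, here is a minimal sketch with made-up numbers (1000 samples, batch size 512, so the final batch holds only 488 samples; these shapes are assumptions, not taken from the question):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: 1000 samples of 89 features, binary labels
ds = TensorDataset(torch.randn(1000, 89), torch.randint(0, 2, (1000,)))
dl = DataLoader(ds, batch_size=512)

for features, target in dl:
    print(features.shape[0])  # prints 512, then 488
    # Dividing correct by features.shape[0] uses the true batch size;
    # dividing by a hard-coded 512 understates accuracy on the 488-sample batch.

With drop_last=True, the second, smaller batch would simply be discarded.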

Answer 1 (score: 0):

I don't have the reputation to comment, but it may just be training instability. Does this always happen at epoch 10? Have you tried running for more than 10 epochs?
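
If it helps, a rough way to check for instability is to log an epoch-level average accuracy over more epochs instead of per-batch numbers. A sketch, assuming the model, train_dl, device, loss_function and optimizer from the question (and with the break removed so all epochs actually run):

for epoch in range(50):  # more epochs than the original 10
    correct, total = 0, 0
    for features, target in train_dl:
        optimizer.zero_grad()
        features, target = features.to(device), target.to(device)
        output = model(features.float())
        target = target.view(-1)
        loss = loss_function(output, target)
        loss.backward()
        optimizer.step()
        # Accumulate correct predictions over the whole epoch
        correct += (output.argmax(dim=1) == target).sum().item()
        total += features.shape[0]
    print(f"epoch {epoch}: accuracy {correct / total:.4f}")

If the epoch-level curve is smooth while individual batches bounce around, the drop you saw is probably just noise (or, as the other answer points out, the smaller last batch).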