I am trying to train a model in PyTorch.
Input: 686 arrays; first layer: 64 arrays; second layer: 2 arrays; output: a prediction of 1 or 0.
This is what I have so far:
import torch
import torch.nn as nn

class autoencoder(nn.Module):
    def __init__(self):
        super(autoencoder, self).__init__()
        self.encoder_softmax = nn.Sequential(
            nn.Linear(686, 256),
            nn.ReLU(True),
            nn.Linear(256, 2),
            nn.Softmax()
        )

    def forward(self, x):
        x = self.encoder_softmax(x)
        return x
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
net = autoencoder().to(device)  # instantiate the model before moving it to the device

iterations = 10
learning_rate = 0.98
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    net.parameters(), lr=learning_rate, weight_decay=1e-5)
for epoch in range(iterations):
    loss = 0.0
    print("train_dl len: ", len(train_dl))
    # net.train()
    for i, data in enumerate(train_dl, 0):
        inputs, labels, vectorize = data
        labels = labels.long().to(device)
        inputs = inputs.float().to(device)
        optimizer.zero_grad()
        outputs = net(inputs)
        train_loss = criterion(outputs, labels)
        train_loss.backward()
        optimizer.step()
        loss += train_loss.item()
    loss = loss / len(train_dl)
But when I train the model, the loss does not decrease. What am I doing wrong?
Answer 0 (score: 2):
You are using nn.CrossEntropyLoss as the loss function, which applies log-softmax internally, but you are also applying a softmax in the model itself:
self.encoder_softmax = nn.Sequential(
    nn.Linear(686, 256),
    nn.ReLU(True),
    nn.Linear(256, 2),
    nn.Softmax()  # <- needs to be removed
)
The output of the model should be the raw logits, without nn.Softmax.
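To see this concretely, here is a minimal sketch (standard PyTorch only, no names from the question assumed) showing that nn.CrossEntropyLoss already contains the log-softmax step, and how to recover probabilities at inference time if you need them:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 2)           # raw model outputs for a batch of 4
labels = torch.tensor([0, 1, 1, 0])  # target class indices

# CrossEntropyLoss applies log-softmax internally, so it is
# equivalent to log_softmax followed by NLLLoss:
ce = nn.CrossEntropyLoss()(logits, labels)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, nll))  # True

# At inference time, apply softmax explicitly if you need probabilities;
# for the predicted class alone, argmax over the logits is enough.
probs = F.softmax(logits, dim=1)
preds = logits.argmax(dim=1)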
You should also lower the learning rate: 0.98 is very high, which makes training far more unstable, and you will most likely see the loss oscillate. A more suitable learning rate is in the range of 0.01 or 0.001.
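Putting both fixes together, a minimal sketch of the corrected setup (the layer sizes and weight_decay are kept from the question; the class name Net and lr=1e-3 are my own choices):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Same layers as in the question, with the trailing Softmax removed:
        # the model now returns raw logits, which CrossEntropyLoss expects.
        self.encoder = nn.Sequential(
            nn.Linear(686, 256),
            nn.ReLU(True),
            nn.Linear(256, 2),
        )

    def forward(self, x):
        return self.encoder(x)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = Net().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-5)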