Question

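I am currently using PyTorch for deep neural networks. I wrote the toy network shown below and found that setting requires_grad=True on the label y makes a big difference: when y.requires_grad=True, the network diverges. I would like to know why.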

import torch

x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 10 * torch.rand(x.size())


x.requires_grad = True
# this is where the problem occurs
y.requires_grad = True

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)
        self.predict = torch.nn.Linear(n_hidden, n_output)

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        x = self.predict(x)
        return x

net = Net(1, 10, 1)
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)
criterion = torch.nn.MSELoss()


for t in range(200):
    y_pred = net(x)

    loss = criterion(y_pred, y)

    optimizer.zero_grad()
    loss.backward()
    print("Epoch {}: {}".format(t, loss))
    optimizer.step()

Answer 1

It seems you are using an outdated version of PyTorch. In newer versions (0.4.0+), this code raises the following error:

AssertionError: nn criterions don't compute the gradient w.r.t. targets - 
                please mark these tensors as not requiring gradients

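In essence, the message says that the loss only works when the requires_grad flag of your targets is set to False. Why this ran at all in earlier versions is an interesting question, as is why it led to different behavior.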
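As a minimal sketch of what the error message asks for, the snippet below leaves the target without gradient tracking and runs one backward pass. The `torch.nn.Sequential` model is a stand-in for the `Net` class from the question, and the fixed seed and the `params_have_grad` name are assumptions added for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 10 * torch.rand(x.size())

# Targets are plain data, so requires_grad stays at its default (False).
# detach() changes nothing here; it matters when y comes out of another graph.
y = y.detach()

# Stand-in for the Net class in the question: 1 -> 10 -> 1 with a ReLU.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)
criterion = torch.nn.MSELoss()

loss = criterion(net(x), y)
loss.backward()

# After backward, every network parameter has a gradient and the target has none.
params_have_grad = all(p.grad is not None for p in net.parameters())
print(y.requires_grad, y.grad, params_have_grad)
```

Note that x does not need requires_grad=True either, unless gradients with respect to the inputs are wanted.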

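My guess is that the backward pass would also change your targets (instead of only your weights), which is obviously something you don't want.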
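One way to check this guess is to snapshot the target before a training step and compare afterwards. The sketch below assumes a recent PyTorch release in which MSELoss accepts a target that requires grad instead of raising the assertion quoted above. The `Sequential` model, the seed, and the names `y_before`, `target_received_grad`, and `target_values_changed` are assumptions for illustration. It says nothing about the pre-0.4.0 behavior the question describes.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 10 * torch.rand(x.size())
y.requires_grad = True  # the setting from the question

# Stand-in for the Net class in the question.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)
# The optimizer is given only the network parameters, never y.
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)
criterion = torch.nn.MSELoss()

y_before = y.detach().clone()  # snapshot of the target values

optimizer.zero_grad()
loss = criterion(net(x), y)
loss.backward()
optimizer.step()

# Did backward() compute a gradient for the target, and did step() alter it?
target_received_grad = y.grad is not None
target_values_changed = not torch.equal(y.detach(), y_before)
print(target_received_grad, target_values_changed)
```

In this setup the target can receive a gradient, but `optimizer.step()` only updates the tensors handed to the optimizer, so the values of y are left alone.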
