Question

a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = torch.nn.Parameter(c, requires_grad=True,)
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
    print(a.grad)
    print(b.grad)
    print(c.grad)
    print(d.grad)

This prints:

None
tensor([-0.8707])
None
tensor([-1.1125])

How can I get gradients for a and c? The variable d needs to remain a Parameter.

Answer 1

Basically, when you create a new tensor directly (for example with torch.nn.Parameter() or torch.tensor()), you are creating a leaf tensor.

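When you do something like c = a + 1, c becomes an intermediate (non-leaf) node. You can check whether a tensor is a leaf with print(c.is_leaf). By default, PyTorch does not populate .grad for intermediate nodes.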
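As a minimal sketch of the leaf check described above (the shapes are chosen only for illustration):

```python
import torch

a = torch.nn.Parameter(torch.randn(1))  # created directly, so it is a leaf
c = a + 1                               # result of an operation on a, so it is not a leaf

print(a.is_leaf)              # True
print(c.is_leaf)              # False
print(c.grad_fn is not None)  # True: c remembers the operation that produced it
```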

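In your snippet, a, b, and d are all leaf tensors, and c is an intermediate node. Since PyTorch does not keep gradients for intermediate nodes, c.grad is None. Wrapping c in torch.nn.Parameter creates a new leaf d that has no connection to the operation that produced c, so a is isolated from the graph when loss.backward() is called. That is why a.grad is also None.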
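The isolation of a can be seen in a small sketch. The loss here is a stand-in chosen for brevity, not the one from the question:

```python
import torch

a = torch.nn.Parameter(torch.randn(1))
c = a + 1
d = torch.nn.Parameter(c)  # new leaf: carries c's values but not its history

print(d.is_leaf)   # True
print(d.grad_fn)   # None, so there is no path from d back to a

loss = (d ** 2).sum()  # stand-in loss that depends only on d
loss.backward()

print(d.grad is not None)  # True: backward stops at the leaf d
print(a.grad)              # None: the gradient never reaches a
```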

If you change your code to this:

a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = c
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
    print(a.grad) # Not None
    print(b.grad) # Not None
    print(c.grad) # None
    print(d.grad) # None

You will find that a and b have gradients, but c.grad and d.grad are None because they are intermediate nodes.
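If you also want to inspect the gradient of the intermediate node, one option is Tensor.retain_grad(), which asks autograd to keep .grad for a non-leaf tensor. The sketch below is an illustration, not part of the original answer: x and y are random placeholder data standing in for x_train_tensor and y_train_tensor, and it keeps a (rather than d) as the trainable Parameter, which may or may not fit your requirement.

```python
import torch

a = torch.nn.Parameter(torch.randn(1))
b = torch.nn.Parameter(torch.randn(1))
x = torch.randn(10)  # placeholder for x_train_tensor
y = torch.randn(10)  # placeholder for y_train_tensor

c = a + 1
c.retain_grad()      # keep .grad on this intermediate node after backward()

yhat = c + b * x
loss = ((y - yhat) ** 2).mean()
loss.backward()

print(a.grad)  # populated, because a is connected to the loss through c
print(b.grad)  # populated
print(c.grad)  # populated thanks to retain_grad(); matches a.grad since dc/da = 1
```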
