I am trying to port code written in Theano to PyTorch. In that code, the author computes the gradients with

import theano.tensor as T
gparams = T.grad(cost, params)

and gparams has shape (256, 240). I tried using backward(), but it doesn't seem to return anything. Is there an equivalent of grad in PyTorch?
Suppose this is my input:
import torch
from torch.autograd import Variable
cost = torch.tensor(1.6019)
params = Variable(torch.rand(1, 73, 240))
Answer 0 (score: 0)
cost must be the result of an operation involving params. You cannot compute a gradient knowing only the values of the two tensors; you also need to know the relationship between them. This is why PyTorch builds a computation graph as you perform tensor operations. For example, suppose the relationship is

cost = torch.sum(params)

Then we would expect the gradient of cost with respect to params to be a vector of ones, regardless of the values in params.

It can be computed as follows. Note that you need to set the requires_grad flag to tell PyTorch that you want backward to populate the gradient when it is called.
# Initialize the independent variable. Make sure to set requires_grad=True.
# Use floating-point values: only floating-point tensors can require gradients.
params = torch.tensor((1., 73., 240.), requires_grad=True)
# Compute cost, this implicitly builds a computation graph which records
# how cost was computed with respect to params.
cost = torch.sum(params)
# Zero the gradient of params in case it already has something in it.
# This step is optional in this example but good to do in practice to
# ensure you're not adding gradients to existing gradients.
if params.grad is not None:
    params.grad.zero_()
# Perform back propagation. This is where the gradient is actually
# computed. It also resets the computation graph.
cost.backward()
# The gradient of cost w.r.t. params is now stored in params.grad.
print(params.grad)
Result:
tensor([1., 1., 1.])
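As a closer analogue of Theano's T.grad(cost, params), PyTorch also provides the functional torch.autograd.grad, which returns the gradients directly instead of accumulating them in params.grad. A minimal sketch, assuming params has the shape from the question and using torch.sum as a stand-in cost:

import torch

# Hypothetical params with the shape mentioned in the question.
params = torch.rand(1, 73, 240, requires_grad=True)

# cost must still be computed from params; torch.sum is only a placeholder.
cost = torch.sum(params)

# torch.autograd.grad returns a tuple with one gradient per input tensor;
# params.grad is left untouched.
gparams, = torch.autograd.grad(cost, params)

print(gparams.shape)  # torch.Size([1, 73, 240]), filled with ones for this cost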