I am trying to port code written in Theano to PyTorch. In that code, the author computes the gradients with

import theano.tensor as T
gparams = T.grad(cost, params)

and gparams has shape (256, 240). I tried using backward(), but it doesn't seem to return anything. Is there an equivalent of grad in PyTorch?
Suppose this is my input:
import torch
from torch.autograd import Variable
cost = torch.tensor(1.6019)
params = Variable(torch.rand(1, 73, 240))
Answer 0 (score: 0)
cost must be the result of an operation involving params. You cannot compute a gradient knowing only the values of the two tensors; you also need to know the relationship between them. This is why PyTorch builds a computation graph as you perform tensor operations. For example, suppose the relationship is

cost = torch.sum(params)

Then we would expect the gradient of cost with respect to params to be a vector of ones, regardless of the values in params.

It can be computed as follows. Note that you need to set the requires_grad flag to tell PyTorch that you want backward to populate the gradient when it is called.
# Initialize the independent variable. Make sure to set requires_grad=True.
# Use floating-point values: only floating-point tensors can require gradients.
params = torch.tensor((1., 73., 240.), requires_grad=True)
# Compute cost, this implicitly builds a computation graph which records
# how cost was computed with respect to params.
cost = torch.sum(params)
# Zero the gradient of params in case it already has something in it.
# This step is optional in this example but good to do in practice to
# ensure you're not adding gradients to existing gradients.
if params.grad is not None:
    params.grad.zero_()
# Perform back propagation. This is where the gradient is actually
# computed. It also resets the computation graph.
cost.backward()
# The gradient of cost w.r.t. params is now stored in params.grad.
print(params.grad)
Result:
tensor([1., 1., 1.])
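As a closer analogue of Theano's T.grad(cost, params), PyTorch also provides the functional torch.autograd.grad, which returns the gradients directly instead of accumulating them in params.grad. A minimal sketch, assuming params has the shape from the question and using torch.sum as a stand-in cost:

import torch

# Hypothetical params with the shape mentioned in the question.
params = torch.rand(1, 73, 240, requires_grad=True)

# cost must still be computed from params; torch.sum is only a placeholder.
cost = torch.sum(params)

# torch.autograd.grad returns a tuple with one gradient per input tensor;
# params.grad is left untouched.
gparams, = torch.autograd.grad(cost, params)

print(gparams.shape)  # torch.Size([1, 73, 240]), filled with ones for this cost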