Question

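I would like to use PyTorch to compute a Hessian-vector product, where the Hessian is the matrix of second derivatives of a neural network's loss function and the vector is the gradient of that same loss.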

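Thanks to this post, I know how to compute a Hessian-vector product for an ordinary function. However, I run into trouble when the function is the loss of a neural network, because the parameters are packed inside a module and are accessed via nn.parameters() rather than as a plain torch tensor.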

I would like to do something like this (which does not work):

### a simple neural network 
linear = nn.Linear(10, 20) 

x = torch.randn(1, 10) 

y = linear(x).sum()
### compute the gradient and make a copy that is detached from the graph 
grad = torch.autograd.grad(y, linear.parameters(),create_graph=True)

v = grad.clone().detach()
### compute the Hessian vector product 
z = grad @ v 
z.backward()

which is analogous to this (which works):

x = Variable(torch.Tensor([1, 1]), requires_grad=True)

f = 3*x[0]**2 + 4*x[0]*x[1] + x[1]**2

grad, = torch.autograd.grad(f, x, create_graph=True)

v = grad.clone().detach()

z = grad @ v

z.backward()
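
As a sanity check on the working snippet, the result can be compared with the Hessian written out by hand. For this f the Hessian is the constant matrix [[6, 4], [4, 2]], so after z.backward() the value in x.grad should equal that matrix times v. The sketch below assumes a recent PyTorch where torch.tensor(..., requires_grad=True) replaces Variable; the name hessian is introduced here only for the comparison.

```python
import torch

# Same function as above, written with torch.tensor instead of Variable.
x = torch.tensor([1.0, 1.0], requires_grad=True)
f = 3 * x[0] ** 2 + 4 * x[0] * x[1] + x[1] ** 2

# Gradient of f, keeping the graph so it can be differentiated again.
(grad,) = torch.autograd.grad(f, x, create_graph=True)

# The vector v is a detached copy of the gradient.
v = grad.detach().clone()

# Backpropagating through grad . v leaves H v in x.grad.
z = grad @ v
z.backward()

# Hand-derived Hessian of f (constant, since f is quadratic).
hessian = torch.tensor([[6.0, 4.0], [4.0, 2.0]])

print(x.grad)       # tensor([84., 52.])
print(hessian @ v)  # tensor([84., 52.])
```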

This post addresses a similar (possibly the same) problem, but I do not understand the solution.

Answer 1

You say it doesn't work, but you don't show what error you get, which is why you haven't received any answers.

torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False)

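outputs and inputs are documented as sequences of tensors, but you are passing a single tensor as outputs.

That means you should pass a sequence, so pass [y] instead of y.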
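
Beyond the outputs argument, note that the working snippet in the question passes the scalar f directly, so a bare tensor is accepted there as well. With module parameters, torch.autograd.grad returns a tuple with one gradient tensor per parameter, so .clone() and @ cannot be applied to it as a whole. A minimal sketch of one way to handle this follows; it is not the only approach. It assumes a squared-output loss (not in the question) because the question's y = linear(x).sum() is linear in the parameters, which makes its Hessian zero. The names params, grads, and hvp are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A simple network. Squaring the output makes the loss nonlinear in the
# parameters, so the Hessian is not identically zero (assumption: this
# loss replaces the question's linear(x).sum()).
linear = nn.Linear(10, 20)
x = torch.randn(1, 10)
loss = linear(x).pow(2).sum()

# Materialise the parameter generator so it can be reused.
params = list(linear.parameters())

# grads is a tuple: one tensor per parameter (weight, bias).
grads = torch.autograd.grad(loss, params, create_graph=True)

# Detached copies of the gradients play the role of the vector v.
v = [g.detach().clone() for g in grads]

# z = grad . v, accumulated over every parameter tensor.
z = sum((g * vi).sum() for g, vi in zip(grads, v))

# Differentiating z with respect to the parameters gives H v,
# again as one tensor per parameter.
hvp = torch.autograd.grad(z, params)

for p, h in zip(params, hvp):
    print(tuple(p.shape), tuple(h.shape))
```

Calling z.backward() instead of the second torch.autograd.grad would accumulate the same values into each parameter's .grad attribute, which is closer to the question's original code.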

PyTorch: computing a Hessian-vector product using nn.parameters()
