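I am trying to use Autograd to compute the derivative of some code. Part of this code consists of a neural network implemented in PyTorch. However, I run into trouble when using Autograd to compute the derivative of the NN.
I created a small script to reproduce the problem: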
import torch
import autograd.numpy as np
from autograd import grad

inputDimension = 10
hiddenLayerDimension = 10
outputDimension = 1

model = torch.nn.Sequential(
    torch.nn.Linear(inputDimension, hiddenLayerDimension),
    torch.nn.ReLU(),
    torch.nn.Linear(hiddenLayerDimension, outputDimension),
)

def functionToDifferentiate(input):
    # This line below represents the 'other' calculations. In reality it is more involved
    scaledInput = input * 3
    inputTensor = torch.from_numpy(scaledInput).type(torch.FloatTensor)
    return model(inputTensor)

randomInput = np.random.rand(inputDimension)
gradientFunctionOfModel = grad(functionToDifferentiate)

print(functionToDifferentiate(randomInput))
print(gradientFunctionOfModel(randomInput))
When I run this code, the last line crashes with the following stack trace:
Traceback (most recent call last):
  File "replaced_for_privacy_reasons/stackOverFlowQuestion.py", line 25, in <module>
    print(gradientFunctionOfModel(randomInput))
  File "replaced_for_privacy_reasons/venv/lib/python3.6/site-packages/autograd/wrap_util.py", line 20, in nary_f
    return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
  File "replaced_for_privacy_reasons/venv/lib/python3.6/site-packages/autograd/differential_operators.py", line 24, in grad
    vjp, ans = _make_vjp(fun, x)
  File "replaced_for_privacy_reasons/venv/lib/python3.6/site-packages/autograd/core.py", line 10, in make_vjp
    end_value, end_node = trace(start_node, fun, x)
  File "replaced_for_privacy_reasons/venv/lib/python3.6/site-packages/autograd/tracer.py", line 10, in trace
    end_box = fun(start_box)
tensor([0.0228], grad_fn=<AddBackward0>)
  File "replaced_for_privacy_reasons/venv/lib/python3.6/site-packages/autograd/wrap_util.py", line 15, in unary_f
    return fun(*subargs, **kwargs)
  File "replaced_for_privacy_reasons/stackOverFlowQuestion.py", line 18, in functionToDifferentiate
    inputTensor = torch.from_numpy(input).type(torch.FloatTensor)
TypeError: expected np.ndarray (got ArrayBox)
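I do know that I could create my input directly as a PyTorch tensor, rather than as a NumPy vector, and compute the gradient with respect to that tensor. However, this is not a good option for me, because the neural network is only one part of the full function whose gradient I want to compute.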
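To make clear which alternative I mean, here is a minimal sketch of the pure-PyTorch variant. It assumes the whole function can be written with tensor operations, which is not the case for my real code. The use of `torch.autograd.grad` and the `sum()` reduction are choices made for this illustration and are not part of the failing script.

```python
import torch

inputDimension = 10
hiddenLayerDimension = 10
outputDimension = 1

model = torch.nn.Sequential(
    torch.nn.Linear(inputDimension, hiddenLayerDimension),
    torch.nn.ReLU(),
    torch.nn.Linear(hiddenLayerDimension, outputDimension),
)

def functionToDifferentiate(inputTensor):
    # Same 'other' calculation as in the script, but done on a tensor
    scaledInput = inputTensor * 3
    return model(scaledInput)

# The input is a PyTorch tensor from the start, with gradient tracking enabled
randomInput = torch.rand(inputDimension, requires_grad=True)
output = functionToDifferentiate(randomInput)

# The output has shape (1,), so sum() reduces it to a scalar before differentiating
(gradient,) = torch.autograd.grad(output.sum(), randomInput)

print(output)
print(gradient)
```

This works because everything stays inside PyTorch's own graph, but it does not help once NumPy-based Autograd code has to sit in the same function.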
Any help would be greatly appreciated.