Question

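A post on the PyTorch forum, https://discuss.pytorch.org/t/layer-weight-vs-weight-data/24271/2, mentions that setting a variable's weights directly can result in:

"Using .data on the other side works, but is generally not recommended, as changing it after the model was used would yield weird results and autograd cannot throw an error."

I am wondering what can cause the weird results. I am also thinking about directly setting the optimizer's parameters, specifically the momentum/sum of gradients for optimizers that keep such state. Is there anything that needs to be considered in that case as well?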

Answer 1

Updating the weights of a PyTorch layer is perfectly legal. Here is one way to change them:

import torch
import torch.nn as nn
lin = nn.Linear(10, 2)
torch.nn.init.xavier_uniform_(lin.weight)  # in-place initializer (trailing underscore)

Under the hood, the code above runs inside with torch.no_grad():

def _no_grad_uniform_(tensor, a, b):
    with torch.no_grad():
        return tensor.uniform_(a, b)

The next example shows how torch.no_grad() helps us.

lin = nn.Linear(10, 2)
with torch.no_grad():
    lin.weight[0][0] = 1.

x = torch.randn(1, 10)
output = lin(x)
output.mean().backward()
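
As a quick check of the example above, here is a minimal sketch that inspects the weight after the edit. The names leaf_ok and grad_shape are made up for illustration. It suggests that an in-place edit made under torch.no_grad() leaves the parameter as a leaf tensor with no recorded history, so backward() still works.

```python
import torch
import torch.nn as nn

lin = nn.Linear(10, 2)
with torch.no_grad():
    lin.weight[0][0] = 1.0  # in-place edit, not recorded by autograd

# The parameter is still a leaf tensor with no recorded history.
leaf_ok = lin.weight.is_leaf and lin.weight.grad_fn is None

x = torch.randn(1, 10)
lin(x).mean().backward()

# backward() filled in a gradient with the same shape as the weight.
grad_shape = tuple(lin.weight.grad.shape)
print(leaf_ok, grad_shape, lin.weight[0, 0].item())
```

The edited value is kept and the gradient is populated as usual, which is what torch.no_grad() buys us here.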

If we don't use it:

lin = nn.Linear(10, 2)
lin.weight[0][0] = 1.
x = torch.randn(1, 10)
output = lin(x)
output.mean().backward()

we end up with:

RuntimeError: leaf variable has been moved into the graph interior

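So you can make the change inside with torch.no_grad():. This is needed because every operation we perform on a PyTorch tensor is recorded by autograd if requires_grad is set to True.

If we do lin.weight[0][0] = 1. outside of it, the weight picks up grad_fn=<CopySlices>. The problem is that we don't want this assignment to be recorded, because it is part of setting up our layer, not part of our computation.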
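
As for the "weird results" in the forum post quoted in the question, here is a minimal sketch of one way they can arise. It uses two plain tensors instead of a full model. The names w, x, grad_wrt_x, edit_via_data and edit_via_no_grad are made up for illustration. The assumption is that the weight is edited after the forward pass but before backward(): an edit through .data is invisible to autograd, so the gradient is silently computed from the modified values, while the same edit under torch.no_grad() is detected when backward() runs.

```python
import torch

def grad_wrt_x(edit):
    """Run y = sum(w * x), apply `edit` to w after the forward pass, return x.grad."""
    w = torch.tensor([2.0, 3.0], requires_grad=True)  # stands in for a layer weight
    x = torch.tensor([1.0, 1.0], requires_grad=True)
    y = (w * x).sum()  # autograd saves w so it can compute dy/dx later
    edit(w)
    y.backward()
    return x.grad

def edit_via_data(w):
    w.data.zero_()  # bypasses autograd's in-place change tracking

def edit_via_no_grad(w):
    with torch.no_grad():
        w.zero_()  # not recorded in the graph, but the change is still tracked

# dy/dx should equal the w used in the forward pass, [2, 3];
# after the .data edit it is computed from the zeroed weight instead.
silent_grad = grad_wrt_x(edit_via_data)

# The no_grad edit makes backward() fail instead of returning a wrong gradient.
try:
    grad_wrt_x(edit_via_no_grad)
    raised = False
except RuntimeError:
    raised = True

print(silent_grad, raised)
```

If the weights are only changed before the forward pass, as in the examples above, neither problem shows up. The difference matters once the model has already been used in a computation that is still waiting for backward().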