Question

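I'm trying to extract the weights and biases from a simple network built in PyTorch. My entire network is composed of nn.Linear layers. When I create a layer by calling nn.Linear(in_dim, out_dim), I expect the parameters returned by model.parameters() to have shape (in_dim, out_dim) for the weight and (out_dim) for the bias. However, the weights that come out of model.parameters() have shape (out_dim, in_dim) instead.

The goal of my code is to perform a forward pass with matrix multiplication using only numpy, without any PyTorch. Because of the shape mismatch, the matrix multiplication throws an error. How can I fix this?

Here is my exact code: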

import torch
import torch.nn as nn

class RNN(nn.Module):

    def __init__(self, dim_input, dim_recurrent, dim_output):

        super(RNN, self).__init__()

        self.dim_input = dim_input
        self.dim_recurrent = dim_recurrent
        self.dim_output = dim_output

        self.dense1 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense2 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias = False)
        self.dense3 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense4 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias = False)
        self.dense5 = nn.Linear(self.dim_recurrent, self.dim_output)

#There is a defined forward pass

model = RNN(12, 100, 6)

for i in model.parameters():
    print(i.shape)  # .shape is an attribute, not a method

The output is:

torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([6, 100])
torch.Size([6])

If I am correct, the output should be:

torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 6])
torch.Size([6])

What am I doing wrong?

Answer 1

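What you are seeing is simply the shape of the stored weight matrix: nn.Linear keeps its weight as (out_features, in_features), not (in_features, out_features). When you call print(model), you can see that the input and output features are correct: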

RNN(
  (dense1): Linear(in_features=12, out_features=100, bias=True)
  (dense2): Linear(in_features=100, out_features=100, bias=False)
  (dense3): Linear(in_features=12, out_features=100, bias=True)
  (dense4): Linear(in_features=100, out_features=100, bias=False)
  (dense5): Linear(in_features=100, out_features=6, bias=True)
)
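
To see which shape belongs to which layer, named_parameters() can be used instead of parameters(). The following is a minimal sketch; TinyNet is a hypothetical two-layer stand-in, not the model from the question:

```python
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical stand-in for the RNN class
    def __init__(self):
        super().__init__()
        self.dense1 = nn.Linear(12, 100)
        self.dense2 = nn.Linear(100, 6, bias=False)


model = TinyNet()

# named_parameters() yields (name, parameter) pairs in registration order
shapes = {name: tuple(p.shape) for name, p in model.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)
```

Each weight is reported as (out_features, in_features), and a layer created with bias=False contributes no bias entry.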

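You can check the source code to see that the weights are transposed before matmul is called.

nn.Linear is defined here:
https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html#Linear

You can check its forward method, which looks like this: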

def forward(self, input):
    return F.linear(input, self.weight, self.bias)

F.linear is defined here:
https://pytorch.org/docs/stable/_modules/torch/nn/functional.html

The line that multiplies by the weights is:

output = input.matmul(weight.t())

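As you can see above, the weights are transposed before matmul is applied, which is why the shape of the weights differs from what you expected.

So, if you want to do the matrix multiplication manually, you can do: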
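
The shape arithmetic can be checked with numpy alone. This is a sketch with random arrays standing in for real parameters; the variable names are illustrative assumptions:

```python
import numpy as np

in_dim, out_dim, batch = 12, 100, 5
rng = np.random.default_rng(0)

# Random stand-ins laid out the way nn.Linear stores its parameters
weight = rng.standard_normal((out_dim, in_dim))  # (out_features, in_features)
bias = rng.standard_normal(out_dim)              # (out_features,)
x = rng.standard_normal((batch, in_dim))         # (batch, in_features)

# y = x W^T + b: (5, 12) @ (12, 100) -> (5, 100), bias broadcasts over rows
y = x @ weight.T + bias
print(y.shape)
```

Without the transpose, x @ weight would try to multiply (5, 12) by (100, 12) and raise a shape error, which matches the error described in the question.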

# dummy input of length 5
input = torch.rand(5, 12)
# apply layer dense1 (without bias, for bias just add + model.dense1.bias)
output_first_layer = input.matmul(model.dense1.weight.t())
print(output_first_layer.shape)

As you would expect from dense1, it returns:

torch.Size([5, 100])
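
Since the question asks for a numpy-only forward pass, here is a minimal sketch of exporting one layer and checking it against PyTorch. It assumes the layer lives on the CPU; the standalone layer is a stand-in for model.dense1:

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(12, 100)  # stand-in for model.dense1

# detach() drops autograd tracking so the tensors can be viewed as arrays
W = layer.weight.detach().numpy()  # shape (100, 12)
b = layer.bias.detach().numpy()    # shape (100,)

x = np.random.rand(5, 12).astype(np.float32)

# numpy-only forward pass: transpose the stored weight, then add the bias
y_numpy = x @ W.T + b

# Reference result from PyTorch for comparison
with torch.no_grad():
    y_torch = layer(torch.from_numpy(x)).numpy()

print(y_numpy.shape)
print(np.allclose(y_numpy, y_torch, atol=1e-4))
```

For a layer created with bias=False, layer.bias is None, so the bias term should simply be left out.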

I hope this explains your observations about the shapes :)
