Question

I am trying to implement a two-layer bidirectional LSTM with torch.nn.LSTM.

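I built a toy example: a batch of 3 tensors that are exactly identical (see the code below). I expected the BiLSTM output to be identical along the batch dimension, i.e. out[:, 0, :] == out[:, 1, :] == out[:, 2, :].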

But that is not the case. In my experiments, the outputs differ in 20% to 40% of the runs. So I would like to know what is going wrong.

# Python 3.6.6, Pytorch 0.4.1
import torch

def test(hidden_size, in_size):
    seq_len, batch = 4, 3
    bilstm = torch.nn.LSTM(input_size=in_size, hidden_size=hidden_size, 
                            num_layers=2, bidirectional=True)

    # create a batch with 3 exactly the same tensors
    a = torch.rand(seq_len, 1, in_size)  # (seq_len, 1, in_size)
    x = torch.cat((a, a, a), dim=1)

    out, _ = bilstm(x)  # (seq_len, batch, n_direction * hidden_size)

    # expect the output should be the same along the batch dimension
    assert torch.equal(out[:, 0, :], out[:, 1, :])  
    assert torch.equal(out[:, 1, :], out[:, 2, :])

if __name__ == '__main__':
    count, total = 0, 0
    for h_size in range(1, 51):
        for in_size in range(1, 51):
            total += 1
            try:
                test(h_size, in_size)
            except AssertionError:
                count += 1
    print('percentage of assertion error:', count / total)

Answer 1

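What is tripping you up is floating-point precision. Floating-point arithmetic is slightly inexact, so results that should be identical can differ by a very small amount. Try switching the default dtype to double precision: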

torch.set_default_dtype(torch.float64)

You should then see that the outputs are the same along the batch dimension.
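If staying in float32 is preferable, another way to act on the same observation is to compare with a tolerance instead of requiring bitwise equality. The following is only a sketch: the helper name batch_rows_match, the tolerance atol=1e-6, and the layer sizes are assumptions chosen for illustration, not values from the question.

```python
import torch

def batch_rows_match(out, atol=1e-6):
    """Return True if every batch entry of `out` matches entry 0 within tolerance.

    `out` has shape (seq_len, batch, features). The tolerance is an assumed
    value that suits float32 outputs bounded in (-1, 1).
    """
    ref = out[:, :1, :].expand_as(out)  # repeat batch entry 0 across the batch
    return torch.allclose(out, ref, atol=atol)

torch.manual_seed(0)
bilstm = torch.nn.LSTM(input_size=8, hidden_size=16,
                       num_layers=2, bidirectional=True)

# Same construction as in the question: three identical sequences in one batch.
a = torch.rand(4, 1, 8)
x = torch.cat((a, a, a), dim=1)

with torch.no_grad():
    out, _ = bilstm(x)

# Largest absolute deviation of any batch entry from entry 0.
max_diff = (out - out[:, :1, :]).abs().max().item()
print('rows match within tolerance:', batch_rows_match(out))
print('max abs difference:', max_diff)
```

In the question's test, this would mean replacing the two torch.equal assertions with a tolerance-based check such as the one above.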


Answer 2

I ran into the same problem with a GRU, and the following solved it for me.
Set a manual seed and put the model in evaluation mode before testing:

torch.manual_seed(42)
bilstm.eval()  # or: bilstm.train(False)

Source: LSTMcell and LSTM returning different outputs

In addition, I had to set the same seed before every call to the model (during testing). In your case:

torch.manual_seed(42)
out, _ = bilstm(x)  # (seq_len, batch, n_direction * hidden_size)
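To see what evaluation mode and reseeding actually influence, here is a small sketch that compares repeated forward passes on the CPU. It assumes a model built with dropout=0.5, which is not what the question uses (nn.LSTM defaults to dropout=0); the sizes and the helper run are also made up for illustration.

```python
import torch

seq_len, batch, in_size, hidden_size = 4, 3, 5, 7

# dropout=0.5 is an assumption for this demo; the question's model uses the
# default dropout=0, so it has no dropout mask between its two layers.
model = torch.nn.LSTM(input_size=in_size, hidden_size=hidden_size,
                      num_layers=2, bidirectional=True, dropout=0.5)
x = torch.rand(seq_len, batch, in_size)

def run(seed=None):
    """Run one forward pass, optionally reseeding the global RNG first."""
    if seed is not None:
        torch.manual_seed(seed)
    with torch.no_grad():
        out, _ = model(x)
    return out

# Training mode: the inter-layer dropout mask is resampled on every call.
model.train()
train_unseeded_equal = torch.equal(run(), run())
train_seeded_equal = torch.equal(run(seed=42), run(seed=42))

# Evaluation mode: dropout is disabled, so no seed is needed for repeatability.
model.eval()
eval_equal = torch.equal(run(), run())

print('train, no reseed :', train_unseeded_equal)
print('train, reseeded  :', train_seeded_equal)
print('eval             :', eval_equal)
```

Reseeding and eval() only change the result when the forward pass draws random numbers, as dropout does here. With the default dropout=0 there is no such mask, so it is worth checking whether the remaining difference is a tolerance issue rather than a randomness issue.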

PyTorch: nn.LSTM gives different outputs for identical inputs in the same batch
