Embedding word2vec into a custom PyTorch LSTM model

Asked: 2019-11-12 00:47:02

Tags: deep-learning pytorch lstm

I have a set of input sentences. I am using a pretrained word2vec model from gensim to get embeddings for the input sentences, and I want to pass those embeddings as input to a custom PyTorch LSTM model.
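For reference, a minimal sketch of loading such a pretrained model with gensim; the specific model name here is an assumption, and any 300-dimensional word2vec model loaded via KeyedVectors would work the same way:

import gensim.downloader

# Downloads (on first use) and loads a pretrained 300-dimensional
# word2vec model as a gensim KeyedVectors object.
word_vectors = gensim.downloader.load('word2vec-google-news-300')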

import torch
import torch.nn as nn
import torch.nn.functional as F

# Device used by the model's forward pass
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

hidden_size = 32
num_layers = 1
num_classes = 2

class customModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(customModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=False, bidirectional=True)
        self.fcl = nn.Linear(hidden_size*2, num_classes)

    def forward(self, x):
        # Set initial hidden and cell states
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)

        # Forward propagate LSTM
        # out: [seq_len, batch, 2*hidden_size] with batch_first=False
        out, hidden = self.bilstm(x, (h0, c0))
        fw_bilstm = out[-1, :, :self.hidden_size]
        bk_bilstm = out[0, :, :self.hidden_size]
        concat_fw_bw = torch.cat((fw_bilstm, bk_bilstm), dim=1)
        fc = self.fcl(concat_fw_bw)
        x = F.softmax(F.relu(fc))
        return x

Now, I initialize the model object (input_size=300 matches the word2vec embedding dimensionality).

model = customModel(300, hidden_size, num_layers, num_classes)

Get the embeddings for the input sentences:

sentences = [['my', 'name', 'is', 'nad'], ['i', 'love', 'nlp', 'proc']]
embedding = create_embedding(sentences)
embedding_torch = torch.FloatTensor(embedding)
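create_embedding is not shown in the question; a minimal sketch of what it might look like, assuming the word_vectors object from above, equal-length sentences, and a zero-vector fallback for out-of-vocabulary words (all of these are assumptions):

import numpy as np

def create_embedding(sentences):
    # Look up the word2vec vector for each word in each sentence,
    # falling back to zeros for out-of-vocabulary words. With
    # equal-length sentences this yields an array of shape
    # [num_sentences, num_words, 300].
    dim = word_vectors.vector_size
    return np.array([
        [word_vectors[w] if w in word_vectors else np.zeros(dim, dtype=np.float32)
         for w in sentence]
        for sentence in sentences
    ])

For the two example sentences this produces a tensor of shape [2, 4, 300].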

Now I want to pass these embeddings to the model to get predictions:

for item in embedding_torch:
    # Add a leading dimension of size 1: [1, num_words, 300]
    item = item.view((1, item.size()[0], item.size()[1]))
    for epoch in range(1):
        tag_scores = model(item)
        print(tag_scores)

which throws a runtime error:

RuntimeError: Expected hidden[0] size (2, 4, 32), got (2, 1, 32)

I am not sure why this happens. My understanding was that the line h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device) computes the hidden dimensions correctly.

What am I missing? Please advise.

1 Answer:

Answer 0 (score: 1):

The backbone of your model is nn.LSTM, which with batch_first=False expects inputs of size [sequence_length, batch_size, embedding_size]. The inputs you are feeding the model, on the other hand, have size [1, sequence_length, embedding_size], so the LSTM treats your sequence length (4) as the batch size, while the hidden state you build from x.size(0) has batch size 1. That is exactly the mismatch the error reports. What I would do is create the nn.LSTM as:

# With batch_first=True
self.bilstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)

That way, the model will expect inputs of size [batch_size, sequence_length, embedding_size]. Then, instead of iterating over each element of the batch separately, pass the whole batch at once:

tag_scores = model(embedding_torch)
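Note that with batch_first=True the time dimension moves to dim 1, so the indexing inside forward has to change as well. A sketch of the adjusted method (which also slices the backward direction from the second half of the feature dimension, unlike the original code, and passes an explicit dim to softmax):

    def forward(self, x):
        # x: [batch_size, sequence_length, embedding_size]
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)

        # out: [batch_size, sequence_length, 2*hidden_size]
        out, hidden = self.bilstm(x, (h0, c0))

        fw_bilstm = out[:, -1, :self.hidden_size]   # forward direction, last timestep
        bk_bilstm = out[:, 0, self.hidden_size:]    # backward direction, first timestep
        concat_fw_bw = torch.cat((fw_bilstm, bk_bilstm), dim=1)
        fc = self.fcl(concat_fw_bw)
        return F.softmax(F.relu(fc), dim=1)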