Generating sentence vectors with a GRU in PyTorch

Asked: 2018-08-04 20:22:18

Tags: python pytorch rnn gated-recurrent-unit

I am trying to generate sentence vectors with a GRU in PyTorch. My stories are split into sentences, so I read each story, split its sentences into words, and convert the words to integers (a sketch of this padding step follows the example). For example, a batch with 2 samples looks like:

(0 ,.,.) =
  19  21  28   3   0   0
  25  16  28   4  17   0
   0   0   0   0   0   0
(1 ,.,.) =
  19  21  28   3   0   0
  25  16  28   4   0   0
  15  28  26  27  17   0
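
A minimal sketch of how such a padded batch can be built (the word indices are taken from the example above; 0 is the padding index):

import torch

# the two tokenized stories from the example batch above
stories = [
    [[19, 21, 28, 3], [25, 16, 28, 4, 17]],
    [[19, 21, 28, 3], [25, 16, 28, 4], [15, 28, 26, 27, 17]],
]
max_sent = max(len(s) for s in stories)                 # 3 sentences
max_word = max(len(t) for s in stories for t in s) + 1  # 6 (+1 matches the trailing 0 column)

story = torch.zeros(len(stories), max_sent, max_word).long()  # 0 = padding
for i, s in enumerate(stories):
    for j, sent in enumerate(s):
        story[i, j, :len(sent)] = torch.LongTensor(sent)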

Each row is a sentence and each number is a word. What I want is one vector representation per row. Say the dimension is 5:

(0 ,.,.) =
 0.1619 -0.0605 -0.3301 -0.0433  0.2786
-0.2069 -0.3152 -0.4366  0.1272  0.3375
 0.0000  0.0000  0.0000  0.0000  0.0000
(1 ,.,.) =
-0.1599 -0.0730 -0.3796  0.0214  0.2157
-0.0805 -0.1307 -0.3942  0.0648  0.2704
 0.0275 -0.2353 -0.4399  0.0687  0.3218

Previously, I generated the sentence vectors by simply summing the word vectors, and the model's accuracy was 99%; with the GRU encoder below instead, I only get about 30%. Roughly, the summation baseline was (a minimal sketch, assuming an nn.Embedding with padding_idx=0 so that padded positions drop out of the sum):
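
import torch
import torch.nn as nn

embedding_dim = 5
vocab_size = 30  # assumption: anything large enough for the word indices above
# assumption: padding_idx=0 pins word 0 to the zero vector, so padding drops out of the sum
embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)

def encoder_sum(story):
    """story : batch*nbrSent*nbrWord LongTensor of word indices."""
    flat = embedding(story.view(story.size(0), -1))
    input_embed = flat.view(story.size(0), story.size(1), story.size(2), -1)
    return input_embed.sum(dim=2)  # batch*nbrSent*dim, one vector per sentence

And the GRU encoder in question: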

from torch.autograd import Variable

n_layers = 1
# reuses `embedding`, `embedding_dim`, and `story` as defined above
gru = nn.GRU(embedding_dim, embedding_dim, n_layers, batch_first=True)

# older pytorch embeddings only accepted 2-D input, hence flatten then restore
input_embed = embedding(story.view(story.size(0), -1)).view(
    story.size(0), story.size(1), story.size(2), embedding_dim)
slen = torch.LongTensor([2, 3])                      # sentences per story
sentlens = torch.LongTensor([[4, 5, 0], [4, 4, 5]])  # words per sentence (ints: pack_padded_sequence expects integer lengths)

def encoder_gru(input_embed, slen, sentlens, n_layers=1, hidden=None):
    """Encode every sentence of every story with a GRU.
    Args:
       input_embed : embedding of the input text, batch*nbrSent*nbrWord*dim
              slen : tensor, number of sentences in each story
          sentlens : tensor, number of words in each sentence
            hidden : tensor, initial hidden state of the GRU
    """
    batch_size = input_embed.size(0)

    # holds one vector per sentence; padded sentence slots stay zero
    hidden_batch = torch.zeros(batch_size, slen.max().item(), embedding_dim)

    for b in range(batch_size):
        iembed = input_embed[b, 0:slen[b]]  # drop padded sentence rows
        bsent = sentlens[b, 0:slen[b]]      # word counts of the kept sentences
        # pack_padded_sequence requires lengths sorted in decreasing order
        sorted_slens, idx = bsent.sort(0, descending=True)
        sorted_iembed = iembed[idx]
        pack = torch.nn.utils.rnn.pack_padded_sequence(
            sorted_iembed, sorted_slens.tolist(), batch_first=True)
        h0 = Variable(torch.randn(n_layers, int(slen[b]), embedding_dim))
        out, hidden_out = gru(pack, h0)  # was self.gru, but there is no class here
        _, inv_idx = idx.sort()  # undo the length sort to restore input order
        # note: .data detaches, so no gradient flows back through the GRU here
        hidden_batch[b, 0:slen[b]] = hidden_out[-1][inv_idx].data.clone()
    return hidden_batch
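
Called on the example batch, the encoder returns one 5-dimensional vector per sentence, with zero rows left in the padded sentence slots:

sent_vecs = encoder_gru(input_embed, slen, sentlens)
print(sent_vecs.size())  # torch.Size([2, 3, 5]): batch * nbrSent * dim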

My questions are:

  1. I can generate the vectors, but my whole model's accuracy is now worse than before, and I cannot figure out what is going wrong in this step.
  2. Other than testing with the whole model, how can I check whether the sentence vectors are good representations?
  3. Is there a smarter way to do this without the for loop over the batch?

0 Answers