PyTorch seq2seq learning - using word2vec

Date: 2018-04-27 14:35:20

Tags: nlp pytorch seq2seq

I am following the seq2seq tutorial here.

I want to use pretrained vectors, so I edited the code to look up a word's vector instead of its index. Here is the code:

# This piece of code loads the vectors from a JSON file of the form {'word': [vector], ...}
import json


class Lang:
    def __init__(self, name, savedVectorsFile):
        def getSavedVectors(filename):
            with open(filename) as json_data:
                return json.load(json_data)

        self.name = name
        self.word2vector = {}
        self.word2count = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2  # Count SOS and EOS

        self.get_saved_vector = getSavedVectors(savedVectorsFile)
        self.word2vector['unknown'] = self.get_saved_vector['unknown']

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2vector:
            # Fall back to the 'unknown' vector so out-of-vocabulary
            # words don't raise a KeyError here
            self.word2vector[word] = self.get_saved_vector.get(
                word, self.get_saved_vector['unknown'])
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1
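
For context, this is roughly how I build the vocabulary (the file name vectors.json and the sample sentence are just placeholders of mine):

# Hypothetical usage; 'vectors.json' is my saved {'word': [vector], ...} file
lang = Lang("eng", "vectors.json")
lang.addSentence("the quick brown fox")
print(lang.n_words)  # 2 special tokens (SOS, EOS) + 4 words = 6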




# This piece returns the vectors for a given sentence; all word vectors
# are concatenated into one flat list
def vectorFromSentence(lang, sentence):
    vectors = []
    for word in sentence.split(' '):
        # Use the 'unknown' vector for out-of-vocabulary words
        if word not in lang.word2vector:
            vectors += lang.word2vector["unknown"]
        else:
            vectors += lang.word2vector[word]
    return vectors
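
Before train, the flat list gets wrapped into a tensor. A minimal sketch of that step, assuming my word2vec vectors have VEC_SIZE = 300 (the helper name variableFromSentence is my own, and this targets the old 0.3-style Variable API that the traceback shows):

import torch
from torch.autograd import Variable

VEC_SIZE = 300  # assumed word2vec dimensionality

def variableFromSentence(lang, sentence):
    # Flat list of floats -> 1-D FloatTensor of length n_sentence_words * VEC_SIZE
    tensor = torch.FloatTensor(vectorFromSentence(lang, sentence))
    return Variable(tensor.cuda() if torch.cuda.is_available() else tensor)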



# in the train method, I am passing a vector instead of an index to the encoder;
# input_variable is one flat vector, so I take one word (VEC_SIZE floats) per step
    for ei in range(input_length // VEC_SIZE):
        encoder_output, encoder_hidden = encoder(input_variable[ei*VEC_SIZE:(ei+1)*VEC_SIZE], encoder_hidden)
        encoder_outputs[ei] = encoder_output[0][0]

Now I can't figure out how to change the encoder to take a vector instead of an index. This is my encoder right now:

class EncoderRNN(nn.Module):
    def __init__(self, input_size, hidden_size, n_layers=1):
        super(EncoderRNN, self).__init__()
        self.n_layers = n_layers
        self.hidden_size = hidden_size

        self.embedding = nn.Embedding(input_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

I get this error:

TypeError: torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.cuda.FloatTensor), but expected (torch.cuda.FloatTensor source, int dim, torch.cuda.LongTensor index)

I also tried using VEC_SIZE instead of input_size, to no avail. As far as I can tell, nn.Embedding is an index lookup (torch.index_select under the hood), so it expects a LongTensor of indices and fails on my FloatTensor of vectors no matter what sizes I pass.

Here is the traceback:

Traceback (most recent call last):
  File "scapula_generation_pretrained_vectors.py", line 710, in <module>
    trainIters(encoder1, attn_decoder1, 50000, print_every=1000)
  File "scapula_generation_pretrained_vectors.py", line 566, in trainIters
    decoder, encoder_optimizer, decoder_optimizer, criterion)
  File "scapula_generation_pretrained_vectors.py", line 472, in train
    encoder_output, encoder_hidden = encoder(input_variable[ei*VEC_SIZE:(ei+1)*VEC_SIZE], encoder_hidden)
  File "/home/sagar/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "scapula_generation_pretrained_vectors.py", line 214, in forward
    embedded = self.embedding(input).view(1, 1, -1)
  File "/home/sagar/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sagar/anaconda3/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 94, in forward
    self.scale_grad_by_freq, self.sparse
  File "/home/sagar/anaconda3/lib/python3.6/site-packages/torch/nn/_functions/thnn/sparse.py", line 53, in forward
    output = torch.index_select(weight, 0, indices.view(-1))

How do I write the encoder and decoder so that they consume word2vec embeddings? A few issues are involved: for example, the output of the encoder GRU must be a vector of size VEC_SIZE, and in the decoder my loss must be computed with some similarity measure. I think I will go with cosine similarity, but before that I have to make sure the decoder takes the encoder's output and produces a vector of size VEC_SIZE.
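For reference, this is the direction I am considering: drop the nn.Embedding lookup entirely and let the GRU take the pretrained vector directly, so its input size becomes VEC_SIZE instead of hidden_size. This is an untested sketch of mine, not working code:

import torch
import torch.nn as nn
from torch.autograd import Variable

class VecEncoderRNN(nn.Module):
    def __init__(self, vec_size, hidden_size, n_layers=1):
        super(VecEncoderRNN, self).__init__()
        self.hidden_size = hidden_size
        self.n_layers = n_layers
        # No nn.Embedding: the input is already a dense word2vec vector,
        # so the GRU input size is VEC_SIZE rather than hidden_size
        self.gru = nn.GRU(vec_size, hidden_size, num_layers=n_layers)

    def forward(self, input, hidden):
        # input: FloatTensor of VEC_SIZE floats -> (seq_len=1, batch=1, VEC_SIZE)
        output, hidden = self.gru(input.view(1, 1, -1), hidden)
        return output, hidden

    def initHidden(self):
        hidden = Variable(torch.zeros(self.n_layers, 1, self.hidden_size))
        return hidden.cuda() if torch.cuda.is_available() else hidden

On the decoder side I would mirror this with an nn.Linear(hidden_size, VEC_SIZE) on top of the GRU output, and replace NLLLoss with something like 1 - F.cosine_similarity(predicted, target) so the loss is a similarity measure over vectors. But I have not verified any of this.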

If anyone has already done this exercise from the tutorial and has code ready to share, I would be grateful.

0 Answers:

There are no answers yet.