Word-level LSTM seq2seq machine translation outputs the same word repeated as the translated text

Time: 2020-06-09 19:35:09

Tags: python tensorflow keras nlp seq2seq

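When I feed an input to the translation model, I get a single word repeated as the output. I am new to ML and am trying to learn NLP. I cannot diagnose the problem. Please help.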

Code for the translation function:

def translate_seq(input_word):
  input_seq = input_tokenizer.texts_to_sequences([input_word])
  input_seq = pad_sequences(input_seq, maxlen=max_input_length, padding='pre')

  # predicting encoder state values
  states_value = encoder_model.predict(input_seq)

  #creating an empty array for output sequence
  target_seq = np.zeros((1,1))
  target_seq[0,0] = word2idx_outputs['sos']
  print(target_seq)
  eos = word2idx_outputs['eos']
  print(eos)

  output_sentence = []

  for i in range(max_output_length):
    output_tokens, h, c = decoder_model.predict([target_seq] + states_value)
    idx = np.argmax(output_tokens[0,0,:])  # since we know the output shape:(number of inputs, length of the output sentence, the number of words in the output)

    if eos == idx:
      break

    word=''

    if idx > 0:
      word = idx2word_target[idx]
      output_sentence.append(word)

      target_seq[0,0] = idx
      state_values = [h,c]

  return ' '.join(output_sentence)

feed_text = 'you are not fired'
response = translate_seq(feed_text)
print("input sentence: ", feed_text)
print('translates sentence: ', response)

input sentence:  you are not fired
translates sentence:  navonsnous navonsnous navonsnous navonsnous navonsnous navonsnous navonsnous navonsnous navonsnous navonsnous navonsnous

Here is the link to the Google Colab notebook: https://colab.research.google.com/gist/somvirs57/a1c25ed93ebee04a162282f7a8250f47/copy-of-untitled1.ipynb
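
One possible cause, judging only from the function shown above and not from the notebook: the loop stores the new decoder states in `state_values`, but `decoder_model.predict` is called with `states_value`. If that is a typo, the decoder is restarted from the encoder states at every step, and once the fed-back token stops changing, every step sees identical inputs and emits the same word. The `target_seq[0,0] = idx` update also sits inside `if idx > 0`, which may or may not be intended. The toy sketch below reproduces the effect without Keras. `TARGET`, `toy_decoder_step`, `greedy_decode`, and `carry_state` are hypothetical names invented for illustration, and the integer "state" is only a stand-in for the LSTM's `[h, c]`.

```python
from typing import Callable, List, Tuple

# Toy "translation" as token ids. Assumption: id 1 plays the role of 'sos'
# and id 2 the role of 'eos'; none of this comes from the notebook.
TARGET = [3, 4, 5, 2]


def toy_decoder_step(prev_token: int, state: int) -> Tuple[List[float], int]:
    """Hypothetical stand-in for decoder_model.predict.

    Returns scores over a 6-word vocabulary and the next state.
    The state is just a step counter, so the output depends on it.
    """
    scores = [0.0] * 6
    scores[TARGET[min(state, len(TARGET) - 1)]] = 1.0
    return scores, state + 1


def greedy_decode(
    step_fn: Callable[[int, int], Tuple[List[float], int]],
    sos: int,
    eos: int,
    init_state: int,
    max_len: int,
    carry_state: bool = True,
) -> List[int]:
    """Greedy decoding loop that feeds the token and state back in."""
    token, state = sos, init_state
    output = []
    for _ in range(max_len):
        scores, new_state = step_fn(token, state)
        idx = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if idx == eos:
            break
        output.append(idx)
        token = idx  # feed the predicted token back in on every step
        if carry_state:
            # Reassign the SAME variable that is passed to step_fn.
            state = new_state
    return output


# State carried forward: decoding advances and stops at eos.
print(greedy_decode(toy_decoder_step, sos=1, eos=2, init_state=0, max_len=5))
# State never updated (as with a misspelled variable): one token repeats.
print(greedy_decode(toy_decoder_step, sos=1, eos=2, init_state=0, max_len=5,
                    carry_state=False))
```

If the same reasoning applies to the real model, assigning `states_value = [h, c]` at loop level, next to the `target_seq` update, would be the first thing to try. This only addresses the decoding loop; a model that is undertrained can also produce repeated words.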

0 answers:

No answers yet