Maximum sequence length problem when making predictions with an LSTM model

Asked: 2019-01-29 12:46:53

Tags: tensorflow deep-learning nlp lstm

I have a TensorFlow LSTM model for predicting market sentiment. I built the model with a maximum sequence length of 150 (the maximum number of words). To make predictions, I wrote the following code:

import numpy as np

batchSize = 32
maxSeqLength = 150

def getSentenceMatrix(sentence):
    # cleanSentences() and the vocabulary wordsList are defined earlier in the script.
    # Build a (batchSize, maxSeqLength) matrix of word indices; only row 0 is filled.
    sentenceMatrix = np.zeros([batchSize, maxSeqLength], dtype='int32')
    cleanedSentence = cleanSentences(sentence)
    # Keep only the first 150 words of the input.
    cleanedSentence = ' '.join(cleanedSentence.split()[:150])
    split = cleanedSentence.split()
    for indexCounter, word in enumerate(split):
        try:
            sentenceMatrix[0, indexCounter] = wordsList.index(word)
        except ValueError:
            sentenceMatrix[0, indexCounter] = 399999  # Index reserved for unknown words
    return sentenceMatrix

input_text = "example data"
inputMatrix = getSentenceMatrix(input_text)
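For context, with a TensorFlow 1.x graph the matrix returned above is normally fed to the network through a session run; the sess, input_data, and prediction names below are assumptions for illustration and are not taken from the post:

# Hypothetical prediction call: input_data is assumed to be the graph's
# input placeholder and prediction its output logits.
predictedSentiment = sess.run(prediction, {input_data: inputMatrix})[0]
# For a two-class sentiment model, the larger logit decides the label.
sentiment = 'positive' if predictedSentiment[0] > predictedSentiment[1] else 'negative'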

In this code, I truncate the input text to 150 words and ignore the rest of the data:
cleanedSentence = ' '.join(cleanedSentence.split()[:150])

Because of this truncation, my predictions are wrong. Can anyone help me solve this problem?
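One workaround worth trying, sketched below under the same assumed sess, input_data, and prediction objects as above, is to split a long input into consecutive 150-word windows, score each window with the existing getSentenceMatrix(), and average the window-level outputs instead of discarding everything after the first 150 words. Whether simple averaging is appropriate depends on the data, so treat this as a sketch rather than a verified fix:

def predictLongSentence(sentence):
    # Score a sentence of arbitrary length by averaging per-window predictions.
    words = cleanSentences(sentence).split()
    # Break the word list into consecutive windows of at most maxSeqLength words.
    windows = [' '.join(words[i:i + maxSeqLength])
               for i in range(0, len(words), maxSeqLength)] or ['']
    scores = []
    for window in windows:
        matrix = getSentenceMatrix(window)
        # Hypothetical tensors, as above: input_data placeholder, prediction output.
        scores.append(sess.run(prediction, {input_data: matrix})[0])
    # Average the outputs across windows so no part of the input is ignored.
    return np.mean(scores, axis=0)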

0 Answers:

No answers yet.