Question

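I have a TensorFlow LSTM model for predicting market sentiment. I built the model with a maximum sequence length of 150 (the maximum number of words). To make predictions, I wrote the following code: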

import numpy as np

batchSize = 32
maxSeqLength = 150

# wordsList and cleanSentences are not shown here; they are defined elsewhere.
def getSentenceMatrix(sentence):
    # Only row 0 of the batch is filled; the other rows stay zero.
    sentenceMatrix = np.zeros([batchSize, maxSeqLength], dtype='int32')
    cleanedSentence = cleanSentences(sentence)
    # Keep only the first 150 words (maxSeqLength); the rest is discarded.
    cleanedSentence = ' '.join(cleanedSentence.split()[:150])
    split = cleanedSentence.split()
    for indexCounter, word in enumerate(split):
        try:
            sentenceMatrix[0, indexCounter] = wordsList.index(word)
        except ValueError:
            sentenceMatrix[0, indexCounter] = 399999  # Vector for unknown words
    return sentenceMatrix

input_text = "example data"
inputMatrix = getSentenceMatrix(input_text)

In the code, I truncate the input text to 150 words and ignore the remaining data:
cleanedSentence = ' '.join(cleanedSentence.split()[:150])

Because of this, my predictions are wrong. Can anyone help me solve this problem?
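One possible way to avoid discarding everything past 150 words, offered as a sketch under assumptions and not as a confirmed fix: split the word indices into consecutive windows of maxSeqLength, place one window in each row of the batch (only row 0 is used above), and average the per-window predictions. The helper names `chunk_into_windows` and `average_window_predictions` are hypothetical, and averaging is just one simple way to combine the outputs. A model trained only on truncated text may still behave differently on later windows.

```python
import numpy as np

def chunk_into_windows(token_ids, max_seq_length=150, batch_size=32):
    """Split token ids into zero-padded windows, one window per batch row."""
    windows = [token_ids[i:i + max_seq_length]
               for i in range(0, len(token_ids), max_seq_length)]
    if len(windows) > batch_size:
        # Assumption: longer inputs would need several batches (not shown).
        raise ValueError("input needs more windows than one batch can hold")
    matrix = np.zeros((batch_size, max_seq_length), dtype='int32')
    for row, window in enumerate(windows):
        matrix[row, :len(window)] = window
    return matrix, len(windows)

def average_window_predictions(predictions, num_windows):
    """Average the model outputs of the rows that hold real windows."""
    return np.asarray(predictions)[:num_windows].mean(axis=0)
```

Here `token_ids` would be the list of `wordsList` indices for the whole cleaned text, and `predictions` the model output for the returned matrix, with one row per batch entry.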

Maximum sequence length problem when making predictions with an LSTM model

0 answers: