I have a conceptual hurdle here. I am creating an LSTM with tf.nn.dynamic_rnn.
Example: next-word prediction trained on the input ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"] with num_unrollings = 4.
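To make the setup concrete, here is a minimal sketch of how the training pairs could be built from that sentence. It assumes a sliding window with a stride of one word and targets equal to the inputs shifted by one position; the helper name make_windows is hypothetical and not part of TensorFlow.

```python
# Build (inputs, targets) pairs for next-word prediction.
# Assumption: stride-1 sliding windows; targets are the inputs shifted by one word.
words = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
num_unrollings = 4


def make_windows(tokens, width):
    """Return a list of (inputs, targets) pairs, each `width` words long."""
    pairs = []
    # The last usable start leaves room for one extra word as the final target.
    for start in range(len(tokens) - width):
        inputs = tokens[start:start + width]
        targets = tokens[start + 1:start + width + 1]
        pairs.append((inputs, targets))
    return pairs


windows = make_windows(words, num_unrollings)
for inputs, targets in windows:
    print(inputs, "->", targets)
```

With nine words and a width of 4, this yields five pairs, the first being ["The", "quick", "brown", "fox"] -> ["quick", "brown", "fox", "jumps"].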
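If I test by sending in one unrolling (4 words long), ["The", "quick", "brown", "fox"], I would like the output to be ["jumps"]. However, the output will also be 4 words long, because the input and output vector shapes have to match. So, ideally, the output would be ["quick", "brown", "fox", "jumps"]. Yes, that is too much to ask, because the algorithm has no way of returning "quick" after seeing only the word "The". And in this case we cannot go back in time and update the first output word to "quick" after seeing all 4 input words. We can only predict forward in time.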
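The "forward in time only" point can be illustrated with a toy recurrent cell. This is a sketch under stated assumptions, not an LSTM: it is a scalar cell with arbitrary fixed weights (w_in and w_rec are made up for illustration), but it shares the property that matters here, namely that the output at step t depends only on inputs 0..t.

```python
import math


def toy_rnn(inputs, w_in=0.5, w_rec=0.8):
    """Scalar recurrent cell: h_t = tanh(w_in * x_t + w_rec * h_{t-1})."""
    h = 0.0
    outputs = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs


a = toy_rnn([1.0, 2.0, 3.0, 4.0])
b = toy_rnn([1.0, 2.0, -3.0, 9.0])  # same first two inputs, different later ones

# Steps 0 and 1 never see steps 2 and 3, so their outputs are identical.
print(a[:2] == b[:2])  # True
# From step 2 onward the inputs differ, so the outputs differ too.
print(a[2:] == b[2:])  # False
```

Changing the later inputs leaves the earlier outputs untouched, which is why the first output cannot be revised after the whole window has been seen.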
Question 1: The input X[0,1,2,3] returns y[0,1,2,3]. How are y[0,1,2] predicted? Is y[0] predicted after the model has seen only X[0]? Should I expect a first prediction vector such as ["man","delivery","fox","jumps"]?
Question 2: How should I unroll in batches at test time? If I batch the input as:
[["The", "quick", "brown", "fox"], ["jumps", "over", "the", "lazy"]]
should I expect the output to be:
["man","delivery","fox","jumps"],["high","the","fence","dog"]
?
If I batch it this way instead:
[["The", "quick", "brown", "fox"], ["quick", "brown", "fox", "jumps"],["brown", "fox", "jumps", "over"]]
should I expect the output to be:
["man","delivery","fox","jumps"],["delivery","fox","jumps","over"],["bag", "jumps", "over", "the"]
?
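One thing that seems relevant to both layouts: as far as I understand, the rows of a batch are processed independently, so each row starts from its own initial state unless one is supplied (tf.nn.dynamic_rnn takes an initial_state argument and returns the final state alongside the outputs). The sketch below illustrates the effect with a toy scalar cell rather than a real LSTM; the weights and the numeric "sequence" are made up for illustration.

```python
import math


def toy_rnn(inputs, h0=0.0, w_in=0.5, w_rec=0.8):
    """Toy scalar cell; returns (per-step outputs, final state)."""
    h = h0
    outputs = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs, h


sequence = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
full, _ = toy_rnn(sequence)  # reference: one pass over all 8 steps

# First chunk of 4 steps, starting from a zero state.
row0, state0 = toy_rnn(sequence[:4])

# Second chunk as an independent batch row: it also starts from zero.
row1_fresh, _ = toy_rnn(sequence[4:])

# Second chunk started from the final state of the first chunk.
row1_carried, _ = toy_rnn(sequence[4:], h0=state0)

print(row0 + row1_carried == full)  # True
print(row0 + row1_fresh == full)    # False
```

In this toy setting, the second row of a non-overlapping batch only reproduces the full-sequence outputs when the state is carried over; started fresh, its first output is computed from a single input, much like the first output of the first row.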
How should I construct the input to get the (flattened) output:
["quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
?
Should I concatenate the y[3] outputs?
According to this answer, it seems so: Predicting the next word using the LSTM ptb model tensorflow example. I am just looking for further input.
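For reference, the concatenation idea can be sketched as follows. The trained model is replaced by a hypothetical stand-in, oracle_predict, which simply looks up the true next words; a real model would return its own (possibly wrong) predictions. The sketch assumes stride-1 windows and keeps all four outputs of the first window, then only the last output, y[3], of every later window.

```python
words = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
num_unrollings = 4


def oracle_predict(start, width):
    """Hypothetical stand-in for the model: returns the true next word
    for each position of the window that begins at index `start`."""
    return words[start + 1:start + width + 1]


flattened = []
for start in range(len(words) - num_unrollings):
    predicted = oracle_predict(start, num_unrollings)
    if start == 0:
        # The first window contributes every step's prediction.
        flattened.extend(predicted)
    else:
        # Later windows contribute only their final step, y[3].
        flattened.append(predicted[-1])

print(flattened)
```

With the oracle in place, flattened equals the input sentence shifted by one word, ending in "dog". Only the last step of each window has seen a full four words of context, so the first three entries are the weakest predictions in a real model.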