Question

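I'm trying to use TimeDistributed and LSTM in a way that apparently isn't supported. Has anyone else run into the same problem? If so, is there a way around it?

The problem is this: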

>>> from keras.layers import Highway, Input, TimeDistributed
>>> input1 = Input(shape=(3, 5))
>>> input2 = Input(shape=(1, 5))
>>> highway_layer = Highway(activation='relu', name='highway')
>>> distributed_highway_layer = TimeDistributed(highway_layer, name='distributed_highway')
>>> highway_input1 = distributed_highway_layer(input1)
>>> highway_input2 = distributed_highway_layer(input2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 494, in __call__
    self.assert_input_compatibility(x)
  File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 434, in assert_input_compatibility
    str(x_shape))
Exception: Input 0 is incompatible with layer distributed_highway: expected shape=(None, 3, 5), found shape=(None, 1, 5)

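Because the TimeDistributed layer is built when it is applied to the first input, it assumes a specific input shape. That assumption fails when the layer is applied to a second input with a different number of time steps, and the call crashes.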

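Where do you run into this problem? I'm trying to re-implement a reading comprehension model that uses a highway layer on top of the word embeddings for both the question and the text passage. So my question tensor has shape (batch_size, num_question_words, embedding_dim), and my passage tensor has shape (batch_size, num_passage_words, embedding_dim). I want a single highway layer applied to the embeddings, and I want it to be the same highway layer for the question and the passage. The code above seems like a natural way to do that (assuming input1 and input2 are actually the outputs of a previous layer, namely the word embeddings for my question and passage). But it doesn't work.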
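To make the intended computation concrete, here is a minimal NumPy sketch of one highway transform shared between a question tensor and a passage tensor with different numbers of words. It is not Keras code and does not reproduce the Keras Highway layer exactly: the weight names (W_h, b_h, W_t, b_t), the random initialization, and the helper function highway are all assumptions made for illustration, using the usual highway form y = t * h + (1 - t) * x. The point is only that the weights depend on embedding_dim and not on the number of time steps.

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_dim = 5

# One set of (hypothetical) highway weights; their shapes depend only on
# embedding_dim, never on the number of words in the input.
W_h = rng.normal(size=(embedding_dim, embedding_dim))
b_h = np.zeros(embedding_dim)
W_t = rng.normal(size=(embedding_dim, embedding_dim))
b_t = np.zeros(embedding_dim)


def highway(x):
    """Apply the same highway transform to every time step of x.

    x has shape (batch_size, num_words, embedding_dim).
    """
    h = np.maximum(x @ W_h + b_h, 0.0)          # relu candidate output
    t = 1.0 / (1.0 + np.exp(-(x @ W_t + b_t)))  # sigmoid transform gate
    return t * h + (1.0 - t) * x                # gate between h and the input


# Same batch size and embedding_dim, different numbers of words (3 vs. 1).
question = rng.normal(size=(2, 3, embedding_dim))
passage = rng.normal(size=(2, 1, embedding_dim))

highway_question = highway(question)
highway_passage = highway(passage)
print(highway_question.shape, highway_passage.shape)  # (2, 3, 5) (2, 1, 5)
```

Nothing in the math ties the shared weights to a fixed number of time steps, which is why I would expect a shared layer to be usable on both inputs.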

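I can think of one workaround, which is to instantiate two separate TimeDistributed objects that both wrap the same underlying Highway layer. Because TimeDistributed has no parameters of its own, this actually works, but it's a bit ugly. If this only applied to TimeDistributed it wouldn't be so bad, but you run into the same problem with any recurrent layer: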

>>> from keras.layers import LSTM, Input
>>> input1 = Input(shape=(3, 5))
>>> input2 = Input(shape=(1, 5))
>>> lstm = LSTM(10)
>>> lstm(input1)
>>> lstm(input2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 494, in __call__
    self.assert_input_compatibility(x)
  File "/home/mattg/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py", line 434, in assert_input_compatibility
    str(x_shape))
Exception: Input 0 is incompatible with layer lstm: expected shape=(None, 3, 5), found shape=(None, 1, 5)

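I don't know of a way around this one. Any ideas?

The best (undocumented) solution I've found is to replace the 3 and the 1 in the input shapes above with None. That solves some of the problems, but I also need to compute a similarity matrix between the words in input1 and the words in input2, and I get errors when I try to construct this matrix with K.dot. The result would have shape (None, None, None), that is, (batch_size, num_question_words, num_passage_words).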
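For reference, this is the similarity matrix I mean, written as a NumPy sketch rather than Keras backend code. The concrete sizes and the random tensors are placeholders chosen for illustration. The product has to contract over embedding_dim separately for each batch element. A plain np.dot on two 3-D arrays instead pairs every batch element with every other one. I suspect, but have not confirmed, that K.dot behaves analogously, and that K.batch_dot may be the closer match for the batched product. I have not tested either with None dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder sizes; in the real model the word counts vary per input.
batch_size, num_question_words, num_passage_words, embedding_dim = 2, 3, 4, 5
question = rng.normal(size=(batch_size, num_question_words, embedding_dim))
passage = rng.normal(size=(batch_size, num_passage_words, embedding_dim))

# Batched dot product: for each batch element b, dot every question word
# vector with every passage word vector (contracting over embedding_dim).
similarity = np.einsum('bqd,bpd->bqp', question, passage)
print(similarity.shape)  # (2, 3, 4)

# A plain np.dot is not batched: it combines all batch elements of
# `question` with all batch elements of `passage`, giving a 4-D array.
naive = np.dot(question, passage.transpose(0, 2, 1))
print(naive.shape)  # (2, 3, 2, 4)
```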

Keras: using TimeDistributed or recurrent layers on inputs with different numbers of time steps

0 answers: