LSTM followed by Mean Pooling

Date: 2016-04-05 13:50:54

Tags: machine-learning neural-network deep-learning keras recurrent-neural-network

I'm using Keras 1.0. My question is exactly the same as this one (How to implement a Mean Pooling layer in Keras), but the answer there doesn't seem to be sufficient for me.

I want to implement this network: [image: network diagram]

The following code does not work:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
pool = AveragePooling1D()(lstm)
output = Dense(1, activation='sigmoid')(pool)

If I don't set return_sequences=True, I get this error when calling AveragePooling1D():

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/PATH/keras/engine/topology.py", line 462, in __call__
    self.assert_input_compatibility(x)
  File "/PATH/keras/engine/topology.py", line 382, in assert_input_compatibility
    str(K.ndim(x)))
Exception: ('Input 0 is incompatible with layer averagepooling1d_6: expected ndim=3', ' found ndim=2')

Otherwise, I get this error when calling Dense():

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/PATH/keras/engine/topology.py", line 456, in __call__
    self.build(input_shapes[0])
  File "/fs/clip-arqat/mossaab/trec/liveqa/cmu/venv/lib/python2.7/site-packages/keras/layers/core.py", line 512, in build
    assert len(input_shape) == 2
AssertionError

6 Answers:

Answer 0 (score: 4):

Please try this (I hope it will solve your problem :)):

http://keras.io/layers/core/#timedistributedmerge

Answer 1 (score: 4):

I just tried to implement the same model as the original poster, and I'm using Keras 2.0.3. Mean pooling after the LSTM works when I use GlobalAveragePooling1D; just make sure return_sequences=True is set in the LSTM layer. Give it a try!
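As a minimal sketch of that fix (written against the Keras 2-style tf.keras API; the size constants are placeholders for illustration, not values from the original post):

```python
import numpy as np
import tensorflow as tf

# Placeholder sizes, chosen only for illustration
max_sent_len, vocab_size, word_embedding_size, hidden_state_size = 7, 100, 8, 16

sequence = tf.keras.layers.Input(shape=(max_sent_len,), dtype='int32')
embedded = tf.keras.layers.Embedding(vocab_size, word_embedding_size)(sequence)
# return_sequences=True keeps the per-timestep outputs that pooling needs
lstm = tf.keras.layers.LSTM(hidden_state_size, return_sequences=True)(embedded)
# Averages over the time axis: (batch, time, hidden) -> (batch, hidden)
pool = tf.keras.layers.GlobalAveragePooling1D()(lstm)
output = tf.keras.layers.Dense(1, activation='sigmoid')(pool)
model = tf.keras.Model(sequence, output)
```

Calling model.predict on a (batch, max_sent_len) integer array then yields one sigmoid score per sequence, which matches the shape the original Dense(1) output expected.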

Answer 2 (score: 3):

I think the accepted answer is basically wrong. A solution was found here: https://github.com/fchollet/keras/issues/2151 However, it only works with the theano backend. I have modified the code so that it supports both theano and tensorflow.

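As a rough numpy sketch of the arithmetic such a masked mean-pooling layer performs (hypothetical helper name, not the code from the linked issue): the mean is taken only over real timesteps, so zero-padded steps don't drag the average down.

```python
import numpy as np

def masked_temporal_mean(x, mask):
    """Mean over the time axis, counting only unmasked (non-padded) steps.

    x:    (batch, time, features) array of per-timestep LSTM outputs
    mask: (batch, time) array of 0/1 flags, 1 = real timestep
    """
    mask = mask[:, :, None].astype(x.dtype)           # broadcast over features
    return (x * mask).sum(axis=1) / mask.sum(axis=1)  # -> (batch, features)

# Two sequences of length 4; the second is padding after step 2.
x = np.arange(24, dtype="float32").reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 1],
                 [1, 1, 0, 0]])
pooled = masked_temporal_mean(x, mask)  # shape (2, 3)
```

A plain AveragePooling1D or GlobalAveragePooling1D divides by the full sequence length regardless of padding, which is why a mask-aware layer like this is needed for variable-length input.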

Answer 3 (score: 2):

Adding TimeDistributed(Dense(1)) helps:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
distributed = TimeDistributed(Dense(1))(lstm)
pool = AveragePooling1D()(distributed)
output = Dense(1, activation='sigmoid')(pool)

Answer 4 (score: 0):

Thanks, I ran into this problem too, but I don't think the TimeDistributed layer works the way I want. You can try Luke Guye's TemporalMeanPooling layer instead; it works for me. Here is an example:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
pool = TemporalMeanPooling()(lstm)
output = Dense(1, activation='sigmoid')(pool)

Answer 5 (score: 0):

Very late to the party, but tf.keras.layers.AveragePooling1D with a suitable pool_size argument also seems to return the correct result.

This works on the example shared by bobchennan in this issue.

# create sample data
A=np.array([[1,2,3],[4,5,6],[0,0,0],[0,0,0],[0,0,0]])
B=np.array([[1,3,0],[4,0,0],[0,0,1],[0,0,0],[0,0,0]])
C=np.array([A,B]).astype("float32")
# expected answer (for temporal mean)
np.mean(C, axis=1)

The output is

array([[1. , 1.4, 1.8],
       [1. , 0.6, 0.2]], dtype=float32)

Now using AveragePooling1D:

model = tf.keras.models.Sequential([
        tf.keras.layers.AveragePooling1D(pool_size=5)
])
model.predict(C)

The output is

array([[[1. , 1.4, 1.8]],
       [[1. , 0.6, 0.2]]], dtype=float32)

A few points to consider:

  • pool_size should be equal to the number of steps/timesteps of the recurrent layer.
  • The output has shape (batch_size, downsampled_steps, features), i.e. it carries an extra downsampled_steps dimension. If you set pool_size equal to the number of timesteps in the recurrent layer, that dimension will always be 1.
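To feed that pooled result into a Dense layer as in the original model, the extra length-1 dimension has to go. Shown here with plain numpy on the output values from above (in Keras, a Flatten or Reshape layer would do the same job in the model):

```python
import numpy as np

# AveragePooling1D with pool_size == timesteps leaves shape (batch, 1, features)
pooled = np.array([[[1.0, 1.4, 1.8]],
                   [[1.0, 0.6, 0.2]]], dtype="float32")

# Drop the length-1 downsampled_steps axis -> (batch, features)
flat = np.squeeze(pooled, axis=1)
```

After the squeeze, the result matches the (batch_size, features) output of GlobalAveragePooling1D from Answer 1.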