Defining multiple different LSTMs in Keras or TensorFlow

Asked: 2018-08-18 02:56:06

Tags: python tensorflow keras lstm

I have been playing around with LSTMs and RNNs for a while now, and I have tried them in both TensorFlow and Keras. But something really confuses me. In TensorFlow, for example, if I want to define multiple RNNs as decoders in a for loop, I can write code like this:

with tf.variable_scope("decoder-rnn") as vs:
    # We use an LSTM cell
    cell_utterance = tf.nn.rnn_cell.LSTMCell(hparams.rnn_dim,
                                             forget_bias=2.0,
                                             use_peepholes=True,
                                             state_is_tuple=True)

    # Run all utterances through the RNN batch by batch
    # TODO: Needs to be parallelized
    all_utterances_encoded = []
    for i in range(batch_size):
        temp_outputs, temp_states = tf.nn.dynamic_rnn(cell_utterance, utterances_embedded[:,i],
                                                      utterances_len[i], dtype=tf.float32)
        all_utterances_encoded.append(temp_states[1])  # since it's a tuple, use the hidden state

    all_utterances_encoded = tf.stack(all_utterances_encoded, axis=0)

However, this seems to wrap the same LSTM cell in every dynamic_rnn call, so all of them share one set of weights. Is there a way to wrap a different LSTM cell in each call?
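One way to get independent weights per call, assuming the TF 1.x API used above, is to create a fresh cell inside the loop, each under its own variable scope. A minimal sketch:

all_utterances_encoded = []
for i in range(batch_size):
    # a separate variable scope per iteration gives each cell its own weights
    with tf.variable_scope("decoder-rnn-%d" % i):
        cell_i = tf.nn.rnn_cell.LSTMCell(hparams.rnn_dim,
                                         forget_bias=2.0,
                                         use_peepholes=True,
                                         state_is_tuple=True)
        temp_outputs, temp_states = tf.nn.dynamic_rnn(
            cell_i, utterances_embedded[:, i],
            utterances_len[i], dtype=tf.float32)
    all_utterances_encoded.append(temp_states[1])  # hidden state h of the tuple
all_utterances_encoded = tf.stack(all_utterances_encoded, axis=0)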

The same question applies to Keras: when I define ten GRUs in a for loop, the printed output seems to show only two distinct GRU objects. Can anyone give some hints? Thanks.

for i in range(10):
    gru = GRU(NUM_FILTERS, recurrent_activation='sigmoid', activation=None,
              return_sequences=False)  # (embed)
    print(gru)

<keras.layers.recurrent.GRU object at 0x7fcc1fef82b0>
<keras.layers.recurrent.GRU object at 0x7fcc1fef84a8>
<keras.layers.recurrent.GRU object at 0x7fcc1fef82b0>
<keras.layers.recurrent.GRU object at 0x7fcc1fef84a8>
<keras.layers.recurrent.GRU object at 0x7fcc1fef82b0>
<keras.layers.recurrent.GRU object at 0x7fcc1fef84a8>
<keras.layers.recurrent.GRU object at 0x7fcc1fef82b0>
<keras.layers.recurrent.GRU object at 0x7fcc1fef84a8>
<keras.layers.recurrent.GRU object at 0x7fcc1fef82b0>
<keras.layers.recurrent.GRU object at 0x7fcc1fef84a8>

1 answer:

Answer 0 (score: 0)

Because you do not use each GRU layer after creating it, the only reference to it is the gru variable. As soon as the next iteration of the for loop rebinds gru to a newly created layer, the previous layer has no references left, so it is treated as garbage and destroyed. You can store the layers in a list to prevent this:

gru_layers = []
for i in range(10):
    gru = GRU(NUM_FILTERS, recurrent_activation='sigmoid', activation=None,
              return_sequences=False)
    gru_layers.append(gru)  # keep a reference so the layer is not garbage-collected
    print(gru)

Sample output:

<keras.layers.recurrent.GRU object at 0x7f8940398f98>
<keras.layers.recurrent.GRU object at 0x7f8940398ef0>
<keras.layers.recurrent.GRU object at 0x7f8940398400>
<keras.layers.recurrent.GRU object at 0x7f8940398978>
<keras.layers.recurrent.GRU object at 0x7f8940398a90>
<keras.layers.recurrent.GRU object at 0x7f89403b6048>
<keras.layers.recurrent.GRU object at 0x7f89403b6320>
<keras.layers.recurrent.GRU object at 0x7f89403b6278>
<keras.layers.recurrent.GRU object at 0x7f89403b6710>
<keras.layers.recurrent.GRU object at 0x7f89403b69b0>
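As an aside, the two alternating addresses in the question's output do not mean that only two GRU layers ever existed: CPython frees each unreferenced layer and often hands the same memory address to a later allocation. The same pattern can be reproduced with plain objects:

for i in range(6):
    x = object()       # rebinding x drops the last reference to the previous object
    print(hex(id(x)))  # freed addresses are typically reused, so two ids alternate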

Or better yet, if you intend to stack multiple GRU layers on top of each other, you can do something like this:

inputs = Input(shape=(seq_len,))                   # seq_len assumed defined elsewhere
embed = Embedding(vocab_size, embed_dim)(inputs)   # call the layer on a tensor
prev_layer = embed
for i in range(10):
    gru = GRU(NUM_FILTERS, recurrent_activation='sigmoid', activation=None,
              return_sequences=True)(prev_layer)
    prev_layer = gru

Don't forget that when stacking RNN layers you should pass return_sequences=True on all layers except the last, so that each layer outputs all timesteps for the next one.
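To make that concrete, here is a minimal end-to-end sketch (the dimensions are hypothetical placeholders, not from the original answer) in which every GRU except the last returns the full sequence:

from keras.layers import Input, Embedding, GRU, Dense
from keras.models import Model

# hypothetical dimensions, chosen only to make the sketch runnable
vocab_size, embed_dim, seq_len, NUM_FILTERS = 10000, 128, 50, 64

inputs = Input(shape=(seq_len,))
x = Embedding(vocab_size, embed_dim)(inputs)
for i in range(9):
    # every intermediate GRU emits the full sequence (3D output)
    x = GRU(NUM_FILTERS, recurrent_activation='sigmoid',
            return_sequences=True)(x)
# the topmost GRU returns only the last timestep (2D output)
x = GRU(NUM_FILTERS, recurrent_activation='sigmoid',
        return_sequences=False)(x)
outputs = Dense(1, activation='sigmoid')(x)
model = Model(inputs=inputs, outputs=outputs)
model.summary()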