Structure of the initial state for a stacked LSTM

Time: 2019-04-19 17:18:47

Tags: tensorflow keras lstm recurrent-neural-network

What structure does the initial state need to have for a multi-layer/stacked RNN in TensorFlow (1.13.1) when using the tf.keras.layers.RNN API?

I tried the following:

lstm_cell_sizes = [256, 256, 256]
lstm_cells = [tf.keras.layers.LSTMCell(size) for size in lstm_cell_sizes]

state_init = [tf.placeholder(tf.float32, shape=[None] + cell.state_size) for cell in lstm_cells]

tf.keras.layers.RNN(lstm_cells, ...)(inputs, initial_state=state_init)

This results in:

ValueError: Could not pack sequence. Structure had 6 elements, but flat_sequence had 3 elements.  Structure: ([256, 256], [256, 256], [256, 256]), flat_sequence: [<tf.Tensor 'player/Placeholder:0' shape=(?, 256, 256) dtype=float32>, <tf.Tensor 'player/Placeholder_1:0' shape=(?, 256, 256) dtype=float32>, <tf.Tensor 'player/Placeholder_2:0' shape=(?, 256, 256) dtype=float32>].

If I change state_init to a flat list of tensors of shape [None, 256], I get:

ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=[InputSpec(shape=(None, 256), ndim=2), InputSpec(shape=(None, 256), ndim=2), InputSpec(shape=(None, 256), ndim=2)]; however `cell.state_size` is [[256, 256], [256, 256], [256, 256]]
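
The element counts in the two error messages can be reproduced without TensorFlow. The following is only a plain-Python sketch, assuming (as the error messages suggest) that each LSTMCell reports its state_size as a two-element list [h_size, c_size]. The names attempted_shapes, expected_shapes and count_leaves are illustrative and not part of any TensorFlow API.

```python
# Plain-Python sketch; no TensorFlow required. Sizes mirror the question.
lstm_cell_sizes = [256, 256, 256]

# Assumption: each LSTMCell reports state_size as [h_size, c_size].
state_sizes = [[size, size] for size in lstm_cell_sizes]

# What `[None] + cell.state_size` builds: ONE rank-3 shape per cell,
# so only 3 placeholders are created in total.
attempted_shapes = [[None] + state_size for state_size in state_sizes]

# What the nested state_size describes: TWO [batch, units] tensors per cell.
expected_shapes = [
    [[None, size] for size in state_size] for state_size in state_sizes
]


def count_leaves(structure):
    """Count the tensors described by a list of per-cell lists of shapes."""
    return sum(len(per_cell) for per_cell in structure)


print(attempted_shapes[0])   # shape of the single placeholder built per cell
print(expected_shapes[0])    # shapes of the h and c tensors for one cell
print(len(attempted_shapes), count_leaves(expected_shapes))
```

Read this way, the first error reports 3 tensors where 6 were expected, and the second reports a flat list where a per-cell nesting was expected.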

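The TensorFlow RNN docs are vague on this:

"You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer."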

1 Answer:

Answer 0: (score: 1)

I believe this is how you would do it in TF2:

import numpy as np                 # needed for np.arange / np.reshape below
import tensorflow.compat.v2 as tf  # If you have a newer version of TF1
#import tensorflow as tf           # If you have TF2

sentence_max_length = 5
batch_size = 3
n_hidden = 2
x = tf.constant(np.reshape(np.arange(30),(batch_size,sentence_max_length, n_hidden)), dtype = tf.float32)

stacked_lstm = tf.keras.layers.StackedRNNCells([tf.keras.layers.LSTMCell(128) for _ in range(2)])

lstm_layer = tf.keras.layers.RNN(stacked_lstm,return_state=False,return_sequences=False)

result = lstm_layer(x)
print(result)
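
The snippet above builds the stacked layer but does not pass an initial state. A minimal sketch of how one might do so follows, assuming TF 2.x with eager execution and assuming the layer accepts a nested list with one [h, c] pair per cell, mirroring the nested cell.state_size shown in the question's error message. The cell sizes and the names cell_sizes, n_features and initial_state are illustrative only, and this has not been checked against TF 1.13.1.

```python
import numpy as np
import tensorflow as tf  # assumption: TF 2.x, eager execution enabled

batch_size, time_steps, n_features = 3, 5, 2
cell_sizes = [8, 4]  # illustrative sizes, smaller than the question's 256

# Dummy input of shape (batch, time, features).
x = tf.constant(
    np.arange(batch_size * time_steps * n_features).reshape(
        batch_size, time_steps, n_features),
    dtype=tf.float32)

# Passing a list of cells lets the RNN layer stack them.
cells = [tf.keras.layers.LSTMCell(size) for size in cell_sizes]
rnn = tf.keras.layers.RNN(cells, return_sequences=False)

# One [h, c] pair per cell, each tensor of shape (batch_size, units).
initial_state = [
    [tf.zeros((batch_size, size)), tf.zeros((batch_size, size))]
    for size in cell_sizes
]

result = rnn(x, initial_state=initial_state)
print(result.shape)  # output of the last cell at the final time step
```

Zeros are used here only as placeholders for real state values; any tensors with matching shapes could be supplied instead.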