I have a working architecture written in Keras, and I want to implement the same architecture in TensorFlow. I am writing the architecture with TensorFlow, but I cannot create multiple LSTM layers.
Here is the Keras code:
input_data1 = Input(inshape, dtype='float32', name='input1')
encoder1 = TimeDistributed(Dense(128, activation='relu', name='encoder1'), name='encoder1_TD')(input_data1)
lstm1 = LSTM(256, return_sequences=True, name='lstm1')(encoder1)
lstm2 = LSTM(256, return_sequences=True, name='lstm2')(lstm1)
intermediate_data = TimeDistributed(Dense(128, activation='linear', name='decoder1'), name='decoder_TD1')(lstm2)
output_data = TimeDistributed(Dense(12, activation='linear', name='decoder2'), name='decoder_TD2')(intermediate_data)
model = Model(input_data1, output_data)
model.summary()
return model
Can someone help me with this? I can't figure out how to use MultiRNNCell. Whenever I use two or more LSTM layers, I get an error.
input shape = (batch_size, timesteps, 4)
output shape = (batch_size, timesteps, 8)
Answer 0 (score: 0)
It seems that recent changes in the TensorFlow API have made it similar to Keras, and the new tutorials focus on Keras-like solutions.
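For comparison, here is a minimal sketch of the same architecture written against the tf.keras API (the inshape value is my assumption, reconstructed from the input shape given in the question):

import tensorflow as tf

# assumed from the question: variable number of timesteps, 4 input features
inshape = (None, 4)

input_data1 = tf.keras.Input(shape=inshape, dtype='float32', name='input1')
x = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(128, activation='relu'), name='encoder1_TD')(input_data1)
x = tf.keras.layers.LSTM(256, return_sequences=True, name='lstm1')(x)
x = tf.keras.layers.LSTM(256, return_sequences=True, name='lstm2')(x)
x = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(128, activation='linear'), name='decoder_TD1')(x)
output_data = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(12, activation='linear'), name='decoder_TD2')(x)
model = tf.keras.Model(input_data1, output_data)
model.summary()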
If you need an "old-style" stack of TensorFlow LSTMs, you can use tf.nn.rnn_cell.MultiRNNCell (now deprecated and replaced by tf.keras.layers.StackedRNNCells):
import tensorflow as tf
# batch_size, time_steps, num_features and num_labels are assumed to be defined
# (for this architecture: num_features = 4, num_labels = 12)
input_data = tf.placeholder(tf.float32, [batch_size, time_steps, num_features])
label_data = tf.placeholder(tf.float32, [batch_size, time_steps, num_labels])
# a dense layer is broadcast automatically over the time dimension,
# so it behaves like Keras' TimeDistributed(Dense(...))
dense_data = tf.layers.dense(input_data, 128, activation=tf.nn.relu)
with tf.variable_scope('lstm'):
    lstm1 = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
    lstm2 = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
    lstm3 = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
    # or even more layers

    # group them into one cell
    multi_cell = tf.nn.rnn_cell.MultiRNNCell(cells=[lstm1, lstm2, lstm3], state_is_tuple=True)
    rnn_result, _ = tf.nn.dynamic_rnn(multi_cell, dense_data, dtype=tf.float32)
# activation=None means a linear activation in tf.layers.dense
td_data_1 = tf.layers.dense(rnn_result, 128, activation=None)
td_data_2 = tf.layers.dense(td_data_1, 12, activation=None)
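As a quick sanity check (my addition, assuming the placeholder shapes above), the static shapes at this point should mirror the Keras model:

print(dense_data.shape)  # (batch_size, time_steps, 128)
print(rnn_result.shape)  # (batch_size, time_steps, 256)
print(td_data_2.shape)   # (batch_size, time_steps, 12)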
Now you should define some loss, but it is not clear which one you intend to use, so I have omitted that part (depending on your targets it could be, for example, sigmoid_cross_entropy_with_logits). This is not a runnable example as it stands, but if needed I can provide one for a conventional dataset such as MNIST:
loss = tf.nn...
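# for example, since every output above uses a linear activation, one plausible
# choice (an assumption on my part, not from the question) is a plain MSE loss:
# loss = tf.losses.mean_squared_error(labels=label_data, predictions=td_data_2)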
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
init_op = tf.global_variables_initializer()
Initialization and training (simplified):
sess = tf.Session()
sess.run(init_op)

for i in range(len(x_tr) // batch_size):
    sess.run(
        train_op,
        feed_dict={
            input_data: x_tr[i*batch_size:(i+1)*batch_size],
            label_data: y_tr_cat[i*batch_size:(i+1)*batch_size],
        }
    )
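After training, predictions can be fetched from the same graph; here is a sketch, where x_te is a hypothetical test batch with the same shape as the input placeholder:

preds = sess.run(td_data_2, feed_dict={input_data: x_te[:batch_size]})
print(preds.shape)  # (batch_size, time_steps, 12)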
Let me know if you would rather avoid the deprecated TensorFlow layers and need a TF 2.0-style solution instead.