Applying Dense to each output of an RNN layer

Date: 2017-09-27 14:02:21

Tags: tensorflow keras

I'm trying to convert the following Keras code to pure TensorFlow, but I can't figure out how to apply a dense layer to each time step of the bidirectional RNN's output.

Here is the Keras code in question:

self.model = Sequential()
self.model.add(Bidirectional(LSTM(nr_out, return_sequences=True,
                                  dropout_W=dropout, dropout_U=dropout),
                             input_shape=(max_length, nr_out)))
self.model.add(TimeDistributed(Dense(nr_out, activation='relu', init='he_normal')))
self.model.add(TimeDistributed(Dropout(0.2)))

Here is my initial TensorFlow code:

lstm_cell_fwd = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
lstm_cell_bwd = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
outputs, output_state_fw, output_state_bw = rnn.static_bidirectional_rnn(lstm_cell_fwd, lstm_cell_bwd, inputs=sequence, dtype=tf.float64)

In general, if I only wanted to predict the last state, I would do something like this:

# outputs[-1] is the last time step's output, shape [batch, 2 * num_hidden] for the bidirectional RNN
logits = tf.matmul(outputs[-1], weights['out']) + biases['out']

What is the best way to express a TimeDistributed layer in TensorFlow?

1 Answer:

Answer 0 (score: 1):

Try updating your cell definitions to:

lstm_cell_fwd = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
lstm_cell_fwd = rnn.DropoutWrapper(lstm_cell_fwd, input_keep_prob=dropout, output_keep_prob=dropout)  # if you want to keep dropout, which seems to be in your Keras model
lstm_cell_fwd = rnn.OutputProjectionWrapper(lstm_cell_fwd, nr_out)  # FC output layer
# Similarly for lstm_cell_bwd
outputs, output_state_fw, output_state_bw = rnn.static_bidirectional_rnn(lstm_cell_fwd, lstm_cell_bwd, ...)

It looks like your Keras definition uses dropout, so I have added a dropout wrapper here. I believe dropout_W from Keras corresponds to input_keep_prob in TF, and dropout_U from Keras to output_keep_prob, with one caveat: the Keras arguments are fractions to drop, while the TF arguments are probabilities to keep, so keep_prob = 1 - dropout rate. For the dropout wrapper, you need to define a placeholder:

dropout = tf.placeholder(tf.float32, [], name='dropout')

and feed in a value when you run the network: some keep probability below 1.0 for training, and typically dropout=1.0 for validation, testing, and using the network.
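
For reference, here is a minimal end-to-end sketch of the above using the TF 1.x contrib API. The sizes (max_length, input_dim, num_hidden, nr_out) and the make_cell helper are illustrative assumptions, not taken from the original code:

import tensorflow as tf
from tensorflow.contrib import rnn

max_length, input_dim, num_hidden, nr_out = 20, 64, 128, 64  # illustrative sizes

# static_bidirectional_rnn expects a Python list of per-time-step tensors
x = tf.placeholder(tf.float32, [None, max_length, input_dim], name='x')
sequence = tf.unstack(x, max_length, axis=1)

# keep probability: below 1.0 during training, 1.0 for validation/testing
dropout = tf.placeholder(tf.float32, [], name='dropout')

def make_cell():
    cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
    cell = rnn.DropoutWrapper(cell, input_keep_prob=dropout, output_keep_prob=dropout)
    # FC layer applied to every time step's output; note that each direction
    # is projected separately, so the concatenated bidirectional outputs
    # have shape [batch, 2 * nr_out] per step
    return rnn.OutputProjectionWrapper(cell, nr_out)

outputs, _, _ = rnn.static_bidirectional_rnn(make_cell(), make_cell(),
                                             inputs=sequence, dtype=tf.float32)

# training (Keras Dropout(0.2) corresponds to keep_prob=0.8):
#   sess.run(train_op, feed_dict={x: train_batch, dropout: 0.8})
# validation/testing (dropout disabled):
#   sess.run(loss, feed_dict={x: val_batch, dropout: 1.0})

If you want to match the Keras model more closely (a single Dense shared across time steps and applied after the forward and backward outputs are concatenated, as TimeDistributed does), a common alternative is to drop the OutputProjectionWrapper, flatten the time axis, apply one dense layer, and reshape back. Here tf.layers.dense stands in for explicit weights/biases variables:

# with plain (non-projected) cells, outputs is a list of max_length tensors,
# each of shape [batch, 2 * num_hidden]
stacked = tf.stack(outputs, axis=1)               # [batch, max_length, 2 * num_hidden]
flat = tf.reshape(stacked, [-1, 2 * num_hidden])  # merge batch and time axes
dense = tf.layers.dense(flat, nr_out, activation=tf.nn.relu)
time_distributed = tf.reshape(dense, [-1, max_length, nr_out])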