I am trying to convert the following Keras code to pure TensorFlow, but I can't figure out how to apply a dense layer to every time step of the bidirectional RNN's output.
Here is the Keras code in question:
self.model = Sequential()
self.model.add(Bidirectional(LSTM(nr_out, return_sequences=True,
                                  dropout_W=dropout, dropout_U=dropout),
                             input_shape=(max_length, nr_out)))
self.model.add(TimeDistributed(Dense(nr_out, activation='relu', init='he_normal')))
self.model.add(TimeDistributed(Dropout(0.2)))
Here is my initial TensorFlow code:
lstm_cell_fwd = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
lstm_cell_bwd = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
# `sequence` is a list of max_length tensors of shape [batch, features];
# each element of `outputs` has shape [batch, 2 * num_hidden]
outputs, output_state_fw, output_state_bw = rnn.static_bidirectional_rnn(
    lstm_cell_fwd, lstm_cell_bwd, inputs=sequence, dtype=tf.float64)
In general, if I only wanted to predict from the last state, I would do something like:
logits = tf.matmul(outputs[-1], weights['out']) + biases['out']
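where weights['out'] and biases['out'] would be defined along these lines (a rough sketch; num_classes is just a placeholder name for the output size, and the first dimension is 2 * num_hidden because the forward and backward outputs are concatenated):
weights = {'out': tf.Variable(tf.random_normal([2 * num_hidden, num_classes], dtype=tf.float64))}
biases = {'out': tf.Variable(tf.zeros([num_classes], dtype=tf.float64))}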
What is the best way to express a TimeDistributed layer in TensorFlow?
Answer 0 (score: 1)
Try updating your cell definitions to:
lstm_cell_fwd = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
# Keep dropout, which seems to be in your Keras model (these are *keep* probabilities)
lstm_cell_fwd = rnn.DropoutWrapper(lstm_cell_fwd,
                                   input_keep_prob=dropout,  # ~ Keras dropout_W
                                   state_keep_prob=dropout)  # ~ Keras dropout_U (recurrent)
# FC output layer at every time step (linear, unlike the ReLU Dense in Keras)
lstm_cell_fwd = rnn.OutputProjectionWrapper(lstm_cell_fwd, nr_out)
# Similarly for lstm_cell_bwd
outputs, output_state_fw, output_state_bw = rnn.static_bidirectional_rnn(
    lstm_cell_fwd, lstm_cell_bwd, ...)
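One caveat: because each direction is wrapped separately, the forward and backward outputs are projected to nr_out independently and then concatenated, giving 2 * nr_out features per time step, whereas your Keras model applies a single Dense(nr_out) with ReLU to the concatenated output. A closer equivalent is to stack the raw outputs and let a dense layer act on the last axis, which is exactly what TimeDistributed(Dense) does. A minimal sketch, assuming TF 1.x tf.layers, that outputs comes from your original unwrapped cells, and that is_training is a boolean placeholder you define:
stacked = tf.stack(outputs, axis=1)  # list of [batch, 2 * num_hidden] -> [batch, max_length, 2 * num_hidden]
# tf.layers.dense acts on the last axis, applying the same weights at every
# time step; an He-style kernel_initializer can be passed to match init='he_normal'
projected = tf.layers.dense(stacked, nr_out, activation=tf.nn.relu)
projected = tf.layers.dropout(projected, rate=0.2, training=is_training)  # TimeDistributed(Dropout(0.2))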
It looks like your Keras model uses dropout, so I added a DropoutWrapper above. I believe dropout_W from Keras corresponds to input_keep_prob in TF, and dropout_U (dropout on the recurrent connections) corresponds most closely to state_keep_prob; output_keep_prob would drop the cell's output instead. Keep in mind these are keep probabilities, i.e. 1 minus the Keras drop rate. For the dropout, you need to define a placeholder:
dropout = tf.placeholder(tf.float32, [], name='dropout')  # use tf.float64 if your inputs are float64, as in your snippet
and feed a keep probability below 1.0 when you run the network for training, and typically dropout=1.0
for validation, testing, and inference, which disables dropout entirely.
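For example, with a hypothetical training loop (x, y, batch_x, batch_y, val_x, val_y, train_op, and accuracy stand in for whatever your graph and data pipeline define):
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Training step: keep 80% of activations, i.e. 20% dropout
    sess.run(train_op, feed_dict={x: batch_x, y: batch_y, dropout: 0.8})
    # Evaluation: a keep probability of 1.0 disables dropout
    acc = sess.run(accuracy, feed_dict={x: val_x, y: val_y, dropout: 1.0})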