How do I add dropout layers between stacked LSTM cells in TensorFlow?

Asked: 2017-03-21 09:29:13

Tags: tensorflow lstm recurrent-neural-network

I can create an RNN with two stacked LSTM layers like this:

lstm_cell1 = tf.nn.rnn_cell.BasicLSTMCell(50)
lstm_cell2 = tf.nn.rnn_cell.BasicLSTMCell(100)
lstm_net = tf.nn.rnn_cell.MultiRNNCell([lstm_cell1, lstm_cell2])

But now I also want to add a dropout layer after each LSTM cell, something like:

tf.nn.rnn_cell.MultiRNNCell([tf.nn.dropout(lstm_cell1, 0.8), tf.nn.dropout(lstm_cell2, 0.8)])

How can I achieve this?

2 Answers:

Answer 0 (score: 6)

# Wrap each cell in a DropoutWrapper before stacking
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units)
lstm_dropout = tf.contrib.rnn.DropoutWrapper(lstm_cell,
                                             input_keep_prob=keep_prob,
                                             output_keep_prob=keep_prob)
lstm_layers = tf.contrib.rnn.MultiRNNCell([lstm_dropout] * 2)
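Here keep_prob is assumed to be defined elsewhere; a common pattern (my addition, not part of the answer above) is to make it a placeholder with a default of 1.0, so dropout is active during training and switched off at evaluation:

import tensorflow as tf

# keep_prob defaults to 1.0 (no dropout); feed e.g. 0.8 during training
keep_prob = tf.placeholder_with_default(1.0, shape=[], name="keep_prob")

lstm_cell = tf.contrib.rnn.BasicLSTMCell(50)
lstm_dropout = tf.contrib.rnn.DropoutWrapper(lstm_cell,
                                             input_keep_prob=keep_prob,
                                             output_keep_prob=keep_prob)

During training you would pass feed_dict={keep_prob: 0.8} to sess.run; at evaluation time the default of 1.0 leaves the cell inputs and outputs untouched.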

Answer 1 (score: 3)

Here is code for stacked LSTMs with dropout. Mashood Tanveer's answer is good enough, but I want to add that with MultiRNNCell you had better not use [cell]*num_layer. That expression puts the same LSTM instance into the list num_layer times, so all layers share a single set of weights, which can cause a dimension mismatch when a layer's input size differs from its hidden size. Unless you know exactly what dimensions each layer will see, I suggest building one cell per layer, like this:

[tf.contrib.rnn.BasicLSTMCell(hidden_size) for _ in range(num_layers)]

# Forward cells: dropout on the outputs only
fw_lstms = []
for _ in range(num_layers):
    cell = tf.contrib.rnn.BasicLSTMCell(hidden_size)
    cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=1 - drop_prob)
    fw_lstms.append(cell)

# Backward cells: dropout on both inputs and outputs
bw_lstms = []
for _ in range(num_layers):
    cell = tf.contrib.rnn.BasicLSTMCell(hidden_size)
    cell = tf.contrib.rnn.DropoutWrapper(cell,
                                         input_keep_prob=1 - drop_prob,
                                         output_keep_prob=1 - drop_prob)
    bw_lstms.append(cell)

# Initial (zero) states for each layer in both directions
fw_init_state_ls = [lstm.zero_state(batch_size, tf.float32) for lstm in fw_lstms]
bw_init_state_ls = [lstm.zero_state(batch_size, tf.float32) for lstm in bw_lstms]

# Stacked bidirectional RNN; dropout is applied inside each wrapped cell
outputs, final_states_fw, final_states_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    cells_fw=fw_lstms,
    cells_bw=bw_lstms,
    inputs=inputs,
    initial_states_fw=fw_init_state_ls,
    initial_states_bw=bw_init_state_ls)

# Concatenate the hidden states (h) of the top forward and backward layers
bi_final_state = tf.concat([final_states_fw[-1][1], final_states_bw[-1][1]], 1)
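For the original question (a unidirectional MultiRNNCell rather than a bidirectional stack), the same per-layer construction applies. Below is a minimal self-contained sketch; the input placeholder, its feature size, and the keep_prob name are my own assumptions, chosen to mirror the question:

import tensorflow as tf

keep_prob = tf.placeholder_with_default(1.0, shape=[], name="keep_prob")
inputs = tf.placeholder(tf.float32, [None, None, 30])  # [batch, time, features]; feature size assumed

# Build a separate wrapped cell per layer so the layers do not share weights
cells = []
for size in [50, 100]:  # mirrors lstm_cell1 / lstm_cell2 from the question
    cell = tf.contrib.rnn.BasicLSTMCell(size)
    cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)
    cells.append(cell)

lstm_net = tf.contrib.rnn.MultiRNNCell(cells)
outputs, final_state = tf.nn.dynamic_rnn(lstm_net, inputs, dtype=tf.float32)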