How to implement batch normalization in an LSTM in TensorFlow

Posted: 2017-10-24 16:13:38

Tags: python tensorflow neural-network lstm rnn

My LSTM network currently looks like this.

import tensorflow as tf

# CELL_SIZE, INPUT_SIZE, TIME_STEP and the input tensor tf_x are defined elsewhere
rnn_cell = tf.contrib.rnn.BasicRNNCell(num_units=CELL_SIZE)
init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32)  # very first hidden state
outputs, final_s = tf.nn.dynamic_rnn(
    rnn_cell,              # cell you have chosen
    tf_x,                  # input
    initial_state=init_s,  # the initial hidden state
    time_major=False,      # False: (batch, time step, input); True: (time step, batch, input)
)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(outputs, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Normally I apply tf.layers.batch_normalization for batch normalization, but I am not sure whether this works for an LSTM network.

b1 = tf.layers.batch_normalization(outputs, momentum=0.4, training=True)
d1 = tf.layers.dropout(b1, rate=0.4, training=True)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(d1, [-1, CELL_SIZE])                       
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])
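One detail that matters when using tf.layers.batch_normalization with training=True: the layer keeps moving-average statistics that are only updated through ops placed in tf.GraphKeys.UPDATE_OPS, so the training op must depend on them. A minimal sketch, assuming a loss tensor and learning rate LR defined elsewhere:

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # run the moving-mean/variance updates together with each training step
    train_op = tf.train.AdamOptimizer(LR).minimize(loss)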

2 Answers:

Answer 0 (score: 1)

If you want to use batch norm for an RNN (LSTM or GRU), you can check out this implementation, or read the full explanation in the blog post.
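For a rough feel of what this involves (a minimal sketch, not the linked implementation; is_training is an assumed placeholder), one simple variant batch-normalizes the inputs before they enter the recurrence, since applying batch norm inside the recurrence itself requires writing a custom RNNCell:

is_training = tf.placeholder(tf.bool)  # switch between batch and moving statistics
tf_x_bn = tf.layers.batch_normalization(tf_x, training=is_training)
outputs, final_s = tf.nn.dynamic_rnn(
    rnn_cell,
    tf_x_bn,               # batch-normalized inputs instead of tf_x
    initial_state=init_s,
    time_major=False,
)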

However, for sequential data layer normalization has advantages over batch norm. Specifically, "the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent networks" (from Ba et al., Layer Normalization).

Layer normalization normalizes the summed inputs within each layer. You can check out this implementation of a layer-normalized GRU cell.
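As a rough illustration of the idea (a minimal sketch, not the linked GRU code; assumes a 2D input of shape (batch, features)):

def layer_norm(x, scope="layer_norm", epsilon=1e-5):
    # Normalize each sample over its feature axis, then rescale with a
    # learned gain and shift, as in Ba et al.
    with tf.variable_scope(scope):
        mean, variance = tf.nn.moments(x, axes=[1], keep_dims=True)
        gain = tf.get_variable("gain", shape=[int(x.get_shape()[1])],
                               initializer=tf.ones_initializer())
        shift = tf.get_variable("shift", shape=[int(x.get_shape()[1])],
                                initializer=tf.zeros_initializer())
        return gain * (x - mean) / tf.sqrt(variance + epsilon) + shift

Note that the statistics are computed per sample rather than per mini-batch, which is why the technique is independent of batch size.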

Answer 1 (score: 0)

Based on this paper: "Layer Normalization" - Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton.

TensorFlow now ships with tf.contrib.rnn.LayerNormBasicLSTMCell, an LSTM cell with layer normalization and recurrent dropout.

The documentation can be found here.
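Swapped into the question's graph, usage looks roughly like this (a minimal sketch; the arguments beyond num_units are optional and shown with assumed values):

lstm_cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
    num_units=CELL_SIZE,
    layer_norm=True,          # apply layer normalization inside the cell
    dropout_keep_prob=0.6,    # recurrent dropout on the cell update
)
init_s = lstm_cell.zero_state(batch_size=1, dtype=tf.float32)
outputs, final_s = tf.nn.dynamic_rnn(
    lstm_cell, tf_x, initial_state=init_s, time_major=False)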