When training a neural network with mini-batch learning, should a new dropout mask be generated for each sample in the mini-batch, or should the same mask be applied to all samples in the mini-batch? Does anything change if one uses variational dropout in a recurrent neural network? As far as I understand, that means using the same mask (input, output, state) across all time steps?
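To make the two options concrete, here is a minimal NumPy sketch (the names and shapes are my own, not TensorFlow's) contrasting an independent mask per sample with a single mask shared by the whole mini-batch:

```python
import numpy as np

rng = np.random.RandomState(0)
keep_prob = 0.5
batch, features = 4, 3
x = rng.randn(batch, features)

# Per-sample dropout: an independent Bernoulli mask entry for every
# element, so each sample in the mini-batch gets its own zero pattern.
per_sample_mask = (rng.rand(batch, features) < keep_prob).astype(x.dtype)
y_per_sample = x * per_sample_mask / keep_prob

# Shared dropout: one mask of shape (1, features), broadcast over the
# batch, so the same units are zeroed for every sample.
shared_mask = (rng.rand(1, features) < keep_prob).astype(x.dtype)
y_shared = x * shared_mask / keep_prob

print("per-sample mask:\n", per_sample_mask)
print("shared mask (broadcast):\n", np.broadcast_to(shared_mask, x.shape))
```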
I am asking because of TensorFlow's DropoutWrapper: with variational_recurrent=False a mask seems to be applied per sample, while with variational_recurrent=True one mask seems to be applied per mini-batch? What is the logic behind these two options?
Here is some code that reproduces the statements above:
import tensorflow as tf
import numpy as np

tf.reset_default_graph()
tf.set_random_seed(0)
np.random.seed(0)

print("Tensorflow:", tf.__version__)
print("NumPy:", np.__version__)

variational_recurrent = False  # True

n_input = 2
n_neurons = 1
output_keep_prob = 0.5

X0 = tf.placeholder(tf.float32, shape=(None, n_input))
X1 = tf.placeholder(tf.float32, shape=(None, n_input))

basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons,
                                         activation=tf.nn.tanh)
drop_cell = tf.nn.rnn_cell.DropoutWrapper(basic_cell,
                                          output_keep_prob=output_keep_prob,
                                          variational_recurrent=variational_recurrent,
                                          dtype=tf.float32)
output_seqs, states = tf.nn.static_rnn(drop_cell,
                                       [X0, X1],
                                       dtype=tf.float32)
# Get the respective dropout mask tensors from the graph.
g = tf.get_default_graph()
if variational_recurrent:
    mask_0 = g.get_tensor_by_name("rnn/Floor:0")
    mask_1 = g.get_tensor_by_name("rnn/Floor_1:0")
else:
    mask_0 = g.get_tensor_by_name("rnn/dropout/Floor:0")
    mask_1 = g.get_tensor_by_name("rnn/dropout_1/Floor:0")
init = tf.global_variables_initializer()

n_batch = 10
v_X0 = np.random.randn(n_batch, n_input)
v_X1 = np.random.randn(n_batch, n_input)

with tf.Session() as sess:
    sess.run(init)
    v_mask_0, v_mask_1, v_output_seqs = \
        sess.run([mask_0, mask_1, output_seqs], feed_dict={X0: v_X0, X1: v_X1})
print("variational_recurrent:", variational_recurrent)
print("mask_0:", v_mask_0.flatten())
print("mask_1:", v_mask_1.flatten())
print("output_seq_0:", v_output_seqs[0].flatten())
print("output_seq_1:", v_output_seqs[1].flatten())
For example, variational_recurrent = False gives:
Tensorflow: 1.11.0
NumPy: 1.15.2
variational_recurrent: False
mask_0: [1. 0. 0. 1. 0. 0. 1. 0. 0. 0.]
mask_1: [1. 1. 1. 1. 1. 1. 1. 0. 0. 0.]
output_seq_0: [ 1.4711434 0. -0. 0.42718613 0. 0.
0.7002963 0. 0. -0. ]
output_seq_1: [-0.1192586 0.30452126 -0.7493641 -0.07260141 1.8993055 1.2405635
-1.8763325 0. 0. -0. ]
while variational_recurrent = True produces:
variational_recurrent: True
mask_0: [0.]
mask_1: [0.]
output_seq_0: [ 0. -0. 0. 0. -0. -0. 0. -0. 0. 0.]
output_seq_1: [-0. 0. 0. 0. -0. -0. 0. -0. -0. -0.]
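For comparison, variational dropout in the sense of Gal & Ghahramani draws one mask per sequence and reuses it at every time step. The sketch below is my own NumPy simplification of that idea, not the internals of DropoutWrapper:

```python
import numpy as np

rng = np.random.RandomState(0)
keep_prob = 0.5
n_steps, batch, units = 3, 5, 4

# Sample ONE mask per sample and per unit, then reuse it at every
# time step -- the defining property of variational dropout for RNNs.
mask = (rng.rand(batch, units) < keep_prob).astype(np.float32)

hidden_states = [rng.randn(batch, units).astype(np.float32)
                 for _ in range(n_steps)]
dropped = [h * mask / keep_prob for h in hidden_states]  # same mask each step
```

Under this reading each sample still gets its own mask and only the time dimension shares it, which is why the single-element mask printed above surprised me.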