I'm implementing a CNN for the Kaggle Digit Recognizer competition. The architecture is:

conv5x5(filters=32) - conv5x5(filters=32) - maxpool2x2 - conv3x3(filters=64) - conv3x3(filters=64) - maxpool2x2 - FC(512) - dropout(keep_prob=0.25) - softmax(10)

This architecture reaches 99.728% accuracy on Digit Recognizer.
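For reference, the same stack written compactly with tf.layers (just a sketch to make the layout concrete, not my actual code; it assumes 28x28x1 inputs):

    # Sketch of the baseline architecture above using tf.layers (assumes 28x28x1 inputs)
    import tensorflow as tf

    def digits_cnn(X, training):
        A = tf.layers.conv2d(X, 32, 5, padding='same', activation=tf.nn.relu)
        A = tf.layers.conv2d(A, 32, 5, padding='same', activation=tf.nn.relu)
        A = tf.layers.max_pooling2d(A, 2, 2)
        A = tf.layers.conv2d(A, 64, 3, padding='same', activation=tf.nn.relu)
        A = tf.layers.conv2d(A, 64, 3, padding='same', activation=tf.nn.relu)
        A = tf.layers.max_pooling2d(A, 2, 2)
        A = tf.layers.flatten(A)
        A = tf.layers.dense(A, 512, activation=tf.nn.relu)
        A = tf.layers.dropout(A, rate=0.75, training=training)  # keep_prob = 0.25
        return tf.layers.dense(A, 10)  # logits; softmax is applied in the loss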
I want to add batch normalization to the conv layers. I add it like this:
#Forward propagation of the whole CNN#
import tensorflow as tf

def forward_propagation(X, keep_prob_l5, BN_is_training, conv_params,
                        convstride1_shape, convstride2_shape, pool2_shape, poolstride2_shape,
                        convstride3_shape, convstride4_shape, pool4_shape, poolstride4_shape,
                        n_5, n_out):
    W1 = conv_params['W1']
    b1 = conv_params['b1']
    W2 = conv_params['W2']
    b2 = conv_params['b2']
    W3 = conv_params['W3']
    b3 = conv_params['b3']
    W4 = conv_params['W4']
    b4 = conv_params['b4']

    # Block 1: two conv5x5(32) layers, each conv -> batch norm -> ReLU, then 2x2 max-pool
    Z1 = tf.nn.bias_add(tf.nn.conv2d(X, W1, strides=convstride1_shape, padding='SAME'), b1, data_format='NHWC')
    Z1_batchnorm = tf.contrib.layers.batch_norm(Z1, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A1 = tf.nn.relu(Z1_batchnorm)
    Z2 = tf.nn.bias_add(tf.nn.conv2d(A1, W2, strides=convstride2_shape, padding='SAME'), b2, data_format='NHWC')
    Z2_batchnorm = tf.contrib.layers.batch_norm(Z2, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A2 = tf.nn.relu(Z2_batchnorm)
    P2 = tf.nn.max_pool(A2, ksize=pool2_shape, strides=poolstride2_shape, padding='SAME')

    # Block 2: two conv3x3(64) layers, each conv -> batch norm -> ReLU, then 2x2 max-pool
    Z3 = tf.nn.bias_add(tf.nn.conv2d(P2, W3, strides=convstride3_shape, padding='SAME'), b3, data_format='NHWC')
    Z3_batchnorm = tf.contrib.layers.batch_norm(Z3, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A3 = tf.nn.relu(Z3_batchnorm)
    Z4 = tf.nn.bias_add(tf.nn.conv2d(A3, W4, strides=convstride4_shape, padding='SAME'), b4, data_format='NHWC')
    Z4_batchnorm = tf.contrib.layers.batch_norm(Z4, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A4 = tf.nn.relu(Z4_batchnorm)
    P4 = tf.nn.max_pool(A4, ksize=pool4_shape, strides=poolstride4_shape, padding='SAME')

    # Head: flatten -> FC(512) with ReLU -> dropout -> linear logits for softmax(10)
    P4_flatten = tf.contrib.layers.flatten(P4)
    A5 = tf.contrib.layers.fully_connected(P4_flatten, n_5, activation_fn=tf.nn.relu)
    A5_drop = tf.nn.dropout(A5, keep_prob_l5)
    Z_out = tf.contrib.layers.fully_connected(A5_drop, n_out, activation_fn=None)
    return tf.transpose(Z_out)
BN_is_training is a placeholder that is fed True at training time and False at inference time.
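For completeness, the placeholder is created and fed roughly like this (a simplified sketch; names like Z_out, batch_X, test_X are illustrative, not my exact variable names):

    BN_is_training = tf.placeholder(tf.bool, name='BN_is_training')

    # Training step: batch norm uses batch statistics and updates its moving averages
    sess.run([optimizer, cost],
             feed_dict={X: batch_X, Y: batch_Y, keep_prob_l5: 0.25, BN_is_training: True})

    # Inference: batch norm uses the accumulated moving mean/variance, dropout disabled
    logits = sess.run(Z_out,
                      feed_dict={X: test_X, keep_prob_l5: 1.0, BN_is_training: False})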
The update_ops are set up as follows:
#Define the optimization method#
# Run the batch-norm moving-average updates as part of every training step
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer = tf.train.AdamOptimizer(learning_rate=decayed_learning_rate).minimize(cost)
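As a sanity check, the collected update ops can be inspected before training (a diagnostic sketch; tf.contrib.layers.batch_norm registers its updates in tf.GraphKeys.UPDATE_OPS by default):

    # With four batch_norm layers there should be eight update ops:
    # one moving-mean and one moving-variance update per layer
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    for op in update_ops:
        print(op.name)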
However, the results are really strange: the accuracy never improves and the cost keeps increasing. Did I make a mistake somewhere in setting up batch norm?

Thanks :D