Question

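I am using TensorFlow version r0.11 and trying to use batch normalization (tf.contrib.layers.batch_norm()) in a feed-forward conv net. As a starting point, I followed the discussion in the following GitHub issue. It seems the 'is_training', 'reuse' and 'updates_collections' flags are still confusing in practice, partly because good use cases are lacking. My problem, however, is that the loss does not decrease once I add a batch norm layer.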

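I structured the code following the CIFAR example, and I am running it in a multi-GPU fashion (for training). I have one script for training (similar to cifar10_multigpu.py) and one for testing (similar to cifar10_eval.py).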

for i in xrange(all_flags.num_gpus): # Number of GPUs is 2
    with tf.device('/gpu:%d' % i):
        with tf.name_scope('%s_%d' % (all_flags.TOWER_NAME, i)) as scope:
            # Calculate the loss for one tower of the model. This function
            # constructs the entire model but shares the variables across all
            # towers.
            loss = _tower_loss(inputs[i], labels[i], scope) 

            # Reuse variables for the next tower. This line makes it possible
            tf.get_variable_scope().reuse_variables()

            # More stuff happening like compute_gradients (inside gpus loop),
            # averaging gradients (outside gpus loop), applying them (outside
            # gpus loop)

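The inference/model building happens in the function _tower_loss (via a nested function). Below is an example of that function; in reality I use many more layers and neurons.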

def inference(inputs): #(This gets called from _tower_loss())
    # conv1
    with tf.variable_scope('conv1') as scope:
        kernel = # define kernel
        conv = tf.nn.conv2d(inputs, kernel, strides=[1, 1, 1, 1], padding='SAME')
        biases = _variable_on_gpu('biases', [64], tf.constant_initializer(0.0))
        preactivation = tf.nn.bias_add(conv, biases)
        # ReLU.
        conv1 = tf.nn.relu(preactivation, name=scope.name)

    # pool1
    pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1], padding='SAME', name='pool1')

    # Similarly more conv+pool and then fcs and finally logits
    return logits

I want to perform batch normalization, so I passed an extra placeholder input parameter to the '_tower_loss' and 'inference' functions.

def inference(inputs, is_training):
    # BN1
    with tf.variable_scope('norm0') as scope:
        # Note that I'm using the default for 'updates_collections'
        # which is None
        norm0 = tf.contrib.layers.batch_norm(inputs, is_training=is_training,
                scope=scope, reuse=None)

    # conv1
    with tf.variable_scope('conv1') as scope:
        kernel = # define kernel
        conv = tf.nn.conv2d(norm0, kernel, strides=[1, 1, 1, 1], padding='SAME')
        # Rest is same

I also added normalization layers to the fc layers.
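
To make the role of the 'is_training' flag concrete, here is a minimal pure-Python sketch of batch normalization over a single feature. It is not the tf.contrib.layers.batch_norm implementation: the function name batch_norm_1d, the state dictionary, and the decay and epsilon values are assumptions chosen for illustration. In training mode the batch statistics are used and the moving averages are updated; in inference mode only the stored moving averages are used.

```python
import math


def batch_norm_1d(batch, state, is_training, decay=0.9, epsilon=1e-3):
    """Normalize a list of floats; `state` holds the moving mean and variance.

    Hypothetical helper for illustration only, not a TensorFlow API.
    """
    if is_training:
        # Use the statistics of the current mini-batch.
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
        # Update the moving averages. In this sketch the update happens in
        # place; in a graph framework it is typically a separate update op.
        state["moving_mean"] = decay * state["moving_mean"] + (1 - decay) * mean
        state["moving_var"] = decay * state["moving_var"] + (1 - decay) * var
    else:
        # Use the accumulated statistics; the batch statistics are ignored.
        mean = state["moving_mean"]
        var = state["moving_var"]
    return [(x - mean) / math.sqrt(var + epsilon) for x in batch]


# Assumed initial values: zero mean, unit variance.
state = {"moving_mean": 0.0, "moving_var": 1.0}
batch = [2.0, 4.0, 6.0, 8.0]

train_out = batch_norm_1d(batch, state, is_training=True)
eval_out = batch_norm_1d(batch, state, is_training=False)
```

In this sketch, if the training branch never ran, the inference branch would keep normalizing with the initial moving values, which is one way training and evaluation outputs can diverge.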

In the training code, the relevant statements look like this ...

variable_averages = tf.train.ExponentialMovingAverage(0.9999, global_step)
variables_averages_op = variable_averages.apply(tf.trainable_variables())

train_op = tf.group(apply_gradient_op, variables_averages_op)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # Line A

...

sess.run([train_op, loss, update_ops], feed_dict={is_training: True}) # Line B

...

When there is no batch normalization, Line A is absent and 'update_ops' is not run in the session in Line B.
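
As a point of reference for 'variables_averages_op', the update that tf.train.ExponentialMovingAverage applies can be sketched in plain Python. This assumes the documented behavior that, when num_updates is supplied (global_step here), the effective decay is min(decay, (1 + num_updates) / (10 + num_updates)). The names effective_decay and ema_update are hypothetical helpers, not TensorFlow calls.

```python
def effective_decay(decay, num_updates):
    # Early in training the decay is lowered so the average moves faster.
    return min(decay, (1.0 + num_updates) / (10.0 + num_updates))


def ema_update(shadow, value, decay, num_updates):
    # shadow <- d * shadow + (1 - d) * value, with d the effective decay.
    d = effective_decay(decay, num_updates)
    return d * shadow + (1.0 - d) * value


# Track a variable that stays at 1.0 for three steps, starting from 0.0.
shadow = 0.0
for step in range(3):
    shadow = ema_update(shadow, value=1.0, decay=0.9999, num_updates=step)
```

This averaging only covers the variables passed to apply(), here tf.trainable_variables(). As far as I understand, it is separate from the moving mean and variance that batch normalization maintains, since those are not trainable variables.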

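What I see is that without batch normalization the loss starts at around 6.5 and keeps decreasing to near 0, but with batch normalization the loss stops decreasing after 200 or 300 (mini-batch) iterations and stays stuck at around 5.5. Speed-wise, I would say the performance is the same. I am not sure what the problem is. I tried different learning rates (I am using the Adam optimizer) with no effect. I am not sure whether 'variables_averages_op' and 'update_ops' are interfering with each other. Any help would be appreciated.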

Loss stops decreasing when using TensorFlow batch normalization

0 answers: