Why can't I use 1x1 convolution layers?

Date: 2016-12-22 10:58:09

Tags: image-processing tensorflow deep-learning convolution

I am trying to modify the tensorflow slim overfeat network to classify small images: the image size is 60 * 60 and there are 3 classes. I am using tensorflow v0.12 on Ubuntu 14.04 with a TITAN X GPU.

My first network is:



    import tensorflow as tf

    slim = tf.contrib.slim
    trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)


    def overfeat_arg_scope(weight_decay=0.0005):
      with slim.arg_scope(
          [slim.conv2d, slim.fully_connected],
          activation_fn=tf.nn.relu,
          weights_regularizer=slim.l2_regularizer(weight_decay),
          biases_initializer=tf.constant_initializer()):
        with slim.arg_scope([slim.conv2d], padding='SAME'):
          with slim.arg_scope([slim.max_pool2d], padding='VALID') as arg_sc:
            return arg_sc


    def overfeat(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.5,
                 spatial_squeeze=False,
                 reuse=None,
                 scope='overfeat'):
      with tf.variable_scope(scope, 'overfeat', [inputs], reuse=reuse) as sc:
        end_points_collection = sc.name + '_end_points'
        # Collect outputs for conv2d, fully_connected and max_pool2d
        with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
                            outputs_collections=end_points_collection):
          net = slim.conv2d(inputs, 64, 3, padding='VALID',
                            scope='conv11')
          net = slim.conv2d(net, 128, 3, padding='VALID',
                            scope='conv12')
          net = slim.max_pool2d(net, 2, scope='pool1')
          net = slim.conv2d(net, 128, 3, padding='VALID', scope='conv2')
          net = slim.max_pool2d(net, 2, scope='pool2')
          net = slim.conv2d(net, 256, 3, scope='conv3')
          net = slim.conv2d(net, 256, 3, scope='conv4')
          net = slim.conv2d(net, 256, 3, scope='conv5')
          net = slim.max_pool2d(net, 2, scope='pool5')
          with slim.arg_scope([slim.conv2d],
                              weights_initializer=trunc_normal(0.005),
                              biases_initializer=tf.constant_initializer(0.1)):
            # Use conv2d instead of fully_connected layers.
            net = slim.conv2d(net, 512, 3, padding='VALID', scope='fc6')
            net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                               scope='dropout6')
            net = slim.conv2d(net, 1024, 1, scope='fc7')
            with tf.variable_scope('Logits'):
                #pylint: disable=no-member
                if is_training:
                    net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',
                                      scope='AvgPool_1a_8x8')
                net = slim.conv2d(
                    net,
                    num_classes, 1,
                    activation_fn=None,
                    normalizer_fn=None,
                    biases_initializer=tf.constant_initializer(),
                    scope='fc9')

                net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                                   scope='Dropout')
          # Convert end_points_collection into an end_points dict.
          end_points = slim.utils.convert_collection_to_dict(end_points_collection)
          if spatial_squeeze:
            net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
            end_points[sc.name + '/fc8'] = net
          return net, end_points

    def inference(images, num_classes, keep_probability, phase_train=True, weight_decay=0.0, reuse=None):
        batch_norm_params = {
            # Decay for the moving averages.
            'decay': 0.995,
            # epsilon to prevent 0s in variance.
            'epsilon': 0.001,
            # force in-place updates of mean and variance estimates
            'updates_collections': None,
        }
        with slim.arg_scope(overfeat_arg_scope()):
            return overfeat(images, num_classes, is_training=phase_train,
                  dropout_keep_prob=keep_probability, reuse=reuse)

I use the tf.nn.sparse_softmax_cross_entropy_with_logits function for the cross-entropy loss.
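For reference, here is a minimal sketch of how the loss might be wired up; the placeholder names and shapes are my own assumptions, not from the original training script:

    # Hedged sketch: wiring the network into a sparse softmax cross-entropy loss.
    # The `images`/`labels` placeholders are illustrative; 60x60x3 inputs, 3 classes.
    images = tf.placeholder(tf.float32, [None, 60, 60, 3], name='images')
    labels = tf.placeholder(tf.int64, [None], name='labels')

    logits, _ = inference(images, num_classes=3, keep_probability=0.5)
    logits = tf.squeeze(logits, [1, 2])  # [batch, 1, 1, 3] -> [batch, 3]

    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels)
    loss = tf.reduce_mean(cross_entropy, name='loss')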

And the training result was: Loss And Accuracy with one 1x1 Conv

This result is acceptable. I then tried to add one more 1x1 conv after fc7, because I thought a 1x1 conv is the same as a fully connected layer and might improve accuracy.


        ...
        net = slim.conv2d(net, 1024, 1, scope='fc7')
        net = slim.conv2d(net, 1024, 1, scope='fc7_1')
        ...

But the result is weird: Loss And Accuracy with two 1x1 Conv

This network does not optimize; the loss stays stuck around 1.

Why can't I add more 1x1 conv or fc layers?

How can I improve this network?

2 Answers:

Answer 0 (score: 1)

A (1,1) convolutional layer is not a fully connected layer. If you want to implement a fully connected layer as a convolutional layer, its kernel size must match the spatial size of the feature map it receives.

(If the feature map of the previous layer is 50x50, then that layer should have a 50x50 kernel.) A convolutional layer with a (1,1) kernel is instead similar to applying an MLP at each spatial position. If you want to understand its role, read this paper: Network in Network.
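In slim terms, here is a sketch of the difference; the 50x50 size is the hypothetical example above, not a shape from the question's network:

    # Hedged sketch: a true fully connected layer expressed as a convolution.
    # Assume `net` currently has spatial size 50x50 (hypothetical).
    net = slim.conv2d(net, 512, [50, 50], padding='VALID', scope='fc_as_conv')
    # The output is now 1x1x512; only from here on does a 1x1 conv act like
    # a further fully connected layer.
    net = slim.conv2d(net, 512, 1, scope='fc_as_conv_2')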

If I understand correctly, you want fully connected layers, so you have to do two things (see the sketch after this list):

  • Make sure the last layer is reduced to the number of classes, by using a (1,1) convolutional layer whose number of channels equals the number of output classes.
  • Use global average pooling to reduce each feature map to 1x1, then feed the result to a softmax.
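A minimal slim sketch of these two steps, in the order used by Network in Network; the scope names are illustrative:

    # Hedged sketch: 1x1 conv down to the class count, then global average pooling.
    net = slim.conv2d(net, num_classes, 1, activation_fn=None,
                      normalizer_fn=None, scope='class_conv')
    # Collapse each class map to a single score per image.
    net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',
                          scope='global_pool')
    logits = tf.squeeze(net, [1, 2])  # shape [batch, num_classes]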

Answer 1 (score: 0)

A 1x1 convolution is not the same as a fully connected layer. The parameter counts alone differ greatly. A 1x1 convolution is a weighted sum, across all of the previous layer's filter outputs, of the pixels that sit at the same position in the image.

A fully connected layer, on the other hand, looks at all pixels of all filters for every new unit in the current layer.
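A quick, back-of-the-envelope parameter count makes the gap concrete; the 4x4x1024 shape is an assumption that roughly matches the fc7 output of the question's network:

    # Hedged, illustrative parameter counts (shapes assumed, not measured).
    H, W, C, K = 4, 4, 1024, 1024            # input map HxWxC, K output units
    conv1x1_params = 1 * 1 * C * K + K       # ~1.05M weights + biases
    fc_params = H * W * C * K + K            # ~16.8M weights + biases
    print(conv1x1_params, fc_params)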

You should look at http://deeplearning.net/tutorial/contents.html to get a better understanding of convolutions.

For the final layers, fully connected layers are used to combine the features extracted by the previous layers into the final output.