Question

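I have been trying to design a wrapper so that I can use the pre-made TensorFlow Slim models with a custom dataset. The dataset consists of 1000 images of squares and triangles, 32x32 grayscale. They are organized as dataset/shapes/triangles/ and dataset/shapes/squares/.

With the following code, I am able to train the inception_v2 model without errors. The tf.reshape call will later be replaced with proper variable arguments. The .tfrecords file was created with this script from Google, which builds the records from the dataset structure described above.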

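import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import inception

graph = tf.Graph()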
sess = tf.InteractiveSession(graph=graph)

with graph.as_default():
    name_dict, nClass = gen_dict(data_directory, path_to_labels_file)

    # associate the "label" and "image" objects with the corresponding features read from
    # a single example in the training data file
    label, image = getImage("datasets/shapes/train-00000-of-00001", height, width, nClass)

    # associate the "label_batch" and "image_batch" objects with a randomly selected batch
    # of labels and images respectively
    imageBatch, labelBatch = tf.train.shuffle_batch(
        [image, label], batch_size=bsize,
        capacity=2000,
        min_after_dequeue=1000)

    with sess.as_default():
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        sess.run(tf.global_variables_initializer())

        batch_xs, batch_ys = sess.run([imageBatch, labelBatch])

        print('ran shuffle batch')
        print(tf.shape(batch_xs))
        print(tf.shape(batch_ys))
        # batch_xs = tf.expand_dims(batch_xs, 2)
        batch_xs = tf.reshape(batch_xs, [100, 32, 32, 1])
        print(tf.shape(batch_xs))
        logits, end_points = inception.inception_v2(batch_xs,
                                                    num_classes=2,
                                                    is_training=True)

        predictions = end_points['Predictions']
        logits = end_points['Logits']

        tf.losses.softmax_cross_entropy(batch_ys, logits)

        total_loss = slim.losses.get_total_loss()

        optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

        train_tensor = slim.learning.create_train_op(total_loss, optimizer)

        slim.learning.train(train_tensor,
                            train_log_dir,
                            number_of_steps=1000)

The problem I am running into is with the other models. With inception_v1, using the same arguments, I get the following error:

File "model_test.py", line 62, in <module>
    is_training=True)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/nets/inception_v1.py", line 349, in inception_v1
    net, [7, 7], stride=1, scope='MaxPool_0a_7x7')
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 131, in avg_pool2d
    outputs = layer.apply(inputs)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 492, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 441, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/layers/pooling.py", line 276, in call
    data_format=utils.convert_data_format(self.data_format, 4))
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 1741, in avg_pool
    name=name)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 48, in _avg_pool
    data_format=data_format, name=name)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2508, in create_op
    set_shapes_for_outputs(ret)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1873, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1823, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "/home/chakicherla3/tf_slim_image_classification/models/slim/python/local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 7 from 1 for 'InceptionV1/Logits/MaxPool_0a_7x7/AvgPool' (op: 'AvgPool') with input shapes: [100,1,1,1024].

I get a similar error with inception_v3. With vgg_16 and vgg_19, I get:

ValueError: Negative dimension size caused by subtracting 7 from 1 for 'vgg_16/fc6/convolution' (op: 'Conv2D') with input shapes: [100,1,1,512], [7,7,512,4096].

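Can anyone offer some insight into these errors? What difference between inception_v1 and inception_v2 could cause the crash, and how do the Inception models differ from one another? I have not tried ResNet with this dataset yet, but I suspect a similar error would occur.

For reference, this sample code is based on the "working example" provided in the TF-Slim documentation, located here

The system it runs on uses Python 2.7.10 with Tensorflow-GPU 1.2.0. It is a Xeon system with 4 Nvidia Titan X GPUs, running Ubuntu 14.10.

Thanks! If you need any other system configuration details or the getImage function, I can provide those as well!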

Answer 1

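Input images of size 32x32 are too small for the Inception models. Inception_v1 tries to apply average pooling with a 7x7 kernel, but after all the preceding pooling layers the input to that layer is only 1x1 (with 1024 channels), so the kernel no longer fits.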
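The shrinking can be checked with a little arithmetic. The sketch below is not taken from the Slim source; it assumes that inception_v1 halves the spatial size five times with SAME padding (one stride-2 convolution and four stride-2 max-pools) before the final 7x7 VALID average pool, which is consistent with the [100,1,1,1024] shape reported in the error. The helper names out_size, trace_inception_v1 and min_input_size are made up for this illustration.

```python
def out_size(n, kernel, stride, padding):
    """Spatial output size of a conv/pool layer under TensorFlow's padding rules."""
    if padding == "SAME":
        # SAME: output = ceil(n / stride), written as integer ceiling division
        return -(-n // stride)
    # VALID: the kernel must fit entirely inside the input,
    # output = ceil((n - kernel + 1) / stride); a result <= 0 means it does not fit
    return -(-(n - kernel + 1) // stride)


def trace_inception_v1(n):
    """Follow one spatial dimension through the assumed inception_v1 layout."""
    sizes = [n]
    # Assumption: five stride-2 SAME layers before the logits block
    for _ in range(5):
        n = out_size(n, kernel=3, stride=2, padding="SAME")
        sizes.append(n)
    # Final 7x7 average pool, stride 1, VALID padding
    final = out_size(n, kernel=7, stride=1, padding="VALID")
    return sizes, final


def min_input_size():
    """Smallest input for which the final 7x7 pool still fits (under the same assumption)."""
    n = 1
    while trace_inception_v1(n)[1] < 1:
        n += 1
    return n


print(trace_inception_v1(224))  # ([224, 112, 56, 28, 14, 7], 1)
print(trace_inception_v1(32))   # ([32, 16, 8, 4, 2, 1], -5)
print(min_input_size())         # 193
```

With a 224x224 input the feature map reaching the last pool is 7x7, so the 7x7 kernel fits exactly. With 32x32 it is 1x1, and "subtracting 7 from 1" gives the negative dimension from the error message. The vgg_16 error in the question follows the same pattern: the shapes it reports, [100,1,1,512] against a [7,7,512,4096] kernel, show a 7x7 convolution (fc6) being applied to a 1x1 feature map.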

In any case, I think Inception is too large a model for the task you describe.

TensorFlow Slim pre-trained models: negative dimension size