Question

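I have a problem with TensorFlow and a tfrecord-based input pipeline.

Each of my records contains:

'image': a 3-dimensional array of size [480,585,5], dtype uint8
'target': a 4-dimensional array of size [7,7,1,6], dtype float32

This is the code I use to read data from the tfrecord files and create a minibatch: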

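import tensorflow as tf

def getBatch(filenames, num_examples):
    # Queue of input file names; num_epochs=None cycles through the files indefinitely
    filename_queue = tf.train.string_input_producer(filenames, num_epochs=None)

    # Read one serialized example at a time from the tfrecord files
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)

    # Group serialized examples into a minibatch, then parse the whole minibatch at once
    batch_of_example = tf.train.batch([serialized_example], num_examples, capacity=2500, num_threads=16)
    features = tf.parse_example(
        batch_of_example,
        features={
            'label': tf.FixedLenFeature([7,7,1,6], tf.float32),
            'image': tf.FixedLenFeature([], tf.string)
        }
    )

    # Decode the raw image bytes and restore the [batch, height, width, channels] shape
    image_raw = tf.reshape(tf.decode_raw(features['image'], tf.uint8), [num_examples,480,585,5])
    # Rescale images for the neural network
    images = tf.image.resize_images(image_raw, [224,224])

    return images, features['label']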
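To make the decode step easier to reason about outside a TensorFlow session, here is a minimal NumPy sketch of what the `decode_raw` plus `reshape` combination does to a batch of raw image strings. It is not part of my pipeline: `decode_batch` and `fake_records` are hypothetical names, and the records are synthetic bytes, assuming each image is stored as a flat uint8 buffer of 480*585*5 bytes.

```python
import numpy as np

# Image shape as described in the question: [height, width, channels], uint8.
IMAGE_SHAPE = (480, 585, 5)


def decode_batch(raw_records, image_shape=IMAGE_SHAPE):
    """Turn a list of raw byte strings into a uint8 array [batch, H, W, C].

    Hypothetical helper mirroring decode_raw followed by reshape.
    """
    expected = int(np.prod(image_shape))
    for i, raw in enumerate(raw_records):
        # A length mismatch here is what would make the reshape fail.
        if len(raw) != expected:
            raise ValueError(
                "record %d has %d bytes, expected %d" % (i, len(raw), expected)
            )
    flat = np.frombuffer(b"".join(raw_records), dtype=np.uint8)
    return flat.reshape((len(raw_records),) + tuple(image_shape))


# Synthetic stand-ins for the 'image' feature of two records.
fake_records = [
    np.full(IMAGE_SHAPE, i, dtype=np.uint8).tobytes() for i in range(2)
]
batch = decode_batch(fake_records)
print(batch.shape, batch.dtype)
```

The length check matters because the reshape only succeeds when every record holds exactly one full image.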

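However, when I try to use it in my training routine, performance is very poor (6-7 examples per second, with a relatively small network on a Titan X), and neither the CPU nor the GPU seems to be under high load.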

My training set consists of 26 tfrecord files with 2500 examples each (about 3.5 GB per file), and I use a batch size of 32.
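As a sanity check on these numbers, the following back-of-the-envelope sketch computes the payload size per example and per file from the shapes listed above. It assumes the arrays are stored uncompressed and ignores the protobuf and record framing overhead, so the real files will be slightly larger.

```python
# Sizes derived from the shapes and dtypes given in the question.
image_bytes = 480 * 585 * 5 * 1      # uint8: 1 byte per element
target_bytes = 7 * 7 * 1 * 6 * 4     # float32: 4 bytes per element
example_bytes = image_bytes + target_bytes

examples_per_file = 2500
file_bytes = examples_per_file * example_bytes

# Read throughput implied by the observed training speed (6-7 examples/s).
low_rate_bytes = 6 * example_bytes
high_rate_bytes = 7 * example_bytes

# Size of one decoded minibatch of 32 images before resizing.
batch_image_bytes = 32 * image_bytes

print("bytes per example:", example_bytes)
print("GB per file: %.2f" % (file_bytes / 1e9))
print("MB/s at 6-7 examples/s: %.1f to %.1f"
      % (low_rate_bytes / 1e6, high_rate_bytes / 1e6))
print("MB per decoded batch: %.1f" % (batch_image_bytes / 1e6))
```

Each example comes to roughly 1.4 MB, which matches the 3.5 GB file size, so at 6-7 examples per second the pipeline is only consuming on the order of 10 MB/s of input data.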

I think the slow performance is caused by the input queue always being empty, as this graph shows:

Can anyone spot where the problem in the input pipeline is? Or can anyone give me some guidance on why performance is this bad?

Tfrecord-based input pipeline, queue always empty

0 answers: