In TensorFlow, is there an op (or a set of ops) that accepts a tensor of filenames and outputs images?

Date: 2016-06-10 16:11:31

Tags: python tensorflow

I'd like to read in batches of images. However, these batches must be constructed by a Python script (for various reasons they cannot be written to a file ahead of time). What is the most efficient way to do the following in TensorFlow?

(1) Given: a Python list of B variable-length strings that point to images, all of which have the same size. B is the batch size.
(2) For each string, load the image it points to and apply a random 5% crop (the crop position is random, but the crop size is fixed).
(3) Concatenate the images into a tensor of size B x H x W x 3.

If this is not possible, does anyone have benchmarks or data on the efficiency loss of loading and preprocessing the images in Python and then placing them into a queue? I assume the net will run considerably faster if image loading and preprocessing are done inside TensorFlow.

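2 answers:

Answer 0: (score: 1)

Here is how I understand your problem:

  • You have a set of images
  • You have a function sample_batch() that returns a batch of B filenames
  • You want to read the images corresponding to these filenames and preprocess them
  • Finally, you want to output a batch of these examples
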
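import tensorflow as tf

# capacity, H, W, B, NUM_STEPS, inference() and sample_batch() are assumed
# to be defined elsewhere.
# Filenames are fed from Python and pushed onto a queue of scalar strings.
filenames = tf.placeholder(tf.string, shape=[None], name='Input')
queue = tf.FIFOQueue(capacity, tf.string, [()], name='Queue')
enqueue_op = queue.enqueue_many(filenames)

# Read and decode one image file per dequeued filename.
reader = tf.WholeFileReader()
filename, content = reader.read(queue)
image = tf.image.decode_jpeg(content, channels=3)

# Preprocessing
image = tf.random_crop(image, [H, W, 3])
image = tf.to_float(image)
batch_image = tf.train.batch([image], batch_size=B, name='Batch')
output = inference(batch_image)

Then, in the session, you have to run the enqueue op with the filenames returned by your sample_batch() function:

with tf.Session() as sess:
  # Start the threads that fill the batch queue created by tf.train.batch.
  tf.train.start_queue_runners(sess=sess)
  for i in range(NUM_STEPS):
    batch_filenames = sample_batch()
    sess.run(enqueue_op, feed_dict={filenames: batch_filenames})
    sess.run(output)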
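
The loop above enqueues filenames and runs the model in the same thread. One possible variation is to move the filename feeding to a background thread so the consumer does not wait on sample_batch(). The following is only a sketch of that producer/consumer pattern using the Python standard library, not TensorFlow code: FILENAMES, the body of sample_batch(), feeder() and consume() are hypothetical stand-ins, and in a TensorFlow version the put() call would roughly correspond to running enqueue_op.

```python
import queue
import random
import threading

# Hypothetical file list; a real script would build this however it needs to.
FILENAMES = ["img_%03d.jpg" % i for i in range(20)]


def sample_batch(batch_size=4):
    """Hypothetical stand-in for sample_batch(): pick B distinct filenames."""
    return random.sample(FILENAMES, batch_size)


def feeder(filename_queue, num_steps):
    """Producer: push one batch of filenames per step, then a stop marker."""
    for _ in range(num_steps):
        filename_queue.put(sample_batch())  # blocks while the queue is full
    filename_queue.put(None)


def consume(filename_queue):
    """Consumer: stands in for the loop that would run the model."""
    seen = []
    while True:
        batch = filename_queue.get()
        if batch is None:
            break
        seen.append(batch)  # a real loop would process the batch here
    return seen


filename_queue = queue.Queue(maxsize=2)  # bounded, so the feeder cannot run far ahead
thread = threading.Thread(target=feeder, args=(filename_queue, 5))
thread.start()
batches = consume(filename_queue)
thread.join()
```

The bounded queue is the design choice that matters here: the feeder stays at most two batches ahead of the consumer, so memory use does not grow with the number of steps.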

Answer 1: (score: 0)

If you have the images as byte arrays, you can use something like this in the graph:

# shape=(None,) declares a 1-D batch of encoded JPEG strings.
jpegs = tf.placeholder(tf.string, shape=(None,))
images = tf.map_fn(your_processing_fn, jpegs, dtype=tf.float32)
logits = your_inference_model(images, labels)

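where your_processing_fn is a function that takes a tensor of JPEG bytes, decodes, resizes, and crops it, and returns an image of size H x W x 3.

You need a recent version of TensorFlow, because map_fn is not available in 0.8 and below.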
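
To illustrate the cropping step that a function like your_processing_fn would perform, here is a minimal sketch in plain Python (no TensorFlow) of a fixed-size crop at a random position. It assumes that "a crop of 5%" means keeping 95% of each side, which is only one reading of the question. The image is a nested list, and SRC_H, SRC_W and both helper functions are hypothetical names chosen for the example.

```python
import random


def random_crop_offsets(height, width, crop_h, crop_w):
    """Pick the top-left corner of a crop_h x crop_w window inside the image."""
    if crop_h > height or crop_w > width:
        raise ValueError("crop must fit inside the image")
    top = random.randint(0, height - crop_h)  # randint is inclusive on both ends
    left = random.randint(0, width - crop_w)
    return top, left


def random_crop(image, crop_h, crop_w):
    """Crop a nested-list image (rows of pixels) to a fixed-size window."""
    top, left = random_crop_offsets(len(image), len(image[0]), crop_h, crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


# Hypothetical sizes: a 100 x 200 source image, keeping 95% of each side.
SRC_H, SRC_W = 100, 200
H, W = SRC_H * 95 // 100, SRC_W * 95 // 100

# Each pixel stores its own (row, column) so the crop position is easy to inspect.
image = [[(y, x, 0) for x in range(SRC_W)] for y in range(SRC_H)]
crop = random_crop(image, H, W)
```

Because the crop size is fixed, every cropped image has the same H x W shape, which is what allows the results to be stacked into a single B x H x W x 3 batch.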