Question

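I have read the CNN Tutorial on the TensorFlow site and I am trying to use the same model for my project. The problem now is reading the data. I have about 25,000 images for training and about 5,000 for testing and validation. The files are in PNG format, and I can read them and convert them to numpy.ndarray.

The CNN example in the tutorial uses a queue to fetch records from the provided file list. I tried to create my own binary file by reshaping each image into a 1-D array and prepending the label value to it. So my data looks like this: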

[[1,12,34,24,53,...,105,234,102],
 [12,112,43,24,52,...,115,244,98],
....
]

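Each row of the array above has length 22501, where the first element is the label.

I dumped the array to a file using pickle and tried to read from the file using tf.FixedLengthRecordReader, as demonstrated in the example.

I am doing the same things as cifar10_input.py to read the binary file and put the records into a record object.

Now, when I read from the file, the label and image values are different. I understand the reason: pickle also dumps extra information (braces and brackets) into the binary file, and this changes the fixed-length record size.

The example above uses filenames: it passes them to a queue to fetch the files, and then uses the queue to read a single record from a file.

I want to know whether I can pass the numpy array defined above, instead of filenames, to some reader that can fetch records one by one from that array instead of from files.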

Answer 1

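Probably the easiest way to make your data work with the CNN example code is to write a modified version of read_cifar10() and use it instead:

Write out a binary file containing the contents of your numpy array.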
```
import numpy as np
images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                  #  [12,112,43,24,52,...,98],
                                                  #  ...]
                                   dtype=np.uint8)

images_and_labels_array.tofile("/tmp/images.bin")
```
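This file is similar to the format used in the CIFAR10 data files. You might want to generate multiple files in order to get read parallelism. Note that ndarray.tofile() writes binary data in row-major order with no other metadata; pickling the array would add Python-specific metadata that TensorFlow's parsing routines do not understand.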
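
To see the difference concretely, here is a minimal sketch that compares the two serializations and reads the raw file back with NumPy. The toy record size and the temporary path are placeholder assumptions for illustration, not values from the question.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical toy data: 4 records of 1 label byte + 6 image bytes each.
records = np.arange(4 * 7, dtype=np.uint8).reshape(4, 7)

# tofile() writes only the raw bytes, row by row.
raw_path = os.path.join(tempfile.mkdtemp(), "images.bin")
records.tofile(raw_path)
raw_size = os.path.getsize(raw_path)

# pickle adds its own header and metadata, so record boundaries shift.
pickled_size = len(pickle.dumps(records))

# Reading the raw file back only needs the dtype and the record length.
restored = np.fromfile(raw_path, dtype=np.uint8).reshape(-1, 7)
labels, images = restored[:, 0], restored[:, 1:]

print(raw_size, pickled_size)
```

The raw file is exactly one byte per array element, which is what a fixed-length record reader relies on; the pickled form is longer.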

Write a modified version of read_cifar10() that handles your record format.

import tensorflow as tf

def read_my_data(filename_queue):

  class ImageRecord(object):
    pass
  result = ImageRecord()

  # Dimensions of the images in the dataset.
  label_bytes = 1
  # Set the following constants as appropriate.
  result.height = IMAGE_HEIGHT
  result.width = IMAGE_WIDTH
  result.depth = IMAGE_DEPTH
  image_bytes = result.height * result.width * result.depth
  # Every record consists of a label followed by the image, with a
  # fixed number of bytes for each.
  record_bytes = label_bytes + image_bytes

  assert record_bytes == 22501  # Based on your question.

  # Read a record, getting filenames from the filename_queue.  No
  # header or footer in the binary, so we leave header_bytes
  # and footer_bytes at their default of 0.
  reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
  result.key, value = reader.read(filename_queue)

  # Convert from a string to a vector of uint8 that is record_bytes long.
  # (Stored under a new name so the integer record_bytes is not shadowed.)
  record = tf.decode_raw(value, tf.uint8)

  # The first bytes represent the label, which we convert from uint8->int32.
  result.label = tf.cast(
      tf.slice(record, [0], [label_bytes]), tf.int32)

  # The remaining bytes after the label represent the image, which we reshape
  # from [depth * height * width] to [depth, height, width].
  depth_major = tf.reshape(tf.slice(record, [label_bytes], [image_bytes]),
                           [result.depth, result.height, result.width])
  # Convert from [depth, height, width] to [height, width, depth].
  result.uint8image = tf.transpose(depth_major, [1, 2, 0])

  return result
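
One thing to check with this reshape: it assumes each record stores the image depth-major, as the CIFAR-10 files do. If your images were flattened straight from a [height, width, depth] array (an assumption about your preprocessing that the question does not confirm), the bytes are in a different order. The NumPy sketch below, with made-up dimensions, shows both cases.

```python
import numpy as np

# Hypothetical small image in [height, width, depth] layout.
height, width, depth = 2, 3, 3
image = np.arange(height * width * depth, dtype=np.uint8).reshape(
    height, width, depth)

# Case 1: flattened directly, so the bytes are in height-major order.
flat_hwd = image.reshape(-1)
# The depth-major reshape + transpose does NOT recover the image...
wrong = flat_hwd.reshape(depth, height, width).transpose(1, 2, 0)
# ...but a plain reshape to [height, width, depth] does.
right = flat_hwd.reshape(height, width, depth)

# Case 2: transposed to [depth, height, width] before flattening,
# which matches what the depth-major reshape expects.
flat_dhw = image.transpose(2, 0, 1).reshape(-1)
recovered = flat_dhw.reshape(depth, height, width).transpose(1, 2, 0)
```

With a single-channel image (depth 1) the two orders coincide, so this only matters for multi-channel data.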

Modify distorted_inputs() to use your new dataset:

def distorted_inputs(data_dir, batch_size):
  """[...]"""
  filenames = ["/tmp/images.bin"]  # Or a list of filenames if you
                                   # generated multiple files in step 1.
  for f in filenames:
    if not gfile.Exists(f):
      raise ValueError('Failed to find file: ' + f)

  # Create a queue that produces the filenames to read.
  filename_queue = tf.train.string_input_producer(filenames)

  # Read examples from files in the filename queue.
  read_input = read_my_data(filename_queue)
  reshaped_image = tf.cast(read_input.uint8image, tf.float32)

  # [...] (Maybe modify other parameters in here depending on your problem.)
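
If you do split the data across several files, each file must contain a whole number of records. A minimal sketch, assuming a hypothetical output directory and shard count, and using a tiny stand-in array:

```python
import os
import tempfile

import numpy as np

# Toy stand-in for images_and_labels_array: 10 records of 5 bytes each.
records = np.arange(50, dtype=np.uint8).reshape(10, 5)

out_dir = tempfile.mkdtemp()  # Hypothetical location; use your data_dir.
num_shards = 3                # Hypothetical shard count.

filenames = []
# array_split divides along axis 0, so every shard holds whole records.
for shard_index, shard in enumerate(np.array_split(records, num_shards)):
    path = os.path.join(out_dir, "images_%d.bin" % shard_index)
    shard.tofile(path)
    filenames.append(path)

# Sanity check: reading the shards back in order reproduces the array.
restored = np.concatenate(
    [np.fromfile(f, dtype=np.uint8).reshape(-1, 5) for f in filenames])
```

The resulting filenames list can then be passed to tf.train.string_input_producer in place of the single-file list.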

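These are the minimal steps, given your starting point. It may be more efficient to do the PNG decoding using TensorFlow ops, but that would be a larger change.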

Answer 2

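In your question, you specifically asked:

I want to know whether I can pass the numpy array defined above, instead of filenames, to some reader that can fetch records one by one from that array instead of from files.

You can feed the numpy array into a queue directly, but that is a more invasive change to the cifar10_input.py code than the other answer suggests.

As before, let's assume you have the following array from your question: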

import numpy as np
images_and_labels_array = np.array([[...], ...],  # [[1,12,34,24,53,...,102],
                                                  #  [12,112,43,24,52,...,98],
                                                  #  ...]
                                   dtype=np.uint8)

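You can then define a queue that contains the entire data as follows:

# The capacity must be large enough to hold every record.
q = tf.FIFOQueue(capacity=len(images_and_labels_array),
                 dtypes=[tf.uint8, tf.uint8],
                 shapes=[[], [22500]])
enqueue_op = q.enqueue_many([images_and_labels_array[:, 0], images_and_labels_array[:, 1:]])

...then call sess.run(enqueue_op) to populate the queue.

Another, more efficient approach is to feed records to the queue, which you can do from a parallel thread (see this answer for more details on how this would work):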

# [With q as defined above.]
label_input = tf.placeholder(tf.uint8, shape=[])
image_input = tf.placeholder(tf.uint8, shape=[22500])

enqueue_single_from_feed_op = q.enqueue([label_input, image_input])

# Then, to enqueue a single example `i` from the array.
sess.run(enqueue_single_from_feed_op,
         feed_dict={label_input: images_and_labels_array[i, 0],
                    image_input: images_and_labels_array[i, 1:]})

Or, to enqueue a batch at a time, which will be more efficient:

label_batch_input = tf.placeholder(tf.uint8, shape=[None])
image_batch_input = tf.placeholder(tf.uint8, shape=[None, 22500])

# enqueue_many treats the first dimension as the batch dimension.
enqueue_batch_from_feed_op = q.enqueue_many([label_batch_input, image_batch_input])

# Then, to enqueue a batch of examples `i` through `j-1` from the array.
sess.run(enqueue_batch_from_feed_op,
         feed_dict={label_batch_input: images_and_labels_array[i:j, 0],
                    image_batch_input: images_and_labels_array[i:j, 1:]})
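
The index ranges `i` and `j` for the batched feed can be produced with a simple loop. The sketch below uses only NumPy; the batch size, the helper name iter_batches, and the toy array are assumptions for illustration, and the comment marks where the batch enqueue op would be run.

```python
import numpy as np

# Toy stand-in for the array: 10 records, 1 label byte + 4 pixel bytes.
records = np.arange(50, dtype=np.uint8).reshape(10, 5)
batch_size = 4  # Hypothetical; pick what fits your queue capacity.


def iter_batches(array, batch_size):
    """Yields (labels, images) slices covering rows i through j-1."""
    for i in range(0, len(array), batch_size):
        j = min(i + batch_size, len(array))
        yield array[i:j, 0], array[i:j, 1:]


batch_shapes = []
for labels, images in iter_batches(records, batch_size):
    # Here you would run the batch enqueue op, feeding `labels` and
    # `images` to the batch placeholders.
    batch_shapes.append((labels.shape, images.shape))
```

The last batch may be smaller than the others, which is fine because the batch placeholders leave the first dimension unspecified.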

Answer 3

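I want to know whether I can pass the numpy array defined above, instead of filenames, to some reader that can fetch records one by one from that array instead of from files.

tf.py_func, which wraps a Python function and uses it as a TensorFlow operator, might help. Here's an example.

However, since you've mentioned that your images are stored in PNG files, I think the simplest solution would be to replace this: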

reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
result.key, value = reader.read(filename_queue)

with this:

result.key, value = tf.WholeFileReader().read(filename_queue)
# The files are PNGs, so decode them with the PNG decoder.
value = tf.image.decode_png(value)

Attach a queue to a numpy array in TensorFlow to fetch data, instead of files?
