Question

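I have built a model to train a convolutional autoencoder in TensorFlow. I followed the instructions on Reading Data from the TF documentation to read in my own images of size 233 x 233 x 3. Here is my convert_to() function, adapted from those instructions: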

def convert_to(images, name):
  """Converts a dataset to tfrecords."""
  num_examples = images.shape[0]
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(FLAGS.tmp_dir, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    print(images[index].size)
    image_raw = images[index].tostring()
    print(len(image_raw))
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
  writer.close()

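When I print the size of the image at the start of the for loop, the size is 162867, but when I print it after the .tostring() line, the size is 1302936. This causes problems down the line, because the model thinks my input is 8x larger than it should be. Is it better to change the 'image_raw' entry in the Example to _int64_feature(image_raw), or to change the way I convert the image to a string?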

Alternatively, the problem could be in my read_and_decode() function, e.g. the string is not being decoded properly or the Example is not being parsed...?

def read_and_decode(self, filename_queue):
    reader = tf.TFRecordReader()

    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'depth': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string)
      })

    # Convert from a scalar string tensor to a uint8 tensor
    image = tf.decode_raw(features['image_raw'], tf.uint8)

    # Reshape into a 233 x 233 x 3 image and apply distortions
    image = tf.reshape(image, (self.input_rows, self.input_cols, self.num_filters))

    image = data_sets.normalize(image)
    image = data_sets.apply_augmentation(image)

    return image

Thanks!

Answer 1

I may have some answers to your problem.

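First, it is perfectly normal for your image to be 8 times longer after the .tostring() call. That method converts your array to bytes. It is badly named, because in Python 3 a bytes object is not the same as a string (in Python 2 they are the same). I guess your image is defined as int64 by default, so each element is encoded with 8 bytes (64 bits). In your example, the 162867 elements of your image (233 x 233 x 3) are encoded in 162867 x 8 = 1302936 bytes.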
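The arithmetic above can be checked with NumPy alone. This is a minimal sketch that assumes the images are stored as int64 (a guess; float64 would give the same byte count). The array is zero-filled stand-in data, not the asker's images.

```python
import numpy as np

# Stand-in for one 233 x 233 x 3 image; the int64 dtype is an assumption.
image = np.zeros((233, 233, 3), dtype=np.int64)

# tobytes() is the current name for the deprecated tostring() alias.
image_raw = image.tobytes()

print(image.size)      # number of elements in the array
print(image.itemsize)  # bytes used per element
print(len(image_raw))  # equals image.size * image.itemsize
```

Printing images.dtype in convert_to() would confirm which type the real data uses.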

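As for the error during parsing, I think it comes from writing your data as int64 (integers encoded with 64 bits, so 8 bytes each) and then reading it back as uint8 (unsigned integers encoded with 8 bits, so 1 byte each). The same integer has a different byte sequence depending on whether it is stored as int64 or as uint8. Writing your images as bytes is good practice with tfrecord files, but you also need to decode those bytes with the matching type when you read them.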

For your code, try image = tf.decode_raw(features['image_raw'], tf.int64) instead.
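The same mismatch can be reproduced outside TensorFlow. The sketch below uses np.frombuffer as a stand-in for tf.decode_raw (an analogy, not the TensorFlow call itself) and again assumes int64 source data. The random pixel values are hypothetical.

```python
import numpy as np

shape = (233, 233, 3)
# Hypothetical pixel data in 0..255, stored as int64 (assumed dtype).
image = np.random.randint(0, 256, size=shape).astype(np.int64)
image_raw = image.tobytes()

# Decoding with the wrong dtype yields 8x too many elements.
wrong = np.frombuffer(image_raw, dtype=np.uint8)
print(wrong.size)

# Decoding with the dtype that was written restores the original array.
right = np.frombuffer(image_raw, dtype=np.int64).reshape(shape)
print(np.array_equal(right, image))

# Alternative: cast before writing so each value takes one byte.
# Only safe if every value fits in 0..255.
compact_raw = image.astype(np.uint8).tobytes()
compact = np.frombuffer(compact_raw, dtype=np.uint8).reshape(shape)
print(len(compact_raw))
```

If the images really are 8-bit, casting to uint8 before writing would keep the existing tf.uint8 decode valid and make each record eight times smaller. If they hold floats or larger integers, the cast would lose information, and decoding with the written dtype is the safer route.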

Answer 2

The error seems to be right here.

# Convert from a scalar string tensor to a uint8 tensor
image = tf.decode_raw(features['image_raw'], tf.uint8)  # the image looks like a tensor with 1302936 values
image.set_shape([self.input_rows*self.input_cols*self.num_filters])  # self.input_rows*self.input_cols*self.num_filters equals 162867, right?

This is all guesswork on my part, because you provided too little code.

TensorFlow tfrecords: tostring() changes the dimension of the image
