I am trying to convert a set of JPEG images to TFRecords, but the TFRecord file takes up almost 5 times as much space as the image set. After a lot of searching, I learned that when JPEGs are written into a TFRecord this way, they are no longer stored as JPEGs. However, I have not come across a code solution I can understand. Please tell me what changes should be made to the code below so that the JPEGs are written into the TFRecord as-is.
import sys
import tensorflow as tf
from matplotlib.pyplot import imread

def print_progress(count, total):
    pct_complete = float(count) / total
    msg = "\r- Progress: {0:.1%}".format(pct_complete)
    sys.stdout.write(msg)
    sys.stdout.flush()

def wrap_int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

def wrap_bytes(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def convert(image_paths, labels, out_path):
    # Args:
    #   image_paths   List of file-paths for the images.
    #   labels        Class-labels for the images.
    #   out_path      File-path for the TFRecords output file.

    print("Converting: " + out_path)

    # Number of images. Used when printing the progress.
    num_images = len(image_paths)

    # Open a TFRecordWriter for the output-file.
    with tf.python_io.TFRecordWriter(out_path) as writer:
        # Iterate over all the image-paths and class-labels.
        for i, (path, label) in enumerate(zip(image_paths, labels)):
            # Print the percentage-progress.
            print_progress(count=i, total=num_images - 1)

            # Load the image-file using matplotlib's imread function.
            img = imread(path)

            # Convert the image to raw bytes.
            img_bytes = img.tostring()

            # Create a dict with the data we want to save in the
            # TFRecords file. You can add more relevant data here.
            data = {
                'image': wrap_bytes(img_bytes),
                'label': wrap_int64(label)
            }

            # Wrap the data as TensorFlow Features.
            feature = tf.train.Features(feature=data)

            # Wrap again as a TensorFlow Example.
            example = tf.train.Example(features=feature)

            # Serialize the data.
            serialized = example.SerializeToString()

            # Write the serialized data to the TFRecords file.
            writer.write(serialized)
Edit: Can someone please answer this question!!
Answer 0 (score: 1)
You shouldn't save the image data at all; keep only the filenames. Then, to load the images as the records are fed into the training loop, it is best to use the relatively new Dataset API. From the docs:
# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def _parse_function(filename, label):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string)
    image_resized = tf.image.resize_images(image_decoded, [28, 28])
    return image_resized, label

# A vector of filenames.
filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])

# `labels[i]` is the label for the image in `filenames[i]`.
labels = tf.constant([0, 37, ...])

dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)
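For reference, here is a minimal sketch (not part of the original answer) of how such a dataset is typically consumed with the TF 1.x API used above; the batch size and shuffle buffer are arbitrary placeholders.

# Sketch only: batch size and shuffle buffer are arbitrary placeholders.
dataset = dataset.shuffle(buffer_size=1000).batch(32)
iterator = dataset.make_one_shot_iterator()
images, batch_labels = iterator.get_next()

with tf.Session() as sess:
    # Each run() call yields one batch of decoded, resized images and labels.
    image_batch, label_batch = sess.run([images, batch_labels])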
Which approach is faster? A number of competing factors are at play here, so in the end it is important to understand the different approaches. Without measurements, I would lean towards the many-small-files solution, since it requires less processing of the data we start with, and it would be unlikely to appear in the TensorFlow documentation if it were completely unreasonable. But the only real answer is to measure.
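To make this concrete, below is one possible rewrite of the question's convert() function following this advice (a sketch under the same TF 1.x assumptions; the 'filename' key name is my own choice, not code from the answer). It stores only the path string and the label, leaving the actual JPEG loading to a function like _parse_function above at training time.

def convert_filenames(image_paths, labels, out_path):
    # Sketch: write only the path string and the label; 'filename' is an
    # illustrative feature key, not one prescribed by the answer.
    with tf.python_io.TFRecordWriter(out_path) as writer:
        for path, label in zip(image_paths, labels):
            data = {
                'filename': tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[path.encode('utf-8')])),
                'label': tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[int(label)]))
            }
            example = tf.train.Example(features=tf.train.Features(feature=data))
            writer.write(example.SerializeToString())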
Answer 1 (score: 0)
Instead of converting the image to an array and back to bytes, we can simply use the built-in open function to get the bytes. That way, the compressed JPEG is written to the TFRecord.
Replace these two lines
img = imread(path)
img_bytes = img.tostring()
with
img_bytes = open(path,'rb').read()
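Note that the record now holds encoded JPEG bytes rather than a raw pixel buffer, so the reading pipeline has to decode them. Below is a minimal sketch of a matching parse function (TF 1.x style), assuming the 'image' and 'label' keys from the question's code and a hypothetical file name train.tfrecords.

def parse_record(serialized):
    features = tf.parse_single_example(
        serialized,
        features={
            'image': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64)
        })
    # The stored bytes are a compressed JPEG, so decode them at load time.
    image = tf.image.decode_jpeg(features['image'])
    return image, features['label']

# 'train.tfrecords' is a hypothetical output path.
dataset = tf.data.TFRecordDataset(['train.tfrecords']).map(parse_record)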