Question

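Reading JPEG images back from a TFRecord appears to lose image information. Here is an example:

Original image: https://i.stack.imgur.com/QyKMI.jpg
Decoded JPEG image (resized to 600x600): https://i.stack.imgur.com/dUyoP.jpg

The tf.data.Dataset is created with the Keras image_dataset_from_directory function, and each image tensor of shape 600x600x3 is encoded to a byte string with tf.io.encode_jpeg: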

image = tf.image.convert_image_dtype(image_tensor, dtype=tf.uint8)
image = tf.io.encode_jpeg(image, quality=100)
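
One thing that may be worth checking at this step is the value range of image_tensor. tf.image.convert_image_dtype treats floating-point input as lying in [0, 1] and rescales it when converting to an integer type. The sketch below is only an illustration under an assumption: it uses a hypothetical tensor named pixels holding float32 values on a 0-255 scale, which is what image_dataset_from_directory is expected to yield here, and compares the rescaling conversion with a plain tf.cast.

```python
import tensorflow as tf

# Hypothetical stand-in for a few pixels from image_dataset_from_directory:
# float32 values on a 0-255 scale (an assumption about the real pipeline).
pixels = tf.constant([[[0.0, 100.0, 200.0]]], dtype=tf.float32)

# convert_image_dtype assumes float input in [0, 1] and rescales it, so
# values above 1.0 exceed the uint8 range; saturate=True clamps them.
rescaled = tf.image.convert_image_dtype(pixels, dtype=tf.uint8, saturate=True)

# A plain cast keeps values that are already on the 0-255 scale.
cast_only = tf.cast(pixels, tf.uint8)

print(rescaled.numpy().tolist())
print(cast_only.numpy().tolist())
```

If the two results differ for the real image_tensor, the conversion step rather than the TFRecord storage would be the first place to look.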

Each TFRecord example is created like this:

    # encoded_image is the output of encode_jpeg function
    image_feature = tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[
            encoded_image
        ])
    )
    features = tf.train.Features(feature={
        'image': image_feature
    })
    example = tf.train.Example(features=features)
    return example.SerializeToString()

Below is the code that loads the TFRecord dataset, decodes each image back into a tensor of shape 600x600x3 with tf.image.decode_jpeg, and then saves one image to disk with PIL:

import tensorflow as tf
from PIL import Image


def read_tfrecord(example):
    tfrecord = {
        "image": tf.io.FixedLenFeature([], tf.string)
    }
    example = tf.io.parse_single_example(example, tfrecord)
    image = tf.image.decode_jpeg(example['image'], channels=3)
    return image


def read_dataset(dataset_path):
    filenames = tf.io.gfile.glob(dataset_path + '/validation/*.tfrecord')

    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(read_tfrecord)
    dataset = dataset.repeat()
    dataset = dataset.batch(128)

    # read_tfrecord returns only the image, so each batch is a single tensor
    for images in dataset.take(1):
        Image.fromarray(images[0].numpy()).save('./images/img.jpeg')
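
To separate the TFRecord round trip from the rest of the preprocessing, a check along the following lines could be run on a single image. This is a minimal sketch under stated assumptions, not the code from the Gist: the gradient array original is a synthetic, hypothetical stand-in for a real image, and mean_abs_diff is a helper value introduced only for this illustration. It encodes the array, wraps it in a tf.train.Example, parses it back, and compares the decoded pixels with the input.

```python
import numpy as np
import tensorflow as tf

# Synthetic 600x600x3 uint8 gradient, a hypothetical stand-in for a real image.
ramp = np.linspace(0, 255, 600).astype(np.uint8)
original = np.stack([np.tile(ramp, (600, 1))] * 3, axis=-1)

# Write side: JPEG-encode and wrap the bytes in a serialized tf.train.Example.
encoded = tf.io.encode_jpeg(tf.constant(original), quality=100)
feature = tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded.numpy()]))
serialized = tf.train.Example(
    features=tf.train.Features(feature={"image": feature})
).SerializeToString()

# Read side: parse and decode the same way read_tfrecord does.
parsed = tf.io.parse_single_example(
    serialized, {"image": tf.io.FixedLenFeature([], tf.string)}
)
decoded = tf.image.decode_jpeg(parsed["image"], channels=3)

# Mean absolute pixel difference between the input and the decoded image.
mean_abs_diff = float(
    np.mean(np.abs(decoded.numpy().astype(np.int16) - original.astype(np.int16)))
)
print(decoded.shape, decoded.dtype, mean_abs_diff)
```

A small difference here would suggest that serialization and JPEG decoding are not where the information is lost, and that the tensor passed to tf.io.encode_jpeg deserves a closer look.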

I have no idea what is causing this obvious loss of image information, so any help is much appreciated!

Notes:

I am using TensorFlow v2.5.0 and Pillow v8.0.1.
The full code is in this Gist: https://gist.github.com/deeDude/c40fa1f14e4fa4b7f2ef149e5a344023
