Question

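I am trying to get TFDV to work with RGB images as feature inputs, read from a TFRecords file. I can read and write the image data to a TFRecord file without problems. Here is the relevant snippet of the writing code, where img is a NumPy array of shape [32, 32, 3]: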

feature = {'train/label': _int64_feature(y_train[i]),
           'train/image': _bytes_feature(tf.compat.as_bytes(img.tostring()))
          }

Reading it back:

read_features = {'train/label': tf.FixedLenFeature([], tf.int64),
             'train/image': tf.FixedLenFeature([], tf.string)}

I can then use frombuffer and reshape to recover the image correctly.
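The round trip described here can be sketched with NumPy alone. This is a minimal illustration, assuming the image is a uint8 array of shape [32, 32, 3]; the random data and the variable names are made up for the example, and tobytes() is the current name for the older tostring() call.

```python
import numpy as np

# Hypothetical stand-in for one training image: uint8, shape [32, 32, 3].
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Serialize to raw bytes, as is done before wrapping them in a bytes feature.
raw = img.tobytes()

# Recover the array. The dtype and shape are not stored in the raw bytes,
# so both must be supplied again when reading.
restored = np.frombuffer(raw, dtype=np.uint8).reshape(32, 32, 3)
```

Note that the bytes carry no shape or dtype information, so the reader has to know both in advance.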

The problem arises when I run tfdv.generate_statistics_from_tfrecord() on that TFRecords file. It throws this error:

ValueError: '\xff ...... \x87' has type str, but isn't valid UTF-8 encoding. Non-UTF-8 strings must be converted to unicode objects before being added. [while running 'GenerateStatistics/RunStatsGenerators/TopKStatsGenerator/TopK_ConvertToSingleFeatureStats']
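The message suggests that the bytes feature is being handled as text somewhere in the statistics pipeline. As a rough illustration of why raw pixel bytes fail such a check, here is a sketch using only the Python standard library. The helper is_valid_utf8 and the sample bytes are hypothetical and are not part of TFDV.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A few made-up raw pixel bytes. The byte 0xff never occurs in valid UTF-8,
# so any image buffer containing a fully saturated channel value fails.
pixel_bytes = bytes([0xFF, 0x10, 0x87])

# Ordinary text encoded as UTF-8 passes the same check.
text_bytes = "label".encode("utf-8")
```

Calling is_valid_utf8(pixel_bytes) gives False, while is_valid_utf8(text_bytes) gives True, which matches the kind of input the error complains about.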

I have tried writing the image in several other ways, such as astype(unicode), but could not get any of them to work.

Any ideas?

Thanks, Paul

Answer 1

Try the following:

image_string = open(image_location, 'rb').read()
feature = {'train/label': _int64_feature(y_train[i]),
           'train/image': _bytes_feature(image_string)
          }

Taken from the official tutorial.
