TensorFlow TFDV does not work on images

Time: 2018-12-13 15:46:20

Tags: tensorflow tensorflow-data-validation

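I am trying to get TFDV to work with RGB images as feature inputs, reading them from a TFRecords file. I can read and write the image data to a TFRecord file. Here is the relevant snippet of the writing code, where img is a numpy array of shape [32, 32, 3]: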

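# _int64_feature and _bytes_feature are helper functions defined elsewhere (not shown here)
feature = {'train/label': _int64_feature(y_train[i]),
           # tobytes() is the current name for the deprecated tostring(); same output
           'train/image': _bytes_feature(tf.compat.as_bytes(img.tobytes()))
          }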

Reading it back:

read_features = {'train/label': tf.FixedLenFeature([], tf.int64),
                 'train/image': tf.FixedLenFeature([], tf.string)}

I can then use frombuffer and reshape to recover the image correctly.
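For reference, the byte round trip described above can be sketched without TensorFlow. This is only an illustration: the function names are hypothetical, and the uint8 dtype is an assumption, since the question gives only the [32, 32, 3] shape.

```python
import numpy as np


def image_to_bytes(img):
    """Serialize an array to its raw bytes (shape and dtype are NOT stored)."""
    return img.tobytes()


def bytes_to_image(raw, shape=(32, 32, 3), dtype=np.uint8):
    """Rebuild the array; the caller must supply the original shape and dtype.

    dtype=np.uint8 is an assumption made for this sketch.
    """
    return np.frombuffer(raw, dtype=dtype).reshape(shape)


# Hypothetical stand-in for one training image.
img = (np.arange(32 * 32 * 3) % 256).astype(np.uint8).reshape(32, 32, 3)

raw = image_to_bytes(img)
restored = bytes_to_image(raw)
print(len(raw), np.array_equal(img, restored))
```

Because the serialized bytes carry no shape or dtype information, both have to be known when reading, which is why the reshape step is needed.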

The problem arises when I run tfdv.generate_statistics_from_tfrecord() on that TFRecords file. It throws this error:

ValueError: '\xff ...... \x87' has type str, but isn't valid UTF-8 encoding. Non-UTF-8 strings must be converted to unicode objects before being added. [while running 'GenerateStatistics/RunStatsGenerators/TopKStatsGenerator/TopK_ConvertToSingleFeatureStats']
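Judging from the message alone, the bytes feature seems to be handled as text at some point in the statistics pipeline. That is an inference from the error, not a confirmed description of TFDV internals. The underlying constraint can be shown in plain Python: arbitrary pixel bytes are generally not valid UTF-8. The helper name and the byte values below are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical pixel values; the byte 0xFF can never appear in valid UTF-8.
pixel_bytes = bytes([0xFF, 0x00, 0x87])

print(is_valid_utf8(pixel_bytes))              # False
print(is_valid_utf8("label".encode("utf-8")))  # True
```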

I have tried writing the image in several different ways, including astype(unicode), but I cannot get it to work.

Any ideas?

Thanks, Paul

1 Answer:

Answer 0 (score: 0)

Try the following:

# Read the image file's raw bytes; the with-block closes the file afterwards
with open(image_location, 'rb') as f:
    image_string = f.read()
feature = {'train/label': _int64_feature(y_train[i]),
           'train/image': _bytes_feature(image_string)
          }

Taken from the official tutorial.