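I am trying to get TFDV to work with RGB images as a feature input, read from TFRecords files. I can write image data to a TFRecord file and read it back. Here is the relevant snippet of the writing code, where img is a numpy array of shape [32,32,3]: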
feature = {'train/label': _int64_feature(y_train[i]),
           'train/image': _bytes_feature(tf.compat.as_bytes(img.tostring()))
           }
And reading it back:
read_features = {'train/label': tf.FixedLenFeature([], tf.int64),
                 'train/image': tf.FixedLenFeature([], tf.string)}
I can then use frombuffer and reshape to recover the image correctly.
The problem arises when I run tfdv.generate_statistics_from_tfrecord() on that TFRecords file. It throws this error:
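A minimal sketch of that round trip, using NumPy only and leaving out the TFRecord step. The uint8 dtype and the synthetic pixel values are assumptions for illustration; the original post does not state the array's dtype. tobytes() is the current name for the deprecated tostring() and returns the same bytes.

```python
import numpy as np

# Hypothetical stand-in for one training image: 32x32 RGB, dtype uint8 (assumed).
img = (np.arange(32 * 32 * 3) % 256).astype(np.uint8).reshape(32, 32, 3)

# Serialize to raw bytes; this is what ends up in the bytes feature.
raw = img.tobytes()

# The raw bytes carry no shape or dtype, so both must be known when reading back.
restored = np.frombuffer(raw, dtype=np.uint8).reshape(32, 32, 3)

print(len(raw), restored.shape, np.array_equal(restored, img))
```

Note that frombuffer returns a read-only view of the buffer; call .copy() on the result if the array needs to be modified.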
ValueError: '\xff ...... \x87' has type str, but isn't valid UTF-8 encoding. Non-UTF-8 strings must be converted to unicode objects before being added. [while running 'GenerateStatistics/RunStatsGenerators/TopKStatsGenerator/TopK_ConvertToSingleFeatureStats']
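The message suggests that the bytes feature is being treated as text somewhere in the statistics pipeline. That is an inference from the error text, not something confirmed against the TFDV source. The sketch below only illustrates the underlying point in plain Python: arbitrary pixel bytes are usually not valid UTF-8. The helper is_valid_utf8 and the sample byte values are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical sample values: 0xFF can never appear in valid UTF-8,
# so any image whose raw bytes contain it fails to decode.
pixel_bytes = bytes([0xFF, 0x10, 0x87])
text_bytes = b"cat"

print(is_valid_utf8(text_bytes), is_valid_utf8(pixel_bytes))
```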
I have tried several ways of writing the image, including astype(unicode), but I could not get any of them to work.
Any ideas?
Thanks, Paul
Answer 0 (score: 0)
Try the following:
image_string = open(image_location, 'rb').read()
feature = {'train/label': _int64_feature(y_train[i]),
           'train/image': _bytes_feature(image_string)
           }