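I want to do multi-label classification with TensorFlow. I have about 95,000 images, and each image has a corresponding label vector. Every image has 7 labels, represented as a tensor of size 7. Each image has shape (299, 299, 3).
How do I write the images, together with their label vectors/tensors, to a .tfrecords file?
My current code/approach: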
def get_decode_and_resize_image(image_id):
    image_queue = tf.train.string_input_producer(['../../original-data/' + image_id + ".jpg"])
    image_reader = tf.WholeFileReader()
    image_key, image_value = image_reader.read(image_queue)
    image = tf.image.decode_jpeg(image_value, channels=3)
    resized_image = tf.image.resize_images(image, 299, 299, align_corners=False)
    return resized_image

init_op = tf.initialize_all_variables()
with tf.Session() as sess:
    # Start populating the filename queue.
    sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    # get all labels and image ids
    csv = pd.read_csv('../../filteredLabelsToPhotos.csv')
    # create a writer for writing to the .tfrecords file
    writer = tf.python_io.TFRecordWriter("tfrecords/data.tfrecords")
    for index, row in csv.iterrows():
        # the labels
        image_id = row['photo_id']
        lunch = tf.to_float(row["lunch"])
        dinner = tf.to_float(row["dinner"])
        reservations = tf.to_float(row["TK"])
        outdoor = tf.to_float(row["OS"])
        waiter = tf.to_float(row["WS"])
        classy = tf.to_float(row["c"])
        gfk = tf.to_float(row["GFK"])
        labels_list = [lunch, dinner, reservations, outdoor, waiter, classy, gfk]
        labels_tensor = tf.convert_to_tensor(labels_list)
        # get the corresponding image
        image_file = get_decode_and_resize_image(image_id=image_id)
        # here : how do I now create a TFExample and write it to the .tfrecords file
    coord.request_stop()
    coord.join(threads)
After I have created the .tfrecords file, can I read it from my TensorFlow training code and have the data batched automatically?
Answer 0 (score: 0)
To create a tf.train.Example, just do example = tf.train.Example(). You can then manipulate it using the normal protocol buffers Python API.
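As a rough illustration of what manipulating it through the protocol buffers API can look like, here is a minimal sketch. The feature names 'labels' and 'image_raw' and the values stored under them are placeholders I made up for the example, not anything the question prescribes.

```python
import tensorflow as tf

# Start from an empty message and fill it in through the generated protobuf API.
example = tf.train.Example()

# 'labels' is a hypothetical feature name; the seven values are placeholders.
# Indexing the `feature` map creates the entry if it does not exist yet.
example.features.feature['labels'].float_list.value.extend(
    [1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0])

# 'image_raw' is also a hypothetical name; real code would store encoded image bytes.
example.features.feature['image_raw'].bytes_list.value.append(b'placeholder bytes')

# Serialize to the byte string that a TFRecord writer expects.
serialized = example.SerializeToString()

# Round-trip to confirm the message parses back.
restored = tf.train.Example.FromString(serialized)
print(list(restored.features.feature['labels'].float_list.value))
```

The serialized string is what you would hand to the record writer's write() method.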
Answer 1 (score: 0)
To expand on Alexandre's answer, you can do something like this:
# Set this up before your for-loop, you'll use this repeatedly
tfrecords_filename = 'myfile.tfrecords'
writer = tf.python_io.TFRecordWriter(tfrecords_filename)

# Then within your for-loop, you can write like so:
for ...:
    # here : how do I now create a TFExample and write it to the .tfrecords file
    example = tf.train.Example(features=tf.train.Features(feature={
        'image_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_file])),
        # the other features, labels you wish to include go here too
    }))
    writer.write(example.SerializeToString())

# then finally, don't forget to close the writer.
writer.close()
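This assumes you have already converted the image into a byte array held in the image_file variable.
I adapted this from this very helpful post, which goes into detail on serializing images. It may help you if my assumption above does not hold.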
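For the multi-label case in the question, and for the follow-up about reading the file back in batches, a minimal end-to-end sketch might look like the one below. It makes several assumptions that are not in the question: a TensorFlow 2.x install with eager execution (so tf.io.TFRecordWriter stands in for tf.python_io.TFRecordWriter, and tf.data replaces the queue runners), the hypothetical feature names 'image_raw' and 'labels', a temporary output path, and two tiny synthetic images in place of the real JPEG files. It stores the encoded JPEG bytes rather than decoded pixels and defers decoding and resizing to read time, which is one reasonable design choice rather than the only one.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: two tiny images and their 7-element label vectors.
tmp_dir = tempfile.mkdtemp()
records_path = os.path.join(tmp_dir, 'data.tfrecords')
samples = [
    (np.zeros((8, 8, 3), dtype=np.uint8), [1, 0, 0, 1, 0, 1, 0]),
    (np.full((8, 8, 3), 255, dtype=np.uint8), [0, 1, 1, 0, 1, 0, 1]),
]

# Write one Example per image: encoded JPEG bytes plus the labels as floats.
with tf.io.TFRecordWriter(records_path) as writer:
    for pixels, labels in samples:
        # With real files you could instead use open(path, 'rb').read().
        jpeg_bytes = tf.io.encode_jpeg(pixels).numpy()
        example = tf.train.Example(features=tf.train.Features(feature={
            'image_raw': tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[jpeg_bytes])),
            'labels': tf.train.Feature(
                float_list=tf.train.FloatList(value=[float(v) for v in labels])),
        }))
        writer.write(example.SerializeToString())

# Describe the stored features so they can be parsed back.
feature_spec = {
    'image_raw': tf.io.FixedLenFeature([], tf.string),
    'labels': tf.io.FixedLenFeature([7], tf.float32),
}

def parse_record(serialized):
    """Decode one serialized Example into a (299x299x3 image, 7 labels) pair."""
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(parsed['image_raw'], channels=3)
    image = tf.image.resize(image, [299, 299])
    return image, parsed['labels']

# Read the file back and let tf.data handle the batching.
dataset = tf.data.TFRecordDataset(records_path).map(parse_record).batch(2)

for images, labels in dataset:
    print(images.shape, labels.shape)
```

With real data you would typically add shuffle() before batch() and pick a batch size that suits your model.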