How to use TensorFlow queues with multi-label data?

Time: 2017-06-02 12:26:23

Tags: python tensorflow queue multilabel-classification

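I have a large database of images with corresponding labels. I want to classify them with a CNN, but my question is about TensorFlow's input pipeline. Because the database is too large, I have to use queues.

My data is described by a text file in the following format: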

my_data.txt:

filename1, label1, label2, label3
filename2, label4, label5
filename3, label2, label6, label7, label8
...

I parse it into a Python list like this: my_data = [[filename1, filename2, filename3, ...], [[label1, label2, label3], [label4, label5], [label2, label6, label7, label8], ...]]
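
The parsing step described above could look like the sketch below. It assumes each line holds a filename followed by zero or more labels, all separated by commas. `parse_label_file` is a hypothetical helper name, not something from the original post, and an in-memory string stands in for the real file.

```python
import io

def parse_label_file(lines):
    """Split 'filename, label, label, ...' lines into two parallel lists."""
    filenames, label_lists = [], []
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        # Skip blank lines and lines that have no filename.
        if not fields or not fields[0]:
            continue
        filenames.append(fields[0])
        label_lists.append(fields[1:])
    return filenames, label_lists

# Stand-in for open("my_data.txt"); the contents mirror the sample above.
sample = io.StringIO(
    "filename1, label1, label2, label3\n"
    "filename2, label4, label5\n"
    "filename3, label2, label6, label7, label8\n"
)
filenames, label_lists = parse_label_file(sample)
my_data = [filenames, label_lists]
```

Note that `label_lists` is ragged here: its rows have 3, 2, and 4 entries.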

I want to put this into a queue and dequeue each filename together with its corresponding labels.

What I tried:

import tensorflow as tf

queue = tf.train.slice_input_producer(my_data, capacity=100, shuffle=True)
filename = queue[0]
labels = queue[1:]
file_content = tf.read_file(filename)
record = tf.image.decode_png(file_content, channels=3)
image = preprocess(record) # to put shape at a regular value
data_batch = tf.train.batch([image, labels])

init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())

with tf.Session() as sess:
    sess.run(init)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)
    png_images, labels = sess.run(data_batch)

This fails with the following traceback:

Traceback (most recent call last):
  File "test_queue.py", line 19, in <module>
    shuffle=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 298, in slice_input_producer
    tensor_list = ops.convert_n_to_tensor_or_indexed_slices(tensor_list)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 925, in convert_n_to_tensor_or_indexed_slices
    values=values, dtype=dtype, name=name, as_ref=False)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 896, in internal_convert_n_to_tensor_or_indexed_slices
    value, dtype=dtype, name=n, as_ref=as_ref))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 857, in internal_convert_to_tensor_or_indexed_slices
    as_ref=as_ref)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 702, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 111, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 100, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 379, in make_tensor_proto
    _GetDenseDimensions(values)))
ValueError: Argument must be a dense tensor: [['label1', 'label2', 'label3'], ['label4', 'label5'], ['label2', 'label6', 'label7', 'label8']] - got shape [3], but wanted [3, 3].
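
The last line of the traceback suggests that TensorFlow tried to build a single dense (rectangular) tensor from the label lists and found rows of different lengths. A quick check in plain Python, independent of TensorFlow, makes the raggedness visible. The names `row_lengths` and `is_rectangular` are illustrative only.

```python
label_lists = [
    ["label1", "label2", "label3"],
    ["label4", "label5"],
    ["label2", "label6", "label7", "label8"],
]

# A dense tensor needs every row to have the same length.
row_lengths = [len(labels) for labels in label_lists]
is_rectangular = len(set(row_lengths)) <= 1

print(row_lengths)     # [3, 2, 4]
print(is_rectangular)  # False
```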

It works if I give every filename the same number of labels. What should I do when the number of labels varies?
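
One common way around this in multi-label classification is to encode each image's labels as a fixed-length multi-hot vector, so that every row has the same length. The sketch below does the encoding in plain Python. `build_vocabulary` and `to_multi_hot` are hypothetical helper names, and sorting the labels to assign column indices is an arbitrary choice.

```python
def build_vocabulary(label_lists):
    """Map each distinct label to a column index (sorted for determinism)."""
    distinct = sorted({label for labels in label_lists for label in labels})
    return {label: index for index, label in enumerate(distinct)}

def to_multi_hot(label_lists, vocabulary):
    """Turn ragged label lists into equal-length rows of 0s and 1s."""
    rows = []
    for labels in label_lists:
        row = [0] * len(vocabulary)
        for label in labels:
            row[vocabulary[label]] = 1
        rows.append(row)
    return rows

label_lists = [
    ["label1", "label2", "label3"],
    ["label4", "label5"],
    ["label2", "label6", "label7", "label8"],
]
vocabulary = build_vocabulary(label_lists)
multi_hot = to_multi_hot(label_lists, vocabulary)

for row in multi_hot:
    print(row)
```

Because `multi_hot` is rectangular, passing `[filenames, multi_hot]` to `tf.train.slice_input_producer` should no longer trigger the dense-tensor error, and each dequeue would then yield one filename and one label vector (`filename, labels = queue`). A multi-hot target also pairs naturally with a sigmoid cross-entropy loss. Padding every label list to a common length with a sentinel value is an alternative if the original label order needs to be preserved.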
