I have a huge video dataset; for each video I have a folder containing its frames. I am writing one TFRecord per video, using a SequenceExample whose FeatureLists hold the video's frames.
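For reference, here is a minimal sketch of how one video's frames could be packed into a SequenceExample. It is not the helper from my script: the feature keys "frames" and "num_frames" are made up for illustration, and it assumes every frame is already available as a bytes object (for example the raw JPEG file contents).

```python
import tensorflow as tf


def make_example(encoded_frames):
    """Pack a list of per-frame byte strings into one SequenceExample.

    Assumption: each element of encoded_frames is a bytes object.
    The keys "frames" and "num_frames" are illustrative names only.
    """
    # One Feature per frame, collected into a single FeatureList.
    frame_features = [
        tf.train.Feature(bytes_list=tf.train.BytesList(value=[frame]))
        for frame in encoded_frames
    ]
    feature_lists = tf.train.FeatureLists(
        feature_list={"frames": tf.train.FeatureList(feature=frame_features)}
    )
    # Per-video metadata goes into the context.
    context = tf.train.Features(
        feature={
            "num_frames": tf.train.Feature(
                int64_list=tf.train.Int64List(value=[len(encoded_frames)])
            )
        }
    )
    return tf.train.SequenceExample(context=context, feature_lists=feature_lists)
```

The serialized form, `make_example(frames).SerializeToString()`, is what gets passed to the TFRecord writer.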
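I iterate over the list of videos with a Python thread pool, where each thread works on one video. I then use TensorFlow queues to process the frames.
My script is structured as follows: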
import concurrent.futures
import os

import tensorflow as tf

videos_id = os.listdir(dset_dir)

def main_loop(video):
    frames_list = get_frames(video)
    filename_queue = tf.train.string_input_producer(frames_list)
    reader = tf.WholeFileReader()
    key, value = reader.read(filename_queue)
    my_img = tf.image.decode_jpeg(value)
    # resize, etc ...
    init_op = tf.global_variables_initializer()
    sess = tf.InteractiveSession()
    with sess.as_default():
        sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    # accumulating images of 1 video
    image_list = []
    for i in range(len(frames_list)):
        image_list.append(my_img.eval(session=sess))
    coord.request_stop()
    coord.join(threads)
    writer = tf.python_io.TFRecordWriter(tfrecord_name)
    ex = make_example(image_list)
    writer.write(ex.SerializeToString())
    writer.close()
    sess.close()

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    future = {executor.submit(
        main_loop, video): video for video in videos_id}
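Note that the dictionary of futures above is built but never read, so any exception raised inside a worker stays stored in its future and is never reported. A minimal sketch of collecting results and errors per video, using only the standard library, might look like this. `process_video` is a hypothetical stand-in for `main_loop`, and `run_all` is a name introduced here for illustration.

```python
import concurrent.futures


def process_video(video):
    # Hypothetical stand-in for main_loop(video); it fails for one input
    # purely to demonstrate the error handling.
    if video == "broken":
        raise ValueError("could not read frames for %s" % video)
    return "%s.tfrecord" % video


def run_all(videos, max_workers=4):
    done, failed = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(process_video, v): v for v in videos}
        for future in concurrent.futures.as_completed(futures):
            video = futures[future]
            try:
                # result() re-raises any exception raised in the worker.
                done[video] = future.result()
            except Exception as exc:
                failed[video] = exc
    return done, failed


if __name__ == "__main__":
    done, failed = run_all(["a", "broken", "b"])
    print(sorted(done), sorted(failed))
```

With this shape, a failure on one video is recorded against that video's name instead of being lost, which makes it easier to see which inputs trigger the problem.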
After roughly a thousand videos, I get the following exception (repeated many times, for different "Thread-id" values):
Exception in thread Thread-344395:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/desktop/Documents/tensorflow-py3/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 254, in _run
    coord.request_stop(e)
  File "/home/desktop/Documents/tensorflow-py3/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 211, in request_stop
    six.reraise(*sys.exc_info())
  File "/home/desktop/Documents/tensorflow-py3/lib/python3.5/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/desktop/Documents/tensorflow-py3/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 238, in _run
    enqueue_callable()
  File "/home/desktop/Documents/tensorflow-py3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1235, in _single_operation_run
    target_list_as_strings, status, None)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/desktop/Documents/tensorflow-py3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
  [[Node: input_producer_319/input_producer_319_EnqueueMany = QueueEnqueueManyV2[Tcomponents=[DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input_producer_319, input_producer_319/Identity)]]
Any idea why this happens? Thanks in advance.
Answer 0 (score: 0)
I use the following, noticeably cleaner, way of stopping the coordinator. I am not sure whether it will help.
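# ....
# this will throw an OutOfRangeError after 1 epoch, i.e. one video
filename_queue = tf.train.string_input_producer(frames_list, num_epochs=1)
# ....
# num_epochs is tracked in a local variable, so local variables
# must be initialized before the queue runners start
sess.run(tf.local_variables_initializer())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# ...
# After everything is built, start the loop.
image_list = []
try:
    while not coord.should_stop():
        # read your frame
        image_list.append(sess.run(my_img))
except tf.errors.OutOfRangeError:
    # means the loop has finished
    # write your tfrecord
    pass
finally:
    # When done, ask the threads to stop.
    coord.request_stop()
# Wait for the queue-runner threads to actually finish.
coord.join(threads)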