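I implemented the following graph in TensorFlow: there is a queue Q, and a background thread enqueues tensors into it. In the main thread, I dequeue elements from Q one at a time.
My code can be simplified as follows. As noted in the comment near the end, everything works if I sleep for 1 second before running the dequeue op; if the dequeue runs immediately, the exception shown after the code is raised.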
    import time
    import threading

    import tensorflow as tf

    sess = tf.InteractiveSession()
    coord = tf.train.Coordinator()
    q = tf.FIFOQueue(32, dtypes=tf.int32)

    def loop(g):
        with g.as_default():
            # The enqueue op is added to the graph from the background thread.
            enqueue_op = q.enqueue(1, name="example_enqueue")
            for i in range(20):
                if coord.should_stop():
                    return
                try:
                    sess.run(enqueue_op)
                except tf.errors.CancelledError:
                    print("enqueue cancelled")

    threads = [
        threading.Thread(target=loop, args=(tf.get_default_graph(),))
    ]

    sess.run(tf.initialize_all_variables())
    for t in threads: t.start()

    # If I sleep for 1 second, it will be fine!
    # time.sleep(1)
    # The dequeue op is added to the graph here, while the thread is already running.
    print(sess.run(q.dequeue()))

    coord.request_stop()
    coord.join(threads)
    sess.close()
    HanXus-MacBook-Pro:BrainSeg hanxu$ python3 -m playgrounds.7
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 715, in _do_call
        return fn(*args)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 697, in _run_fn
        status, run_metadata)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/contextlib.py", line 66, in __exit__
        next(self.gen)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/framework/errors.py", line 450, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    tensorflow.python.framework.errors.NotFoundError: FetchOutputs node fifo_queue_Dequeue:0: not found

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/runpy.py", line 170, in _run_module_as_main
        "__main__", mod_spec)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/Users/hanxu/Downloads/BrainSeg/playgrounds/7.py", line 32, in <module>
        print(sess.run(q.dequeue()))
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 372, in run
        run_metadata_ptr)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 636, in _run
        feed_dict_string, options, run_metadata)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 708, in _do_run
        target_list, options, run_metadata)
      File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 728, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors.NotFoundError: FetchOutputs node fifo_queue_Dequeue:0: not found
Could anyone help? Many thanks!
I am using TensorFlow 0.9.0rc0.
My actual case is a bit more complicated: the tensor being enqueued is different on every iteration, so moving the enqueue op to the main thread is not trivial :( I don't know how to do it. Please help :)
Answer 0 (score: 2)
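This is an issue with older versions of TensorFlow (pre-0.9) that was fixed in version 0.9. The problem is that adding nodes to a graph (i.e. in your calls to q.enqueue() and q.dequeue()) is not thread-safe while other threads (i.e. your loop() thread) are using the graph.
There are two things you need to fix to avoid the race condition (in versions before 0.9):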
1. Don't call q.enqueue() in the loop() thread. Instead, create the op in the main thread. For example:

    q = tf.FIFOQueue(32, dtypes=tf.int32)
    enqueue_op = q.enqueue(1, name="example_enqueue")

    def loop(g):
        for i in range(20):
            if coord.should_stop():
                return
            try:
                sess.run(enqueue_op)
            except tf.errors.CancelledError:
                print("enqueue cancelled")

2. Move the call to q.dequeue() (which adds a node to the graph) to before you start the loop() thread.
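The two fixes above can be sketched together as follows. This is only an illustration under stated assumptions, not the original poster's or the answerer's code: it assumes a TensorFlow installation that provides `tensorflow.compat.v1` (on the 0.x releases discussed here, plain `import tensorflow as tf` would be used instead), and it assumes that feeding a `tf.placeholder` is an acceptable way to enqueue a different value on each iteration while still building the enqueue op only once. The names `value_ph`, `dequeued_t`, and `first` are made up for the example. The `graph.finalize()` call is an optional guard that is not part of the fix itself.

```python
import threading

# Assumption: a TensorFlow build that ships the graph-mode compatibility API.
# On the 0.x releases discussed above, use `import tensorflow as tf` instead.
import tensorflow.compat.v1 as tf

graph = tf.Graph()
with graph.as_default():
    q = tf.FIFOQueue(32, dtypes=tf.int32)
    # A single placeholder-fed enqueue op serves every value, so the
    # background thread never has to add a node to the graph.
    value_ph = tf.placeholder(tf.int32, shape=[])
    enqueue_op = q.enqueue(value_ph)
    # Build the dequeue op up front as well, before any thread starts.
    dequeued_t = q.dequeue()

# Optional guard: make the graph read-only so that any later attempt to
# add a node raises an error.
graph.finalize()

sess = tf.Session(graph=graph)
coord = tf.train.Coordinator()

def loop():
    # Only runs ops that already exist; it never builds new ones.
    for i in range(20):
        if coord.should_stop():
            return
        try:
            sess.run(enqueue_op, feed_dict={value_ph: i})
        except tf.errors.CancelledError:
            print("enqueue cancelled")
            return

threads = [threading.Thread(target=loop)]
for t in threads:
    t.start()

# Blocks until the background thread has enqueued at least one element.
first = sess.run(dequeued_t)
print(first)

coord.request_stop()
coord.join(threads)
sess.close()
```

With the graph finalized, an accidental q.enqueue() or q.dequeue() call made after the threads have started fails with an explicit error, which is easier to diagnose than an intermittent NotFoundError.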