Question

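I would like to use a "fully connected" model for both training and testing. I have a train.tfrecords file and a test.tfrecords file. I believe the right approach is to create a separate graph for training and for testing.

The fundamental problem is that relying on the OutOfRange error and coord.should_stop() breaks every kind of encapsulation I have tried.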

# main.py: I would like main to look clean, like this:
import tensorflow as tf

session = tf.Session()
model.init(session, "train.tfrecords", "test.tfrecords")
model.fit()
model.eval()
session.close()

This works perfectly well as long as you call fit or eval only once (as you can imagine). I'm roughly following this implementation.

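# model.py
import tensorflow as tf
from graph import MyGraph

# Build the training graph first, then reuse its variables for the test graph.
with tf.variable_scope(scope, reuse=False):
    train_graph = MyGraph(train_queue.batch)
with tf.variable_scope(scope, reuse=True):
    test_graph = MyGraph(test_queue.batch)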

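def fit(self):
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=self.session, coord=coord)
    try:
        while not coord.should_stop():
            self.session.run(self.train_graph....)
        # etc.
    except tf.errors.OutOfRangeError:
        # The input queue is exhausted, so ask the runner threads to stop.
        coord.request_stop()
    finally:
        # Wait for the queue-runner threads started above to finish.
        coord.join(threads)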

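def eval_on_test_set(self):  # similar to fit
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=self.session, coord=coord)
    try:
        while not coord.should_stop():
            self.session.run(self.test_graph....)
    except tf.errors.OutOfRangeError:
        # The test queue is exhausted, so ask the runner threads to stop.
        coord.request_stop()
    finally:
        # Wait for the queue-runner threads started above to finish.
        coord.join(threads)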

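Evidently, what happens is that coord (1) shuts down the threads, which in turn (2) close the queues, and those cannot easily be reopened later. I just don't know how to get around this. coord.clear_stop() may be part of the puzzle, but I could not get it to work.

Secondly, I have a training queue and a test queue, but they are never used at the same time. Ideally, the solution would not involve dedicated test/train threads that sit waiting half of the time.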
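One possible way around the queue-closing behaviour described above is to start the queue runners once, run each phase for a fixed number of steps, and request the stop only when the whole session is finished. The sketch below assumes the input queues are built without an epoch limit, so they never raise OutOfRange. `QueueRunnerScope` is a hypothetical helper, not a TensorFlow class; the coordinator and the thread-starting function are passed in and are expected to behave like `tf.train.Coordinator` and `tf.train.start_queue_runners`.

```python
class QueueRunnerScope:
    """Start queue-runner threads once, reuse them for every phase, stop once.

    Hypothetical helper, not part of TensorFlow. `coordinator` is assumed to
    behave like tf.train.Coordinator and `start_runners` like
    tf.train.start_queue_runners; both are injected by the caller.
    """

    def __init__(self, session, coordinator, start_runners):
        self.session = session
        self.coord = coordinator
        # Threads are created a single time and kept until close() is called.
        self.threads = start_runners(sess=session, coord=coordinator)

    def run_steps(self, fetches, num_steps):
        """Run `fetches` up to `num_steps` times without stopping the threads."""
        results = []
        for _ in range(num_steps):
            if self.coord.should_stop():
                break  # a stop was already requested, so do not run further
            results.append(self.session.run(fetches))
        return results

    def close(self):
        """Request the stop exactly once, then wait for the threads to exit."""
        self.coord.request_stop()
        self.coord.join(self.threads)
```

With TensorFlow 1.x this might be constructed as `QueueRunnerScope(session, tf.train.Coordinator(), tf.train.start_queue_runners)`, with fit and eval each calling `run_steps` and `close` called just before `session.close()`. Note that the runner threads of the idle queue stay alive and blocked while the other phase runs, which is the situation the second point hoped to avoid.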

Answer 1

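It looks like the coordinator is somehow tied to the current thread context. Within the same thread context, even if you have different graphs, sessions, and coordinators, one coordinator terminating may force the other coordinators to exit. I tried to avoid this problem by running training and evaluation in separate threads. Hope this helps.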
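The separate-thread workaround could look roughly like the sketch below. `run_in_thread` is a hypothetical helper written for illustration, and the phase functions passed to it (for example the question's `fit` and `eval_on_test_set`) are assumed to create their own coordinator internally. Whether a separate thread really isolates the coordinators is the observation made above, not something this sketch demonstrates.

```python
import threading


def run_in_thread(phase_fn, name):
    """Run one phase (e.g. training or evaluation) in its own thread.

    Blocks until the phase finishes, returns its result, and re-raises
    any exception the phase raised so failures are not silently lost.
    """
    outcome = {}

    def target():
        try:
            outcome["result"] = phase_fn()
        except Exception as exc:  # hand the error back to the caller's thread
            outcome["error"] = exc

    worker = threading.Thread(target=target, name=name)
    worker.start()
    worker.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome.get("result")
```

Usage would then be along the lines of `run_in_thread(model.fit, "train")` followed by `run_in_thread(model.eval_on_test_set, "eval")`, so each phase runs in a fresh thread while the main script still executes them one after the other.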

Tensorflow restarting queue runners: different train and test queues
