I am trying to restore a model immediately after saving it, for debugging purposes, but I always end up with the following error:
venv2/local/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py:636: RuntimeWarning: Unexpected end-group tag: Not all data was converted
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "src/python/kmer/learning.py", line 269, in <module>
    TensorflowTrainingJob.launch(resume_from_reduce = c.resume_from_reduce)
  File "src/python/kmer/learning.py", line 66, in launch
    job.execute()
  File "kmer/map_reduce.py", line 71, in execute
    self.distribute_workload()
  File "kmer/map_reduce.py", line 107, in distribute_workload
    self.run_batch(self.batch[index])
  File "kmer/map_reduce.py", line 121, in run_batch
    batch[track] = self.transform(batch[track], track)
  File "src/python/kmer/learning.py", line 110, in transform
    saver.restore(session, tf.train.latest_checkpoint(os.path.join("model save location")))
AttributeError: 'NoneType' object has no attribute 'restore'
It seems that the model has not been fully saved by the time I try to restore it. Here is the code:
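import time
import tensorflow as tf

# Excerpt: init, optimizer, cost, x, y, X, Y and the first saver are defined earlier.
with tf.Session() as session:
    session.run(init)
    for epoch in range(5):
        o, c = session.run([optimizer, cost], feed_dict = {x: X, y: Y})
    saver.save(session, "model save location")
# Wait in case the save is asynchronous.
time.sleep(10)
# Restore into a fresh graph and session.
graph = tf.Graph()
session = tf.Session(graph = graph)
with graph.as_default():
    saver = tf.train.import_meta_graph("model save location")
    saver.restore(session, tf.train.latest_checkpoint("model save location"))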
I tried adding a call to time.sleep in between, in case the save operation is asynchronous, but it does not seem to help.
The exact same code works when it runs later as part of a different script, so there must be some kind of timing issue.
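One way to test the timing theory without relying on sleep is to check which checkpoint files exist on disk right after saver.save returns. The sketch below uses only the standard library and assumes the default TF 1.x V2 checkpoint layout (<prefix>.meta, <prefix>.index, one or more <prefix>.data-* shards, and a checkpoint state file in the same directory). The helper name checkpoint_files is hypothetical and is not part of TensorFlow.

```python
import os


def checkpoint_files(prefix):
    """Report which files of a checkpoint exist for a given save prefix.

    Hypothetical helper, not a TensorFlow API. Assumes the default TF 1.x
    V2 layout: <prefix>.meta, <prefix>.index, <prefix>.data-* shards and a
    `checkpoint` state file in the directory that contains the prefix.
    """
    directory = os.path.dirname(os.path.abspath(prefix))
    shard_start = os.path.basename(prefix) + ".data-"
    # Look for at least one data shard next to the prefix.
    has_data = os.path.isdir(directory) and any(
        name.startswith(shard_start) for name in os.listdir(directory)
    )
    return {
        "meta": os.path.isfile(prefix + ".meta"),
        "index": os.path.isfile(prefix + ".index"),
        "data": has_data,
        "state": os.path.isfile(os.path.join(directory, "checkpoint")),
    }


if __name__ == "__main__":
    # Same prefix as the one passed to saver.save in the code above.
    print(checkpoint_files("model save location"))
```

If every entry is True immediately after the save, the files are already on disk and a timing problem becomes less likely. It is also worth comparing the reported paths with the arguments in the restore code: the .meta file is the one tf.train.import_meta_graph is meant to read, while tf.train.latest_checkpoint takes the directory that holds the checkpoint state file.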