No checkpoint file found

Time: 2019-07-19 15:15:13

Tags: tensorflow

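I am restoring two pre-trained models into a single model so that I can fine-tune it with TensorFlow on Google Colab. The first model is restored successfully, but after the second one is restored, the message "No checkpoint file found" is printed, though not as a warning or an error. I don't know whether this means the restore failed.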

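To debug this, I changed the second model's checkpoint path to point at the first model's directory. It still reported "No checkpoint file found", and then reported missing tensor names. Does this mean that, despite the "No checkpoint file found" message, the model is still restored as long as all the required files are in the directory?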

Here is my relevant code:

with tf.train.MonitoredTrainingSession(checkpoint_dir=train_dir,
    hooks=[tf.train.StopAtStepHook(last_step=max_steps),
           tf.train.NanTensorHook(loss),
           tf.train.CheckpointSaverHook(checkpoint_dir=train_dir, saver=saver3, save_steps=1000),
           _LoggerHook(),
           _LoggerHook2(),
           #_LoggerHook3(),
           _LoggerHook4()],
    #scaffold=scaffold,
    config=tf.ConfigProto(
        log_device_placement=log_device_placement)) as mon_sess:
    #mon_sess.run(tf.global_variables_initializer())
    # Restore the first model's variables.
    saver1.restore(mon_sess, "drive/My Drive/ckpt_basic_new/model.ckpt-100000")
    # Debugging attempt: point the second saver at the first model's checkpoint.
    #saver2.restore(mon_sess, "drive/My Drive/ckpt_basic_new/model.ckpt-100000")
    # Restore the second model's variables.
    saver2.restore(mon_sess, "drive/My Drive/ckpt_cal/model.ckpt-57376")
    # Fine-tune until a hook requests a stop.
    while not mon_sess.should_stop():
        mon_sess.run(train_op, feed_dict={training:True})

Output:

I0719 15:11:00.575512 140037786351488 saver.py:1280] Restoring parameters from drive/My Drive/ckpt_basic_new/model.ckpt-100000
2019-07-19 15:11:06.927919: step 0, loss = 418.70550537 (5749.2 examples/sec; 0.022 sec/batch)
I0719 15:11:11.685403 140037786351488 saver.py:1280] Restoring parameters from drive/My Drive/ckpt_cal/model.ckpt-57376
No checkpoint file found
W0719 15:11:12.715127 140037786351488 basic_session_run_hooks.py:724] It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
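
One rough way to separate "the files for a checkpoint prefix exist" from "a checkpoint state file was found in the directory" is to inspect the directory directly. The sketch below uses only the Python standard library and assumes the TensorFlow V2 checkpoint layout: a `<prefix>.index` file, one or more `<prefix>.data-*` shards, and a text file named `checkpoint` in the same directory. The helper name `describe_checkpoint` is hypothetical and is not part of TensorFlow. A message such as "No checkpoint file found" is typically printed by code that looks up the `checkpoint` state file (for example through `tf.train.get_checkpoint_state`), so it may be unrelated to whether restoring from an explicit prefix worked.

```python
import glob
import os


def describe_checkpoint(prefix):
    """Report which on-disk pieces of a checkpoint prefix exist.

    Assumption: V2 checkpoint layout (`<prefix>.index`, `<prefix>.data-*`
    shards, and a `checkpoint` state file in the same directory).
    This is a hypothetical helper, not a TensorFlow API.
    """
    directory = os.path.dirname(prefix)
    return {
        # Index file mapping tensor names to their location in the shards.
        "index": os.path.isfile(prefix + ".index"),
        # Number of data shards found for this prefix.
        "data_shards": len(glob.glob(glob.escape(prefix) + ".data-*")),
        # State file that records the latest checkpoint in the directory.
        "state_file": os.path.isfile(os.path.join(directory, "checkpoint")),
    }


if __name__ == "__main__":
    # Prefixes taken from the question; adjust to the actual paths.
    for prefix in ("drive/My Drive/ckpt_basic_new/model.ckpt-100000",
                   "drive/My Drive/ckpt_cal/model.ckpt-57376"):
        print(prefix, describe_checkpoint(prefix))
```

If `index` is true and `data_shards` is at least 1 while `state_file` is false, the prefix itself looks complete under this assumption and only the directory-level state file is missing.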

Thanks for your reply!

0 answers:

No answers yet