tensorflow.python.framework.errors_impl.UnknownError: Failed to rename: Input/output error

Time: 2019-02-25 08:47:39

Tags: python tensorflow

When I train a classifier using TensorFlow eager mode, I run into the following error.

Steps 151, Train loss is 0.00039766659028828144, learning_rate is 0.009999999776482582
Traceback (most recent call last):
  File "E:/Tensorflow_Experiments/train_alexnet.py", line 1354, in <module>

The error occurs after a few iterations, as shown above, but the number of iterations completed before it fails differs from run to run.

  File "C:\Software\Anaconda3\lib\site-packages\tensorflow\contrib\eager\python\saver.py", line 156, in save
    None, file_prefix, write_meta_graph=False, global_step=global_step)
  File "C:\Software\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1451, in save
    save_relative_paths=self._save_relative_paths)
  File "C:\Software\Anaconda3\lib\site-packages\tensorflow\python\training\checkpoint_management.py", line 237, in update_checkpoint_state_internal
    text_format.MessageToString(ckpt))
  File "C:\Software\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 436, in atomic_write_string_to_file
    rename(temp_pathname, filename, overwrite)
  File "C:\Software\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 415, in rename
    compat.as_bytes(oldname), compat.as_bytes(newname), overwrite, status)
  File "C:\Software\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Failed to rename: ./hundred_models\model1\checkpoint.tmpc4b15b8c1e2d48b394f810909a0838b6 to: ./hundred_models\model1\checkpoint : \udcbeܾ\udcf8\udcb7\udcc3\udcceʡ\udca3
; Input/output error

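1 Answer:

Answer 0 (score: 0)

This answer comes late, but the solution linked here fixed the problem for me.

Check whether you have a similarly named folder, or, as in my code, a CSV logger that interferes with checkpoint creation.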
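
As a rough illustration of that check, here is a minimal sketch using only the Python standard library. It assumes the conflict is visible on disk as a path collision: either a folder named `checkpoint` sitting where the saver wants to rename its temporary file, or another writer (such as a CSV logger) pointed at that same path. The function name `find_checkpoint_conflicts`, the `other_outputs` parameter, and the `log.csv` file name are hypothetical and not part of TensorFlow. It will not detect a file that is merely held open by another process.

```python
import os


def _norm(path):
    # Absolute, case-normalized form so equivalent spellings compare equal.
    return os.path.normcase(os.path.abspath(path))


def find_checkpoint_conflicts(ckpt_dir, other_outputs=(), state_name="checkpoint"):
    """Return descriptions of paths that could block writing the state file.

    Hypothetical helper: `other_outputs` lists paths that other parts of the
    training script write to (for example a CSV logger's output file).
    """
    problems = []
    state_path = os.path.join(ckpt_dir, state_name)

    # Per the traceback, a temporary file is renamed onto <ckpt_dir>/checkpoint;
    # a folder with that name would get in the way of the rename.
    if os.path.isdir(state_path):
        problems.append("%s is a directory" % state_path)

    # Another writer targeting the same path may hold the file open while
    # the saver tries to replace it.
    for out in other_outputs:
        if _norm(out) == _norm(state_path):
            problems.append("%s is also written by other code" % out)

    return problems


if __name__ == "__main__":
    # Directory taken from the traceback; log.csv is an assumed logger output.
    conflicts = find_checkpoint_conflicts(
        "./hundred_models/model1",
        other_outputs=["./hundred_models/model1/log.csv"],
    )
    for problem in conflicts:
        print("Possible conflict:", problem)
```

Running a check like this once before training starts is cheap. If it reports nothing, the next thing to look at is whether some other code keeps a file open inside the checkpoint directory.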