InternalError in TensorFlow when trying to save a checkpoint

Time: 2017-12-21 05:49:04

Tags: python tensorflow machine-learning

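I am trying to train a simple DNNClassifier on some test data loaded with Pandas. When TensorFlow tries to save a checkpoint, it fails with the error below.

It is an internal error, and I cannot find any information about it anywhere in the documentation.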

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InternalError'>, Unable to get element as bytes.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/bets_model/model.ckpt.
---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1322     try:
-> 1323       return fn(*args)
   1324     except errors.OpError as e:

/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1301                                    feed_dict, fetch_list, target_list,
-> 1302                                    status, run_metadata)
   1303 

/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    472             compat.as_text(c_api.TF_Message(self.status.status)),
--> 473             c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive

NotFoundError: /tmp/bets_model/model.ckpt-0_temp_d361b25e9071477a9e4ccd44a49a241a; No such file or directory
     [[Node: save/SaveV2 = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"](save/ShardedFilename, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, dnn/hiddenlayer_0/bias/part_0/read, dnn/dnn/hiddenlayer_0/bias/part_0/Adagrad/read, dnn/hiddenlayer_0/kernel/part_0/read, dnn/dnn/hiddenlayer_0/kernel/part_0/Adagrad/read, dnn/hiddenlayer_1/bias/part_0/read, dnn/dnn/hiddenlayer_1/bias/part_0/Adagrad/read, dnn/hiddenlayer_1/kernel/part_0/read, dnn/dnn/hiddenlayer_1/kernel/part_0/Adagrad/read, dnn/hiddenlayer_2/bias/part_0/read, dnn/dnn/hiddenlayer_2/bias/part_0/Adagrad/read, dnn/hiddenlayer_2/kernel/part_0/read, dnn/dnn/hiddenlayer_2/kernel/part_0/Adagrad/read, dnn/logits/bias/part_0/read, dnn/dnn/logits/bias/part_0/Adagrad/read, dnn/logits/kernel/part_0/read, dnn/dnn/logits/kernel/part_0/Adagrad/read, global_step)]]

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/training/saver.py in save(self, sess, save_path, global_step, latest_filename, meta_graph_suffix, write_meta_graph, write_state)
   1572               self.saver_def.save_tensor_name,
-> 1573               {self.saver_def.filename_tensor_name: checkpoint_file})
   1574         else:

/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    888       result = self._run(None, fetches, feed_dict, options_ptr,
--> 889                          run_metadata_ptr)
    890       if run_metadata:

/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1119       results = self._do_run(handle, final_targets, final_fetches,
-> 1120                              feed_dict_tensor, options, run_metadata)
   1121     else:

/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1316       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1317                            options, run_metadata)
   1318     else:

/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1335           pass
-> 1336       raise type(e)(node_def, op, message)
   1337 

NotFoundError: /tmp/bets_model/model.ckpt-0_temp_d361b25e9071477a9e4ccd44a49a241a; No such file or directory
     [[Node: save/SaveV2 = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"](save/ShardedFilename, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, dnn/hiddenlayer_0/bias/part_0/read, dnn/dnn/hiddenlayer_0/bias/part_0/Adagrad/read, dnn/hiddenlayer_0/kernel/part_0/read, dnn/dnn/hiddenlayer_0/kernel/part_0/Adagrad/read, dnn/hiddenlayer_1/bias/part_0/read, dnn/dnn/hiddenlayer_1/bias/part_0/Adagrad/read, dnn/hiddenlayer_1/kernel/part_0/read, dnn/dnn/hiddenlayer_1/kernel/part_0/Adagrad/read, dnn/hiddenlayer_2/bias/part_0/read, dnn/dnn/hiddenlayer_2/bias/part_0/Adagrad/read, dnn/hiddenlayer_2/kernel/part_0/read, dnn/dnn/hiddenlayer_2/kernel/part_0/Adagrad/read, dnn/logits/bias/part_0/read, dnn/dnn/logits/bias/part_0/Adagrad/read, dnn/logits/kernel/part_0/read, dnn/dnn/logits/kernel/part_0/Adagrad/read, global_step)]]

1 answer:

Answer 0 (score: 0)

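I figured it out myself. Apparently TensorFlow throws these low-level errors whenever it does not like the incoming data, for example when it sees a value that is not in the vocabulary, or a NaN. I was expecting a more meaningful error message, but it is evidently more low-level than I thought: I need to do my own due diligence on the data before feeding it to the classifier for training.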
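
For anyone hitting the same thing, here is a minimal sketch of the kind of pre-check I mean, assuming the features live in a pandas DataFrame. The column names, the vocabulary and the helper `find_bad_rows` are hypothetical and only for illustration; they are not taken from my actual model.

```python
import pandas as pd


def find_bad_rows(df, vocabularies):
    """Return a boolean Series marking rows with NaN or out-of-vocabulary values.

    `vocabularies` maps a column name to the list of values allowed in it
    (hypothetical helper, not part of pandas or TensorFlow).
    """
    # Rows with at least one missing value (NaN/None) in any column.
    bad = df.isna().any(axis=1)
    for column, vocab in vocabularies.items():
        # Rows whose categorical value is not in the declared vocabulary.
        bad |= ~df[column].isin(vocab)
    return bad


# Made-up data standing in for the real training set.
df = pd.DataFrame({
    "team": ["red", "blue", "green", None],
    "odds": [1.5, float("nan"), 2.0, 3.1],
})

bad = find_bad_rows(df, {"team": ["red", "blue"]})
print(df[bad])       # inspect the offending rows before deciding what to do
clean_df = df[~bad]  # keep only the rows that passed both checks
```

Whether to drop, impute or remap the flagged rows depends on the data; the point is only to look at them before the input function ever sees them.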