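I am trying to train a CNN on the MNIST dataset, but I ran into the following problem as soon as training started.
System information:
Linux version: Ubuntu 16.04
CUDA version: 9.0
cuDNN version: 7.6.1
tensorflow-gpu version: 1.14.0
Code: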
train_history=model.fit(x=x_Train4D_normalize,y=y_TrainOneHot,validation_split=0.2,epochs=10,batch_size=300,verbose=2)
Error:
UnknownError                              Traceback (most recent call last)
in ()
      1 train_history=model.fit(x=x_Train4D_normalize,
      2     y=y_TrainOneHot,validation_split=0.2,
----> 3     epochs=10,batch_size=300,verbose=2)

/home/e420/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1037     initial_epoch=initial_epoch,
   1038     steps_per_epoch=steps_per_epoch,
-> 1039     validation_steps=validation_steps)
   1040
   1041 def evaluate(self, x=None, y=None,

/home/e420/anaconda2/lib/python2.7/site-packages/keras/engine/training_arrays.pyc in fit_loop(model, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps)
    197     ins_batch[i] = ins_batch[i].toarray()
    198
--> 199 outs = f(ins_batch)
    200 outs = to_list(outs)
    201 for l, o in zip(out_labels, outs):

/home/e420/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.pyc in __call__(self, inputs)
   2713     return self._legacy_call(inputs)
   2714
-> 2715     return self._call(inputs)
   2716 else:
   2717     if py_any(is_tensor(x) for x in inputs):

/home/e420/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.pyc in _call(self, inputs)
   2673     fetched = self._callable_fn(*array_vals, run_metadata=self.run_metadata)
   2674 else:
-> 2675     fetched = self._callable_fn(*array_vals)
   2676 return fetched[:len(self.outputs)]
   2677

/home/e420/.local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in __call__(self, *args, **kwargs)
   1437 ret = tf_session.TF_SessionRunCallable(
   1438     self._session._session, self._handle, args, status,
-> 1439     run_metadata_ptr)
   1440 if run_metadata:
   1441     proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/home/e420/.local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.pyc in __exit__(self, type_arg, value_arg, traceback_arg)
    526     None, None,
    527     compat.as_text(c_api.TF_Message(self.status.status)),
--> 528     c_api.TF_GetCode(self.status.status))
    529 # Delete the underlying status object from memory otherwise it stays alive
    530 # as there is a reference to status from this from the traceback due to

UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d_1/convolution}}]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d_1/convolution}}]]
     [[loss/mul/_115]]
0 successful operations.
0 derived errors ignored.
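Answer 0 (score: 0)
Try downgrading tensorflow-gpu. This error may be caused by an incompatibility between the versions involved (the TensorFlow build and the installed CUDA and cuDNN libraries).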
pip uninstall tensorflow-gpu
pip install --upgrade tensorflow-gpu==1.8.0
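Before reinstalling, it may help to check whether the tensorflow-gpu release matches the CUDA version on the machine. The snippet below is only a rough sketch: `TESTED_BUILDS`, `expected_build` and `matches_installed_cuda` are hypothetical names, not part of TensorFlow, and the table is hand-copied from the "tested build configurations" list that TensorFlow publishes for its official Linux GPU wheels, so it says nothing about custom builds.

```python
# Hypothetical helper, not a TensorFlow API.
# Maps a tensorflow-gpu minor release to the (CUDA, cuDNN) versions the
# official Linux wheels were built against, per TensorFlow's published
# "tested build configurations" table (assumption: prebuilt pip wheels).
TESTED_BUILDS = {
    "1.14": ("10.0", "7.4"),
    "1.13": ("10.0", "7.4"),
    "1.12": ("9.0", "7"),
    "1.11": ("9.0", "7"),
    "1.10": ("9.0", "7"),
    "1.9": ("9.0", "7"),
    "1.8": ("9.0", "7"),
    "1.7": ("9.0", "7"),
    "1.6": ("9.0", "7"),
    "1.5": ("9.0", "7"),
}


def expected_build(tf_version):
    """Return (cuda, cudnn) for a version such as '1.14.0', or None if unlisted."""
    minor = ".".join(tf_version.split(".")[:2])
    return TESTED_BUILDS.get(minor)


def matches_installed_cuda(tf_version, installed_cuda):
    """True/False if the release is listed, None if it is not in the table."""
    build = expected_build(tf_version)
    if build is None:
        return None
    return build[0] == installed_cuda


# The setup from the question: tensorflow-gpu 1.14.0 with CUDA 9.0.
print(matches_installed_cuda("1.14.0", "9.0"))
# The version suggested in this answer: tensorflow-gpu 1.8.0 with CUDA 9.0.
print(matches_installed_cuda("1.8.0", "9.0"))
```

According to this table, 1.14.0 expects CUDA 10.0 while 1.8.0 expects CUDA 9.0, which is consistent with the suggestion to downgrade. Upgrading CUDA instead of downgrading TensorFlow would be the other way to bring the two in line.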