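I have been running Keras on Google Colab, and for the last day or so I have been getting the error below whenever I use the GPU. The same code works fine with TPU acceleration.
Looking at the logs, I see the following message. It looks to me like Colab is not loading the right version of the cuDNN library, but I don't think I have any control over that process. What should I do?
Thanks!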
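May 9, 2019, 9:21:08 PM WARNING 2019-05-09 14:21:08.027926: E tensorflow/stream_executor/cuda/cuda_dnn.cc:324] Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.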
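To make the rule in that log message concrete, here is a minimal sketch of the compatibility check as I read it: the major versions must match, and for cuDNN 7.0 or later the loaded minor version must be at least the one TensorFlow was compiled against. The function name `cudnn_runtime_compatible` is hypothetical and is not part of TensorFlow or cuDNN; it only restates the wording of the message.

```python
def cudnn_runtime_compatible(loaded: str, compiled: str) -> bool:
    """Hypothetical helper: apply the rule quoted in the log message.

    `loaded` is the cuDNN version found at runtime, `compiled` is the
    version the TensorFlow binary was built against, e.g. "7.3.1".
    """
    # Only major and minor matter for this rule; the patch level is ignored.
    loaded_major, loaded_minor = (int(p) for p in loaded.split(".")[:2])
    compiled_major, compiled_minor = (int(p) for p in compiled.split(".")[:2])

    # The major version always has to match.
    if loaded_major != compiled_major:
        return False

    # cuDNN 7.0 or later: a higher runtime minor version is acceptable.
    if compiled_major >= 7:
        return loaded_minor >= compiled_minor

    # Before cuDNN 7.0: major and minor must both match exactly.
    return loaded_minor == compiled_minor


# The versions from the log above: runtime 7.3.1, compiled against 7.4.2.
print(cudnn_runtime_compatible("7.3.1", "7.4.2"))  # False
```

With the versions from my log, the check fails because the runtime minor version (3) is lower than the compiled one (4), which matches the error I am seeing.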
---------------------------------------------------------------------------
UnknownError Traceback (most recent call last)
<ipython-input-14-fc8da73d140a> in <module>()
19 [x_train, x_train_pos_enc], y_train, validation_split=0.2,
20 batch_size=batch_size,
---> 21 epochs=10, callbacks=callbacks, verbose=1)
22 endtime = time.time()
23 print(f'This took {endtime-starttime:.2f} seconds')
4 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, max_queue_size, workers, use_multiprocessing, **kwargs)
878 initial_epoch=initial_epoch,
879 steps_per_epoch=steps_per_epoch,
--> 880 validation_steps=validation_steps)
881
882 def evaluate(self,
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py in model_iteration(model, inputs, targets, sample_weights, batch_size, epochs, verbose, callbacks, val_inputs, val_targets, val_sample_weights, shuffle, initial_epoch, steps_per_epoch, validation_steps, mode, validation_in_fit, **kwargs)
327
328 # Get outputs.
--> 329 batch_outs = f(ins_batch)
330 if not isinstance(batch_outs, list):
331 batch_outs = [batch_outs]
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py in __call__(self, inputs)
3074
3075 fetched = self._callable_fn(*array_vals,
-> 3076 run_metadata=self.run_metadata)
3077 self._call_fetch_callbacks(fetched[-len(self._fetches):])
3078 return nest.pack_sequence_as(self._outputs_structure,
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1437 ret = tf_session.TF_SessionRunCallable(
1438 self._session._session, self._handle, args, status,
-> 1439 run_metadata_ptr)
1440 if run_metadata:
1441 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
526 None, None,
527 compat.as_text(c_api.TF_Message(self.status.status)),
--> 528 c_api.TF_GetCode(self.status.status))
529 # Delete the underlying status object from memory otherwise it stays alive
530 # as there is a reference to status from this from the traceback due to
UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1d/conv1d/Conv2D}}]]
[[{{node loss/dense_1_loss/broadcast_weights/assert_broadcastable/is_valid_shape/has_valid_nonscalar_shape/has_invalid_dims/concat}}]]