Fail to find the dnn implementation on Ubuntu 18.04

Time: 2019-04-29 04:02:50

Tags: python ubuntu tensorflow cudnn

Tensorflow-gpu v1.13.1, CUDA: 10.0, CuDNN: 7.5.1, GPU: RTX 2080, Ubuntu: 18.04

I am trying to train an LSTM model in tf using CuDNNLSTM, but every time I run my code I get the following error:

2019-04-28 23:43:48.936154: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-04-28 23:43:48.936212: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at cudnn_rnn_ops.cc:1217 : Unknown: Fail to find the dnn implementation.
Traceback (most recent call last):
  File "/home/nicholas/PycharmProjects/deepLearninginKeras/crypto_currency_predict/crypto.py", line 139, in <module>
    callbacks=[tensorboard, checkpoint])
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 880, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 329, in model_iteration
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation.
     [[{{node cu_dnnlstm/CudnnRNN}}]]
     [[{{node ConstantFoldingCtrl/loss/dense_1_loss/broadcast_weights/assert_broadcastable/AssertGuard/Switch_0}}]]

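I'm not sure exactly what is causing this. I suspect part of the problem may be that the CUDA version I installed and use differs from the one reported for my graphics card. When I run "nvidia-smi" in a terminal, I get the following: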

NVIDIA-SMI 418.56    Driver Version: 418.56    CUDA Version: 10.1

At the bottom of my ~/.bashrc I have the following paths:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/$/cuda/extras/CUPTI/lib64
export CUDA_HOME=/usr/local/cuda
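One quick sanity check is whether every entry on LD_LIBRARY_PATH points at a real directory; the CUPTI entry above contains a "$/cuda/" segment that may not resolve to one. The sketch below is only a diagnostic aid, not a known fix for the cuDNN error. The helper name missing_library_dirs is made up for illustration.

```python
import os


def missing_library_dirs(path_value):
    """Return entries of a colon-separated path list that are not existing directories.

    Hypothetical helper for illustration; empty entries are ignored.
    """
    entries = [entry for entry in path_value.split(":") if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    # Inspect the value seen by the current process.
    for entry in missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", "")):
        print("not a directory:", entry)
```

Run it from the same shell, or the same PyCharm run configuration, that launches the training script, since the variable can differ between them.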

Any insight would be greatly appreciated. Here is a sample layer from my model:

model.add(tf.keras.layers.CuDNNLSTM(128, input_shape=train_x.shape[1:], return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.BatchNormalization())

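For example, would it be better to move to Ubuntu 16, or would that still not solve the problem? This seems to be a very common problem with RTX 20xx cards.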

1 answer:

Answer 0: (score: 0)

For me, adding the following configuration before the setup solved the problem:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of all at once
tf.Session(config=config)
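If you want Keras to use that session explicitly, one common approach in TensorFlow 1.x is to register it with the Keras backend before building the model. This is a minimal sketch assuming TensorFlow 1.13 as in the question. The function name make_growth_session is made up for illustration, and I have not verified it on an RTX 2080.

```python
def make_growth_session():
    """Create a TF 1.x session with allow_growth and register it with Keras.

    Hypothetical helper; assumes TensorFlow 1.x (tf.ConfigProto, tf.Session).
    """
    import tensorflow as tf  # imported lazily so the module loads without TF

    config = tf.ConfigProto()
    # Grow GPU memory usage as needed rather than reserving it all up front.
    config.gpu_options.allow_growth = True
    session = tf.Session(config=config)
    # Make tf.keras use this session for subsequent model building and training.
    tf.keras.backend.set_session(session)
    return session
```

Call make_growth_session() once at the top of the script, before the first model.add(...), so that the CuDNNLSTM layers are created under this session.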