Getting 'undefined symbol: cudnnCreate'

Asked: 2017-01-16 21:36:02

Tags: tensorflow

I am trying to run a very simple TensorFlow graph, but when I run the script I get the following output:

/usr/bin/python3.5 /media/Data/workspaces/python/tf_playground/play/cnn.py
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH: /opt/pycharm/pycharm-community-2016.3.2/bin:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:3459] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.683
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 222.31MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
F tensorflow/stream_executor/cuda/cuda_dnn.cc:221] Check failed: s.ok() could not find cudnnCreate in cudnn DSO; dlerror: /usr/local/lib/python3.5/dist-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cudnnCreate

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

Any idea what the problem is here?

The strange thing is that I can run the MNIST softmax example without any errors.

Here is the script that produces the error:

import json

import requests
import tensorflow as tf
import numpy as np


class MyCNN(object):

    def __init__(self, sequence_length, num_classes, embedding_size, filter_sizes):

        self.input_x = tf.placeholder(tf.float32, [sequence_length, embedding_size], name="input_x")
        self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
        self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

        input_X = tf.reshape(self.input_x, [1, sequence_length, embedding_size, 1])

        pooled_outputs = []
        num_filters = len(filter_sizes)

        for i, filter_size in enumerate(filter_sizes):

            filter_shape = [filter_size, embedding_size, 1, num_filters]

            F = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="F")
            b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")

            conv = tf.nn.conv2d(
                input_X,
                F,
                strides=[1, 1, 1, 1],
                padding="VALID",
                name="conv")

            # Apply nonlinearity
            h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")

            # Maxpooling over the outputs
            pooled = tf.nn.max_pool(
                h,
                ksize=[1, sequence_length - filter_size + 1, 1, 1],
                strides=[1, 1, 1, 1],
                padding='VALID',
                name="pool")

            pooled_outputs.append(pooled)

        self.h_pool = tf.concat(3, pooled_outputs)

if __name__ == "__main__":

    headers = {
        "Content-Type": "application/json"
    }
    request = requests.post("http://localhost:8080/ema-server/w2v/getWordVectors",
                            data=json.dumps(["I", "really", "love", "to", "eat", "a", "lot", "of", "sushi!"]),
                            headers=headers)

    words = json.loads(request.text)

    X = []
    for word in words:
        if word is None: X.append([0] * 300); continue
        X.append(word)

    while len(X) < 50: X.append([0] * 300)

    X = np.asmatrix(X)
    X = np.reshape(X, [1, 50, 300, 1])

    cnn = MyCNN(50, 2, 300, [3])

    sess = tf.Session()

    sess.run(tf.global_variables_initializer())
    sess.run(cnn.h_pool, feed_dict={cnn.input_x: X})

    print("All done.")

Update: I followed these instructions to install cuDNN, but I still get the same error.

5 Answers:

Answer 0 (score: 1)

I ran into this error in a Windows environment:

Check failed: s.ok() could not find cudnnCreate in cudnn DSO; dlerror: cudnnCreate not found

Assumptions: you have already downloaded the GPU-enabled build of TensorFlow and have installed NVIDIA's CUDA Toolkit from https://developer.nvidia.com/cuda-downloads

I solved the problem with the following steps.

  1. Download NVIDIA's cuDNN library from https://developer.nvidia.com/cudnn
  2. Create a free account and answer the questions asked during sign-up.
  3. You should then be able to download a zip file for each platform.
  4. Since I am on Windows 8.1, I downloaded the library built for Windows 10. The zip file contains three folders: bin, include, and lib.
  5. Extract the files into a folder.
  6. With the default installation, the CUDA Toolkit lives in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA
  7. Copy the files from step 5 into the location given in step 6: everything from bin goes into the CUDA bin folder, and so on for include and lib.
  8. These steps resolved the error shown above. I hope this helps someone.

Answer 1 (score: 1)

I solved the same problem by setting the export paths for CUDA and CUDA/lib.
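
As a minimal sketch, assuming the default install location of /usr/local/cuda (adjust the paths to your own setup), the exports in ~/.bashrc could look like this:

export CUDA_HOME=/usr/local/cuda                            # base CUDA install directory (assumption)
export PATH="$CUDA_HOME/bin:$PATH"                          # nvcc and other CUDA tools
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"  # libcublas, libcudnn, etc.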

Answer 2 (score: 0)

All of the answers above are correct: you need to set up your environment properly.

But if you have installed the wrong version of cuDNN (even if it is a newer one), you will still get this error. You must install the versions of CUDA and cuDNN that match your TensorFlow build. You can follow the instructions on tensorflow.org; there is no need to recompile TF, you only need to cp the cuDNN files into /usr/local/cuda/* and then try again (a rough sketch of those commands follows the link):

https://www.tensorflow.org/versions/r0.12/get_started/os_setup#optional_install_cuda_gpus_on_linux
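
The copy step from those instructions amounts to roughly the following (the archive name below is only an example and depends on the cuDNN version that matches your CUDA and TensorFlow builds):

tar xvzf cudnn-8.0-linux-x64-v5.1.tgz                                           # example archive name
sudo cp cuda/include/cudnn.h /usr/local/cuda/include                            # header
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64                              # libcudnn.so and friends
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*  # make them readable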


Answer 3 (score: 0)

I solved it by updating tensorflow-gpu to 1.4. I was seeing this error in a conda environment, and upgrading tf-gpu fixed it.
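
Inside the conda environment that comes down to something along these lines (the exact version pin is only an example):

pip install --upgrade tensorflow-gpu==1.4.0   # or just --upgrade tensorflow-gpu for the latest release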

Answer 4 (score: 0)

There are two ways to resolve this problem (I ran into it as well):
1) Either cuDNN is not installed correctly. If libcudnn.so cannot be found, check that it is present under /usr/local/cuda/lib64; once it is, things work fine. Also check your .bashrc to see whether LD_LIBRARY_PATH is set to that path, and set it if it is not (see the sketch below this list).
2) Or the problem is with the TensorFlow version (this is the most common case). Try updating TensorFlow following the instructions on the tensorflow website. I had the same problem and this turned out to be the cause.
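
A quick way to check both parts of point 1 (a sketch; the paths assume the default CUDA install location):

ls /usr/local/cuda/lib64/libcudnn*     # libcudnn.so should be listed here
echo $LD_LIBRARY_PATH                  # should contain /usr/local/cuda/lib64
# if it does not, add this line to ~/.bashrc and reload the shell:
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"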