在Ubuntu Bionics上找不到Tensorflow GPU / device:GPU:0

时间:2018-08-10 11:30:31

标签: tensorflow cuda nvidia

我有一个来自Amazon EC2和Ubuntu Bionics的p3.2xlarge模板。应该安装了GPU设备。但是,当我运行这段代码时,它说没有GPU。现在,这是一个虚拟的而非物理的机器,但是仍然应该有一个GPU。请注意,我是使用Docket启动TensorFlow的,因为GPU丢失了,这应该不起作用:

@DeleteMapping("/employee/{id}")

我收到此错误:

sudo docker run -it -p 8888:8888 tensorflow/tensorflow


sess.close()
with tf.device('/device:CPU:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

但是已经加载了NVIDIA驱动程序,如果没有设备,您将无法加载这些驱动程序:

InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]

还有

lspci -nn | grep '\[03'
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:1e.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] [10de:1db1] (rev a1)

1 个答案:

答案 0 :(得分:-1)

我将暂时搁置它。 CUDA目前尚不支持Ubuntu Bionics。无论如何,我还是尝试使用最新版本的CUDA并得到以下错误:

cat /var/log/nvidia-installer.log
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Mon Aug 13 10:23:44 2018
installer version: 396.37

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line:
    ./nvidia-installer
    --ui=none
    --no-questions
    --accept-license
    --disable-nouveau
    --dkms

Using built-in stream user interface
-> Detected 8 CPUs online; setting concurrency level to 8.
-> Installing NVIDIA driver version 396.37.
-> For some distributions, Nouveau can be disabled by adding a file in the modprobe configuration directory.  Would you like nvidia-installer to attempt to create this modprobe file for you? (Answer: Yes)
-> One or more modprobe configuration files to disable Nouveau have been written.  For some distributions, this may be sufficient to disable Nouveau; other distributions may require modification of the initial ramdisk.  Please reboot your system and attempt NVIDIA driver installation again.  Note if you later wish to reenable Nouveau, you will need to delete these files: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
-> Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. (Answer: Yes)
-> Installing both new and classic TLS OpenGL libraries.
-> Installing both new and classic TLS 32bit OpenGL libraries.
-> Install NVIDIA's 32-bit compatibility libraries? (Answer: Yes)
-> Will install GLVND GLX client libraries.
-> Will install GLVND EGL client libraries.
-> Skipping GLX non-GLVND file: "libGL.so.396.37"
-> Skipping GLX non-GLVND file: "libGL.so.1"
-> Skipping GLX non-GLVND file: "libGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.396.37"
-> Skipping EGL non-GLVND file: "libEGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.1"
-> Skipping GLX non-GLVND file: "./32/libGL.so.396.37"
-> Skipping GLX non-GLVND file: "libGL.so.1"
-> Skipping GLX non-GLVND file: "libGL.so"
-> Skipping EGL non-GLVND file: "./32/libEGL.so.396.37"
-> Skipping EGL non-GLVND file: "libEGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.1"
Looking for install checker script at ./libglvnd_install_checker/check-libglvnd-install.sh
   executing: '/bin/sh ./libglvnd_install_checker/check-libglvnd-install.sh'...
   Checking for libglvnd installation.
   Checking libGLdispatch...
   Checking libGLdispatch dispatch table
   Checking call through libGLdispatch
   All OK
   libGLdispatch is OK
   Checking for libGLX
   libGLX is OK
   Checking for libEGL
   eglGetDisplay failed
   Checking entrypoint library libOpenGL.so.0
   Checking call through libGLdispatch
   Checking call through library libOpenGL.so.0
   dlopen("libOpenGL.so.0") failed: libOpenGL.so.0: cannot open shared object file: No such file or directory
   Checking entrypoint library libGL.so.1
   Checking call through libGLdispatch
   Checking call through library libGL.so.1
   glGetString was not called

   Found libglvnd libraries: libGLX.so.0 libGLdispatch.so.0 
   Missing libglvnd libraries: libGL.so.1 libOpenGL.so.0 libEGL.so.1 

-> An incomplete installation of libglvnd was found. Do you want to install a full copy of libglvnd? This will overwrite any existing libglvnd libraries. (Answer: Abort installation.)
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.