Nvidia-Docker: run and exec don't see the same files?

Time: 2018-06-19 14:25:29

Tags: docker tensorflow nvidia-docker

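This has me completely baffled. I have been trying to run GPU-accelerated applications in Docker, but I keep hitting a missing libcuda.so.1 error. While troubleshooting, I found the following.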

sudo nvidia-docker run --rm nvidia/cuda:9.0-devel nvidia-smi

gives...

Tue Jun 19 14:21:16 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:41:00.0 Off |                  N/A |
|  0%   31C    P0    26W / 250W |      0MiB / 11169MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

...but if I build an image on top of that same base image...

FROM nvidia/cuda:9.0-devel

RUN apt-get update && apt-get install -y python3 python3-dev python3-pip python3-cffi libcairo2-dev python-cairo python3-tk

RUN pip3 install cairocffi editdistance numpy scipy matplotlib keras tensorflow-gpu

ENTRYPOINT ["tail", "-f", "/dev/null"]

and try to run nvidia-smi in the resulting container, the binary is not there.

sudo nvidia-docker exec 5961ce38b1ef nvidia-smi
OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown

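In this example I have checked that the container ID is correct. The same thing happens if I open a shell inside the container and run the command there.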
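For what it's worth, here is a minimal sketch of a check that could be run inside the container to confirm both symptoms at once: whether nvidia-smi is visible on $PATH and whether the dynamic loader can find libcuda.so.1. The function name check_gpu_userland and the script layout are hypothetical, and it only reports what is visible; it does not explain or fix anything. It avoids f-strings so that it also runs on older Python 3 versions.

```python
import ctypes
import shutil


def check_gpu_userland(binary="nvidia-smi", library="libcuda.so.1"):
    """Report whether the driver CLI tool and the CUDA driver library are visible."""
    report = {}
    # shutil.which does a $PATH lookup, similar to the one that failed under exec.
    report["binary_path"] = shutil.which(binary)
    try:
        # Ask the dynamic loader for the library; OSError means it was not found.
        ctypes.CDLL(library)
        report["library_loaded"] = True
    except OSError as exc:
        report["library_loaded"] = False
        report["library_error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in sorted(check_gpu_userland().items()):
        print("{}: {}".format(key, value))
```

Saved under an arbitrary name such as check_gpu.py (an assumption, not part of the image above), it can be run with python3 inside the container. Comparing its output in this container against a fresh container started from nvidia/cuda:9.0-devel should show whether the driver files are missing only in the former.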

How do I get tensorflow-gpu working in the container?

0 answers:

No answers