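I have an AWS g3.8xlarge instance running the Deep Learning AMI. For some reason, it is not detecting its GPUs properly.
This command shows that there are 2 GPUs: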
$ lspci | grep -i nvidia
00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)
00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)
However, it seems that no driver is loaded...
$ cat /proc/driver/nvidia/version
cat: /proc/driver/nvidia/version: No such file or directory
Also, running the deviceQuery sample program reports a failure:
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL
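
For reference, the driver checks above can be bundled into one small script. This is only a rough sketch: the function name check_nvidia_driver and its two optional path arguments are hypothetical and exist purely for illustration. It assumes a Linux system where loaded kernel modules are listed in /proc/modules, and it only reports whether the nvidia module and its version file are present; it does not attempt to fix anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA or AWS tooling).
# Usage: check_nvidia_driver [MODULES_FILE] [VERSION_FILE]
# Both arguments are illustrative overrides; the defaults are the
# standard Linux locations used in the commands above.
check_nvidia_driver() {
    modules_file="${1:-/proc/modules}"
    version_file="${2:-/proc/driver/nvidia/version}"

    # Each line of /proc/modules starts with the module name followed by a space.
    if grep -q '^nvidia ' "$modules_file" 2>/dev/null; then
        echo "nvidia kernel module: loaded"
    else
        echo "nvidia kernel module: not loaded"
    fi

    # The version file only exists while the NVIDIA kernel driver is loaded.
    if [ -r "$version_file" ]; then
        echo "driver version file: present"
    else
        echo "driver version file: missing"
    fi
}

# Run against the live system when called without arguments.
check_nvidia_driver "$@"
```

If both lines report the driver as absent while lspci still lists the cards, that would be consistent with the symptoms above: the hardware is visible on the PCI bus, but no kernel driver is attached to it.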