(There are no stupid questions, but plenty of curious idiots [me], so apologies if the answer is obvious.) I have a Mac and I want to run a TensorFlow program on a GPU. Sadly, my machine doesn't have the necessary GPU, so I'm hoping to use one from Amazon. I successfully created a p2.xlarge instance, but I don't know how to actually access that GPU from my program. How does the program know to use the Amazon GPU? Do I need to include some code? This is the code I currently have, written for local GPUs:
class DeviceCellWrapper(tf.contrib.rnn.GRUCell):
    def __init__(self, device, cell):
        self._cell = cell
        self._device = device

    @property
    def state_size(self):
        return self._cell.state_size

    @property
    def output_size(self):
        return self._cell.output_size

    def __call__(self, inputs, state, scope=None):
        with tf.device(self._device):
            return self._cell(inputs, state, scope)

devices = ["/gpu:0"]
So what should I replace "/gpu:0" with?

Answer 0 (score: 1)
You should use the command nvidia-smi to see which processes are running on the GPUs. You will see something like this:
$ nvidia-smi
Wed May 17 15:59:12 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.55 Driver Version: 367.55 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 0000:07:00.0 Off | N/A |
| 27% 33C P0 40W / 180W | 0MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 1080 Off | 0000:08:00.0 Off | N/A |
| 27% 36C P0 39W / 180W | 0MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 1080 Off | 0000:0B:00.0 Off | N/A |
| 27% 31C P0 40W / 180W | 0MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 1080 Off | 0000:0C:00.0 Off | N/A |
| 27% 32C P0 39W / 180W | 0MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 GeForce GTX 1080 Off | 0000:85:00.0 Off | N/A |
| 27% 33C P0 40W / 180W | 0MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 GeForce GTX 1080 Off | 0000:86:00.0 Off | N/A |
| 27% 31C P0 40W / 180W | 0MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 GeForce GTX 1080 Off | 0000:8D:00.0 Off | N/A |
| 27% 32C P0 39W / 180W | 0MiB / 8113MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
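If you want to do the same check programmatically rather than by eyeballing the table, nvidia-smi has a machine-readable query mode. Below is a small helper (my own sketch, not part of the original answer; the function name is hypothetical) that returns the GPU indices the driver reports, or an empty list on a machine without an NVIDIA driver (such as the Mac in the question):

```python
import shutil
import subprocess

def list_gpu_indices():
    """Return the GPU indices nvidia-smi reports, or [] if no driver is present."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver on this machine (e.g. a Mac)
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index", "--format=csv,noheader"],
        capture_output=True, text=True,
    ).stdout
    return [int(line) for line in out.split() if line.strip()]

print(list_gpu_indices())
```

On the p2.xlarge instance this should print [0]; on your Mac it prints [].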
You should set the environment variable CUDA_VISIBLE_DEVICES=[gpu_id[,gpu_id]]
to restrict a particular Python process to seeing only those GPUs (very useful if you want to run multiple applications on different GPUs).
If CUDA_VISIBLE_DEVICES
is not set, TensorFlow will consume memory on all GPUs. By default you will use a single GPU if one is available, so unless you are ready to move to distributed training (and since you are asking this question, you are not ready yet), you should use CUDA_VISIBLE_DEVICES.
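Concretely, you can set the variable in the shell before launching your script (e.g. `CUDA_VISIBLE_DEVICES=0 python your_script.py`), or from inside Python. A minimal sketch of the in-Python form:

```python
import os

# Must be set BEFORE tensorflow is imported: TensorFlow reads this variable
# when it initializes CUDA, so setting it afterwards has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only GPU 0 to this process

# import tensorflow as tf  # import only after the variable is set
```

On a p2.xlarge there is a single GPU, so "0" is the only meaningful value; on a multi-GPU box you could pass a comma-separated list such as "0,2".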
When you launch your TensorFlow script, you will see something like this, indicating that it loaded the GPU drivers correctly:
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
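You can also confirm from inside Python that TensorFlow actually sees the GPU by listing the local devices. This is a sketch assuming TensorFlow is installed on the instance (the helper name is my own); it degrades gracefully when TensorFlow is absent:

```python
import importlib.util

def local_device_names():
    """Return TensorFlow's local device names, or None if TensorFlow is absent."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    from tensorflow.python.client import device_lib
    return [d.name for d in device_lib.list_local_devices()]

names = local_device_names()
print(names)
```

On a correctly configured p2.xlarge you would expect the list to contain a GPU entry (reported as "/gpu:0" or "/device:GPU:0" depending on the TensorFlow version) alongside the CPU device.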