How do I run a TensorFlow program on an Amazon GPU?

Time: 2017-05-17 20:54:01

Tags: amazon-web-services amazon-ec2 tensorflow amazon

(There are no stupid questions, just plenty of curious idiots [me], so apologies if the answer is obvious.) I have a Mac and I want to run a TensorFlow program on a GPU. Sadly, my computer doesn't have the necessary GPU, so I'm hoping to use one from Amazon. I successfully created a p2.xlarge instance, but I don't know how to actually access that GPU from my program. How does the program know to use the Amazon GPU? Do I need to include some code? This is the code I currently have, written for local GPUs:

# Wraps a cell so that all of its ops are pinned to a specific device.
class DeviceCellWrapper(tf.contrib.rnn.GRUCell):
    def __init__(self, device, cell):
        self._cell = cell
        self._device = device

    @property
    def state_size(self):
        return self._cell.state_size

    @property
    def output_size(self):
        return self._cell.output_size

    def __call__(self, inputs, state, scope=None):
        with tf.device(self._device):
            return self._cell(inputs, state, scope)

devices = ["/gpu:0"]

So what should I replace "/gpu:0" with?

1 answer:

Answer 0 (score: 1)

You should use the command nvidia-smi to see which processes are running on the GPUs. You will see something like this:

$ nvidia-smi
Wed May 17 15:59:12 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.55                 Driver Version: 367.55                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:07:00.0     Off |                  N/A |
| 27%   33C    P0    40W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:08:00.0     Off |                  N/A |
| 27%   36C    P0    39W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    Off  | 0000:0B:00.0     Off |                  N/A |
| 27%   31C    P0    40W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 1080    Off  | 0000:0C:00.0     Off |                  N/A |
| 27%   32C    P0    39W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 1080    Off  | 0000:85:00.0     Off |                  N/A |
| 27%   33C    P0    40W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 1080    Off  | 0000:86:00.0     Off |                  N/A |
| 27%   31C    P0    40W / 180W |      0MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 1080    Off  | 0000:8D:00.0     Off |                  N/A |
| 27%   32C    P0    39W / 180W |      0MiB /  8113MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

You should set the environment variable CUDA_VISIBLE_DEVICES=[gpu_id[,gpu_id]] to restrict a particular Python process to seeing only those GPUs (very useful if you want to run several applications on different GPUs).
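
For example, a minimal sketch (the tensor names and values are just placeholders) that restricts a process to GPU 0 by setting the variable from within Python before TensorFlow is imported:

import os

# Make only GPU 0 visible to this process. This must be set before
# TensorFlow is imported, otherwise it has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf

# Inside the process the visible GPU is renumbered, so it is still "/gpu:0".
with tf.device("/gpu:0"):
    a = tf.constant([1.0, 2.0])
    b = tf.constant([3.0, 4.0])
    c = a + b

with tf.Session() as sess:
    print(sess.run(c))

Setting the variable on the command line (CUDA_VISIBLE_DEVICES=0 python your_script.py) works just as well.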

If CUDA_VISIBLE_DEVICES is not set, TensorFlow will consume memory on all GPUs. By default you will be using a single GPU if one is available, so unless you are ready to move to distributed training (and since you are asking this question, you are not ready yet), you should use CUDA_VISIBLE_DEVICES.
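
If you prefer to control the memory behaviour from inside the script instead, a sketch assuming the TensorFlow 1.x API that this answer dates from:

import tensorflow as tf

# allow_growth makes TensorFlow allocate GPU memory on demand instead of
# grabbing all of it on every visible GPU up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    # build and run your graph here
    pass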

When you launch your TensorFlow script, you will see something like this, indicating that it loaded the GPU drivers correctly:

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
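
To double-check that your ops actually land on the GPU, a small sketch (again assuming TensorFlow 1.x) using log_device_placement, which prints the device each operation was assigned to when the session runs:

import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0], name="a")
b = tf.constant([4.0, 5.0, 6.0], name="b")
c = a + b

# log_device_placement prints lines like "add: (Add): /job:localhost/.../gpu:0"
# for every op, so you can confirm the GPU is being used.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))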