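On GCE (Google Compute Engine), I wanted to check how the number of GPUs affects computation speed (in this case, for tf.nn.conv2d), so I ran the code below on a VM with 1 GPU and on a VM with 4 GPUs.
I expected 4 GPUs to be about 4 times faster than 1 GPU.
But comparing Output (1) and Output (2), it is nowhere near as fast as I expected.
Am I missing something?
Is there any way to speed up the computation by increasing the number of GPUs?
Code: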
# How to run:
# $ ipython3 this.py > stdout.txt
import numpy as np
import tensorflow as tf
# For using IPython %timeit command.
from IPython import get_ipython
ipython = get_ipython()
# Define 2d matrix width and height.
width, height = 1000, 1000
# Define 2d matrix like below.
#
# | | 1 1 1 ... 1 | |
# | | 1 . ..... 1 | |
# | | ........... | |
# | | 1 1 1 ... 1 | |
#
arr1_N = 1
arr1 = np.ones((arr1_N,height,width), dtype=np.float32)
arr1_on_channel = arr1.reshape((arr1_N, height, width, 1))
# Define many 2d matrices like below.
#
# | | 1 1 1 ... 1 | | 1 1 1 ... 1 | | 1 1 1 ... 1 | |
# | | 1 . ..... 1 | | 1 . ..... 1 | | 1 . ..... 1 | |
# | | ........... | | ........... | | ........... | |
# | | 1 1 1 ... 1 |, | 1 1 1 ... 1 | ,,, | 1 1 1 ... 1 | |
#
arr2_N = 100
arr2 = np.ones((arr2_N,height,width), dtype=np.float32)
arr2_on_channel = arr2.reshape((arr2_N, height, width, 1))
# Define filter matrix for convolution.
# Filter shape must be (filter_height, filter_width, in_channels, out_channels).
# See -> `https://www.tensorflow.org/api_docs/python/tf/nn/conv2d`.
fltr = np.ones(((3,3,1,1)))
# Define placeholder to take 2d matrices for input of convolution calculation.
inpt = tf.placeholder(tf.float32, shape=(None,height,width,1))
# Define operation for calculating convolution.
strides = (1,1,1,1)
convops = tf.nn.conv2d(inpt, fltr, strides, 'SAME')
# The graph is fully defined above.
# Now define a function that runs the convolution.
# Note: a new tf.Session is created on every call.
def calc_conv(arr):
    with tf.Session() as sess:
        out = sess.run(convops, feed_dict={inpt: arr})
    return out
# Run the calculation on arr1.
print("----(N, width, height)=(%d, %d, %d)----"%(arr1_N, width, height))
ipy_cmd = "timeit calc_conv(arr1_on_channel)"
print(ipy_cmd)
ipython.magic(ipy_cmd)
print()
# Run the calculation on arr2.
print("----(N, width, height)=(%d, %d, %d)----"%(arr2_N, width, height))
ipy_cmd = "timeit calc_conv(arr2_on_channel)"
print(ipy_cmd)
ipython.magic(ipy_cmd)
print()
I configured the VM to use 1 GPU and ran the script above. This is what was written to standard output.
Output (1), 1 GPU:
----(N, width, height)=(1, 1000, 1000)----
timeit calc_conv(arr1_on_channel)
5.19 ms ± 237 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
----(N, width, height)=(100, 1000, 1000)----
timeit calc_conv(arr2_on_channel)
162 ms ± 655 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
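Then I reconfigured the VM to use 4 GPUs and ran the script again. The result was not as fast as I expected. This is what was written to standard output.
Output (2), 4 GPUs: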
----(N, width, height)=(1, 1000, 1000)----
timeit calc_conv(arr1_on_channel)
8.74 ms ± 663 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
----(N, width, height)=(100, 1000, 1000)----
timeit calc_conv(arr2_on_channel)
154 ms ± 10 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
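To put numbers on the comparison, here is a small sketch that computes the speedup from the mean timings in Output (1) and Output (2). The dictionary layout and the `speedup` helper are just names I made up for this illustration; only the four timing values come from the runs above.

```python
# Mean timings in ms per loop, copied from Output (1) and Output (2).
timings_ms = {
    1: {"1_gpu": 5.19, "4_gpu": 8.74},
    100: {"1_gpu": 162.0, "4_gpu": 154.0},
}


def speedup(t_one_gpu, t_four_gpu):
    """Return how many times faster the 4-GPU run is than the 1-GPU run."""
    return t_one_gpu / t_four_gpu


for n, t in timings_ms.items():
    s = speedup(t["1_gpu"], t["4_gpu"])
    # A value below 1.0 means the 4-GPU run was slower.
    print("N=%d: speedup %.2fx (expected about 4x)" % (n, s))
```

So for N=100 the 4-GPU VM is only about 5% faster, and for N=1 it is actually slower than the 1-GPU VM.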
This is the VM image I used:
Intel® optimized Deep Learning Image: TensorFlow 1.12.0 m13 (with Intel® MKL-DNN/MKL and CUDA 10.0)
A Debian based image with TensorFlow (With CUDA 10.0 and Intel® MKL-DNN, Intel® MKL) plus Intel® optimized NumPy, SciPy, and scikit-learn.
This is the log output when I ran the script with 4 GPUs:
$ ipython3 convolve2d_tensorflow_simple.py > simple_out_1000x1000_p100x4.txt
2018-12-07 16:36:51.914542: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-07 16:36:53.349588: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-07 16:36:53.350163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:04.0
totalMemory: 15.90GiB freeMemory: 15.61GiB
2018-12-07 16:36:53.484136: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-07 16:36:53.484710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 1 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:05.0
totalMemory: 15.90GiB freeMemory: 15.61GiB
2018-12-07 16:36:53.629408: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-07 16:36:53.629992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 2 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:06.0
totalMemory: 15.90GiB freeMemory: 15.61GiB
2018-12-07 16:36:53.772568: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-07 16:36:53.773188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 3 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:07.0
totalMemory: 15.90GiB freeMemory: 15.61GiB
2018-12-07 16:36:53.776050: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:55.136658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:55.136721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:55.136729: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:55.136733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:55.136736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:55.136740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:55.137692: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:55.138322: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:55.138717: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:55.139065: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.327153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.327321: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.327338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.327343: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.327347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.327354: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.327362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.328156: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.328381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.328652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.328847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.336493: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.336621: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.336639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.336644: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.336647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.336662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.336671: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.337342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.337481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.337630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.337754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.344687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.344806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.344813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.344828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.344834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.344837: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.344844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.345529: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.345701: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.345870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.346015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.353019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.353118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.353125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.353129: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.353132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.353135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.353139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.353805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.353941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.354107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.354270: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.361134: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.361233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.361240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.361244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.361257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.361269: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.361285: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.361938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.362085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.362260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.362412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.370310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.370442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.370460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.370465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.370468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.370472: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.370488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.371162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.371300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.371455: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.371582: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.378548: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.378727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.378748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.378753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.378781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.378786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.378790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.379497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.379625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.380007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.380141: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:56.389387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:56.389497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:56.389504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:56.389516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:56.389520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:56.389524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:56.389528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:56.390242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:56.390410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:56.390623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:56.390743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:57.186091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:57.186276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:57.186291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:57.186297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:57.186301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:57.186305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:57.186319: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:57.187149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:57.187347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:57.187524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:57.187687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-12-07 16:36:57.338152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-07 16:36:57.338309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:57.338342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:57.338347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:57.338350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:57.338355: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:57.338358: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:57.339170: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:57.339390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:57.339552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:57.339744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
.
.
.
2018-12-07 16:36:58.117850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-07 16:36:58.117871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2018-12-07 16:36:58.117875: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N Y N N
2018-12-07 16:36:58.117879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: Y N N N
2018-12-07 16:36:58.117883: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N Y
2018-12-07 16:36:58.117892: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N Y N
2018-12-07 16:36:58.118642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
2018-12-07 16:36:58.118823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15129 MB memory) -> physical GPU (device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0)
2018-12-07 16:36:58.118983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 15129 MB memory) -> physical GPU (device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0)
2018-12-07 16:36:58.119115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 15129 MB memory) -> physical GPU (device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
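One thing I noticed in the log: the "Adding visible gpu devices" block repeats every few milliseconds. My guess (not confirmed) is that each repetition corresponds to one call of calc_conv, since that function opens a new tf.Session every time. The sketch below measures the gaps between seven of those timestamps, copied from the log above; the `gaps_ms` helper is a name I made up for this illustration.

```python
from datetime import datetime

# Timestamps of consecutive "Adding visible gpu devices" lines,
# copied from the 4-GPU log above.
stamps = [
    "2018-12-07 16:36:56.336493",
    "2018-12-07 16:36:56.344687",
    "2018-12-07 16:36:56.353019",
    "2018-12-07 16:36:56.361134",
    "2018-12-07 16:36:56.370310",
    "2018-12-07 16:36:56.378548",
    "2018-12-07 16:36:56.389387",
]


def gaps_ms(timestamps):
    """Return the gaps between consecutive timestamps, in milliseconds."""
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f") for t in timestamps]
    return [(b - a).total_seconds() * 1000.0 for a, b in zip(times, times[1:])]


gaps = gaps_ms(stamps)
mean_gap = sum(gaps) / len(gaps)
print("gaps (ms):", [round(g, 2) for g in gaps])
print("mean gap (ms): %.2f" % mean_gap)
```

The mean gap comes out at about 8.8 ms, which is close to the 8.74 ms per loop that timeit reported for N=1 on 4 GPUs. If my guess is right, part of what I am timing is session and device setup rather than the convolution itself.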
Thanks.