Am I using the GPU with TensorFlow C++?

Asked: 2018-06-25 19:54:50

Tags: c++ tensorflow gpu

I built TensorFlow for C++ with CUDA 9.0 and cuDNN 7.1 and loaded a Mask R-CNN model. The log output is as follows (the whole run takes 16 seconds!):

[Status] start loading model
2018-06-25 20:51:02.586593: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-06-25 20:51:02.587099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: GeForce GTX 880M major: 3 minor: 0 memoryClockRate(GHz): 0.993
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.46GiB
2018-06-25 20:51:02.587121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-06-25 20:51:02.861466: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 20:51:02.861494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-06-25 20:51:02.861500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-06-25 20:51:02.861717: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7216 MB memory) -> physical GPU (device: 0, name: GeForce GTX 880M, pci bus id: 0000:01:00.0, compute capability: 3.0)
[Status] load model sucess
2018-06-25 20:51:03.166828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-06-25 20:51:03.166864: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 20:51:03.166872: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-06-25 20:51:03.166880: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-06-25 20:51:03.167001: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7216 MB memory) -> physical GPU (device: 0, name: GeForce GTX 880M, pci bus id: 0000:01:00.0, compute capability: 3.0)
height:     636
width:      1024
channels:   3
2018-06-25 20:51:17.483748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-06-25 20:51:17.483799: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 20:51:17.483814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-06-25 20:51:17.483831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-06-25 20:51:17.483991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7216 MB memory) -> physical GPU (device: 0, name: GeForce GTX 880M, pci bus id: 0000:01:00.0, compute capability: 3.0)

==============================
detection_masks:0
==============================
Tensor<type: float shape: [6,636,1024] values: [[0 0 0]]...>

I am not sure why inference takes so long, so I suspect the GPU is not being used...

There is a warning:

successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

Is this the reason it takes so long?

0 answers:

No answers