我正在使用两张图形卡和4GB的GeForce gtx980,我计算我的神经网络总是从粘贴的shell输出的最后一行的0到99%和99%到0%(重复)跳跃。
大约90秒后,它进行了第一次计算。我将我的图像一个接一个地放入神经网络(for-loop)。以下计算仅需要20秒(3个时期),GPU跳跃在96%到100%之间。
为什么它在开始时跳跃?
我使用旗帜:
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
我可以确定它真的使用的不是比nvidia-smi -lms 50
显示的那么多的兆字节吗?
2017-08-10 16:33:24.836084: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 16:33:24.836100: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 16:33:25.052501: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-08-10 16:33:25.052861: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 980
major: 5 minor: 2 memoryClockRate (GHz) 1.2155
pciBusID 0000:03:00.0
Total memory: 3.94GiB
Free memory: 3.87GiB
2017-08-10 16:33:25.187760: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x8532640 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-08-10 16:33:25.188006: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-08-10 16:33:25.188291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 1 with properties:
name: GeForce GT 730
major: 3 minor: 5 memoryClockRate (GHz) 0.9015
pciBusID 0000:02:00.0
Total memory: 1.95GiB
Free memory: 1.45GiB
2017-08-10 16:33:25.188312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:832] Peer access not supported between device ordinals 0 and 1
2017-08-10 16:33:25.188319: I tensorflow/core/common_runtime/gpu/gpu_device.cc:832] Peer access not supported between device ordinals 1 and 0
2017-08-10 16:33:25.188329: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 1
2017-08-10 16:33:25.188335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y N
2017-08-10 16:33:25.188339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1: N Y
2017-08-10 16:33:25.188348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980, pci bus id: 0000:03:00.0)
Epoche: 0001 cost= 0.620101001 time= 115.366318226
Epoche: 0004 cost= 0.335480299 time= 19.4528050423