GPU memory is large enough, but a TensorFlow deep learning model still runs out of memory

Time: 2019-11-24 04:18:55

Tags: tensorflow gpu

  

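Our lab has just set up a GPU server with about 12 GPU slots. For cooling, we installed 6 TITAN RTX GPUs, one in every other slot.

Each TITAN RTX has 24 GB of memory, which should be large enough for training our TensorFlow deep learning model. The training run still runs out of memory, though.

Can anyone give me some help? Thanks in advance! The main part of the run log follows: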

nohup: ignoring input
/home/xingyg/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
WARNING:tensorflow:From /export/disk3/xyg/xinglib/function.py:14: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
/export/disk3/xyg/xinglib/function.py:89: RuntimeWarning: invalid value encountered in less_equal
  label = (label <= cutoff).astype(float)
WARNING:tensorflow:From /export/disk3/xyg/autoML/Web/TFweb.py:366: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From /home/xingyg/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /export/disk3/xyg/autoML/Web/TFweb.py:281: dropout (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dropout instead.
WARNING:tensorflow:From /home/xingyg/.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py:143: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /export/disk3/xyg/autoML/Web/TFweb.py:46: separable_conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.separable_conv2d instead.
WARNING:tensorflow:From /home/xingyg/.local/lib/python3.6/site-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/xingyg/.local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-11-24 11:36:16.653546: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-24 11:36:18.338942: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x558b8aeb8a70 executing computations on platform CUDA. Devices:
2019-11-24 11:36:18.339018: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): TITAN RTX, Compute Capability 7.5
2019-11-24 11:36:18.339036: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (1): TITAN RTX, Compute Capability 7.5
2019-11-24 11:36:18.339051: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (2): TITAN RTX, Compute Capability 7.5
2019-11-24 11:36:18.339064: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (3): TITAN RTX, Compute Capability 7.5
2019-11-24 11:36:18.339077: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (4): TITAN RTX, Compute Capability 7.5
2019-11-24 11:36:18.339090: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (5): TITAN RTX, Compute Capability 7.5
2019-11-24 11:36:18.365581: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2100015000 Hz
2019-11-24 11:36:18.368617: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x558b8afb9d60 executing computations on platform Host. Devices:
2019-11-24 11:36:18.368698: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-11-24 11:36:18.369120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:04:00.0
totalMemory: 23.65GiB freeMemory: 23.48GiB
2019-11-24 11:36:18.369174: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-11-24 11:36:18.385828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-24 11:36:18.385853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-11-24 11:36:18.385864: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-11-24 11:36:18.386000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22844 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:04:00.0, compute capability: 7.5)
WARNING:tensorflow:From /home/xingyg/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-11-24 11:40:02.350540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties: 
name: TITAN RTX major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:05:00.0
totalMemory: 23.65GiB freeMemory: 23.48GiB
2019-11-24 11:40:02.350693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-11-24 11:40:02.350802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-24 11:40:02.350813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1 
2019-11-24 11:40:02.350819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N N 
2019-11-24 11:40:02.350825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   N N 
2019-11-24 11:40:02.350923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22844 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:04:00.0, compute capability: 7.5)
2019-11-24 11:40:02.351154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22844 MB memory) -> physical GPU (device: 1, name: TITAN RTX, pci bus id: 0000:05:00.0, compute capability: 7.5)
2019-11-24 11:42:00.507478: W tensorflow/core/common_runtime/bfc_allocator.cc:267] Allocator (GPU_0_bfc) ran out of memory trying to allocate 120.88MiB.  Current allocation summary follows.
2019-11-24 11:42:00.507889: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (256):   Total Chunks: 657, Chunks in use: 654. 164.2KiB allocated for chunks. 163.5KiB in use in bin. 2.6KiB client-requested in use in bin.
2019-11-24 11:42:00.507921: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (512):   Total Chunks: 1619, Chunks in use: 1616. 813.0KiB allocated for chunks. 811.0KiB in use in bin. 631.2KiB client-requested in use in bin.
2019-11-24 11:42:00.507940: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (1024):  Total Chunks: 6, Chunks in use: 5. 7.2KiB allocated for chunks. 6.2KiB in use in bin. 5.7KiB client-requested in use in bin.
2019-11-24 11:42:00.507957: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (2048):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.507973: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (4096):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.507989: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (8192):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.508015: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (16384):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.508033: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (32768):         Total Chunks: 55, Chunks in use: 55. 3.06MiB allocated for chunks. 3.06MiB in use in bin. 3.05MiB client-requested in use in bin.
2019-11-24 11:42:00.508053: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (65536):         Total Chunks: 332, Chunks in use: 331. 40.61MiB allocated for chunks. 40.51MiB in use in bin. 40.48MiB client-requested in use in bin.
2019-11-24 11:42:00.508072: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (131072):        Total Chunks: 103, Chunks in use: 103. 20.65MiB allocated for chunks. 20.65MiB in use in bin. 20.40MiB client-requested in use in bin.
2019-11-24 11:42:00.508091: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (262144):        Total Chunks: 104, Chunks in use: 104. 28.53MiB allocated for chunks. 28.53MiB in use in bin. 27.79MiB client-requested in use in bin.
2019-11-24 11:42:00.508111: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (524288):        Total Chunks: 175, Chunks in use: 174. 122.76MiB allocated for chunks. 122.02MiB in use in bin. 121.99MiB client-requested in use in bin.
2019-11-24 11:42:00.508130: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (1048576):       Total Chunks: 29, Chunks in use: 29. 30.61MiB allocated for chunks. 30.61MiB in use in bin. 29.18MiB client-requested in use in bin.
2019-11-24 11:42:00.508147: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (2097152):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.508163: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (4194304):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.508179: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (8388608):       Total Chunks: 1, Chunks in use: 0. 8.17MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.508195: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (16777216):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-11-24 11:42:00.508212: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (33554432):      Total Chunks: 200, Chunks in use: 200. 7.93GiB allocated for chunks. 7.93GiB in use in bin. 7.88GiB client-requested in use in bin.
2019-11-24 11:42:00.508230: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (67108864):      Total Chunks: 103, Chunks in use: 100. 10.87GiB allocated for chunks. 10.63GiB in use in bin. 9.45GiB client-requested in use in bin.
2019-11-24 11:42:00.508248: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (134217728):     Total Chunks: 15, Chunks in use: 15. 2.93GiB allocated for chunks. 2.93GiB in use in bin. 2.25GiB client-requested in use in bin.
2019-11-24 11:42:00.508268: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (268435456):     Total Chunks: 1, Chunks in use: 1. 334.17MiB allocated for chunks. 334.17MiB in use in bin. 241.76MiB client-requested in use in bin.
2019-11-24 11:42:00.508286: I tensorflow/core/common_runtime/bfc_allocator.cc:613] Bin for 120.88MiB was 64.00MiB, Chunk State: 
2019-11-24 11:42:00.508317: I tensorflow/core/common_runtime/bfc_allocator.cc:619]   Size: 80.58MiB | Requested Size: 40.29MiB | in_use: 0, prev:   Size: 40.29MiB | Requested Size: 40.29MiB | in_use: 1, next:   Size: 120.88MiB | Requested Size: 120.88MiB | in_use: 1
2019-11-24 11:42:00.508343: I tensorflow/core/common_runtime/bfc_allocator.cc:619]   Size: 80.59MiB | Requested Size: 40.29MiB | in_use: 0, prev:   Size: 40.29MiB | Requested Size: 40.29MiB | in_use: 1, next:   Size: 44.04MiB | Requested Size: 40.29MiB | in_use: 1
2019-11-24 11:42:00.508373: I tensorflow/core/common_runtime/bfc_allocator.cc:619]   Size: 80.59MiB | Requested Size: 40.29MiB | in_use: 0, prev:   Size: 120.88MiB | Requested Size: 120.88MiB | in_use: 1, next:   Size: 40.29MiB | Requested Size: 40.29MiB | in_use: 1
2019-11-24 11:42:00.508392: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3cc000000 of size 126750208
2019-11-24 11:42:00.508406: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3d38e0e00 of size 126750208
2019-11-24 11:42:00.508418: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3db1c1c00 of size 126750208
2019-11-24 11:42:00.508431: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3e2aa2a00 of size 126750208
2019-11-24 11:42:00.508443: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3ea383800 of size 126750208
2019-11-24 11:42:00.508456: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3f1c64600 of size 42250240
2019-11-24 11:42:00.508469: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3f44af600 of size 42250240
2019-11-24 11:42:00.508481: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3f6cfa600 of size 42250240
2019-11-24 11:42:00.508493: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3f9545600 of size 42250240
2019-11-24 11:42:00.508506: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3fbd90600 of size 42250240
2019-11-24 11:42:00.508519: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7fd3fe5db600 of size 211251200
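
One way to read the allocation summary above is to add up the "in use in bin" figures and compare the total with the 22844 MB that TensorFlow reserved on GPU:0. The sketch below does that arithmetic for the four largest bins only; the helper name `bytes_in_use` is made up for illustration, the excerpt drops the timestamp prefix of each log line, and treating "22844 MB" as MiB is an assumption. The four bins alone come to roughly 21.8 GiB, which suggests the model really does fill almost the whole pool on a single GPU rather than the card being too small in some other sense.

```python
import re

# Bin summary lines for the four largest bins, copied from the log above
# (timestamp and source-file prefix dropped for brevity).
LOG_EXCERPT = """\
Bin (33554432): Total Chunks: 200, Chunks in use: 200. 7.93GiB allocated for chunks. 7.93GiB in use in bin. 7.88GiB client-requested in use in bin.
Bin (67108864): Total Chunks: 103, Chunks in use: 100. 10.87GiB allocated for chunks. 10.63GiB in use in bin. 9.45GiB client-requested in use in bin.
Bin (134217728): Total Chunks: 15, Chunks in use: 15. 2.93GiB allocated for chunks. 2.93GiB in use in bin. 2.25GiB client-requested in use in bin.
Bin (268435456): Total Chunks: 1, Chunks in use: 1. 334.17MiB allocated for chunks. 334.17MiB in use in bin. 241.76MiB client-requested in use in bin.
"""

UNIT_BYTES = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}

# Matches "<number><unit> in use in bin" but not the
# "<number><unit> client-requested in use in bin" figure that follows it.
IN_USE = re.compile(r"(\d+(?:\.\d+)?)(B|KiB|MiB|GiB) in use in bin")


def bytes_in_use(log_text):
    """Sum the 'in use in bin' sizes of all bin lines in log_text, in bytes."""
    return sum(float(value) * UNIT_BYTES[unit]
               for value, unit in IN_USE.findall(log_text))


if __name__ == "__main__":
    used = bytes_in_use(LOG_EXCERPT)
    # Assumption: the "22844 MB" reported for GPU:0 is in units of 2**20 bytes.
    limit = 22844 * 2**20
    print("in use (largest bins): %.2f GiB" % (used / 2**30))
    print("device limit:          %.2f GiB" % (limit / 2**30))
    print("fraction of limit:     %.1f%%" % (100.0 * used / limit))
```

Pasting the full log into `LOG_EXCERPT` works the same way, since the pattern ignores lines that are not bin summaries.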
  

0 answers:

No answers yet