CUDA out of memory with Keras and TensorFlow

Time: 2019-01-11 15:20:47

Tags: python tensorflow keras gpu

I am trying to train a classifier with the GoogLeNet Inception architecture, but I get a strange out-of-memory error. Please help.

2019-01-11 16:08:43.136845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2123 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 3GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-01-11 16:08:43.255541: W tensorflow/core/framework/allocator.cc:122] Allocation of 6281232384 exceeds 10% of system memory.
2019-01-11 16:08:51.609811: W tensorflow/core/framework/allocator.cc:122] Allocation of 6281232384 exceeds 10% of system memory.
2019-01-11 16:08:58.098275: W tensorflow/core/framework/allocator.cc:122] Allocation of 6281232384 exceeds 10% of system memory.
2019-01-11 16:09:02.965511: W tensorflow/core/framework/allocator.cc:122] Allocation of 6281232384 exceeds 10% of system memory.
2019-01-11 16:09:08.340921: W tensorflow/core/framework/allocator.cc:122] Allocation of 6281232384 exceeds 10% of system memory.
2019-01-11 16:10:00.857988: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 8589934592 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-01-11 16:10:00.881572: W .\tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 8589934592
2019-01-11 16:10:00.964002: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 7730940928 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-01-11 16:10:00.970572: W .\tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 7730940928
2019-01-11 16:10:01.038286: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 6957846528 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-01-11 16:10:01.044442: W .\tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 6957846528
2019-01-11 16:10:01.051304: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 8589934592 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-01-11 16:10:01.057950: W .\tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 8589934592
2019-01-11 16:10:11.065267: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 8589934592 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-01-11 16:10:11.101282: W .\tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 8589934592
2019-01-11 16:10:11.125436: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 8589934592 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-01-11 16:10:11.140062: W .\tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:44] could not allocate pinned host memory of size: 8589934592

TensorFlow's allocation summary follows.

2019-01-11 16:10:11.153245: W tensorflow/core/common_runtime/bfc_allocator.cc:267] Allocator (cuda_host_bfc) ran out of memory trying to allocate 5.85GiB.  Current allocation summary follows.
2019-01-11 16:10:11.163437: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (256):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.179841: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (512):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.198029: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (1024):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.211066: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (2048):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.223903: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (4096):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.233066: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (8192):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.242287: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (16384):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.251250: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (32768):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.267965: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (65536):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.277803: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (131072):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.286811: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (262144):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.298442: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (524288):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.308577: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (1048576):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.317588: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (2097152):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.326986: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (4194304):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.335589: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (8388608):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.348752: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (16777216):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.357372: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (33554432):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.366123: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (67108864):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.377628: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (134217728):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.387885: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (268435456):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-01-11 16:10:11.397278: I tensorflow/core/common_runtime/bfc_allocator.cc:613] Bin for 5.85GiB was 256.00MiB, Chunk State:

I am using an NVIDIA GeForce GTX 1060 3GB (the model the device log reports) with 8 GB of RAM. More information about the data follows.

Training Images Shape :- (16016, 224, 224, 3)      
Training Images Labels :- (16016, 163)
Testing Images Shape :- (14939, 224, 224, 3)
Testing Images Labels :- (14939, 163)
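
To put these shapes in perspective, here is a rough estimate of how much memory the arrays need. It is only a sketch: the question does not state the dtype, so the 4 bytes per value (float32) and 1 byte per value (uint8) figures are assumptions, and `array_bytes` is a helper introduced for illustration.

```python
# Rough memory estimate for the arrays listed above.
# Assumption: dense arrays stored as float32 (4 bytes) or uint8 (1 byte);
# the question does not say which dtype is actually used.

GIB = 1024 ** 3


def array_bytes(shape, bytes_per_value):
    """Return the number of bytes a dense array of this shape occupies."""
    total = bytes_per_value
    for dim in shape:
        total *= dim
    return total


train_shape = (16016, 224, 224, 3)
test_shape = (14939, 224, 224, 3)

train_f32 = array_bytes(train_shape, 4)  # training images as float32
test_f32 = array_bytes(test_shape, 4)    # testing images as float32
train_u8 = array_bytes(train_shape, 1)   # training images as uint8

for name, size in [("train float32", train_f32),
                   ("test float32", test_f32),
                   ("train uint8", train_u8)]:
    print("%-14s %12d bytes  %.2f GiB" % (name, size, size / GIB))
```

Under the float32 assumption the training images alone come to 9,643,425,792 bytes (about 9 GiB), which is more than 8 GB of RAM and far more than the 3 GB on the GPU. Stored as uint8, the same array needs a quarter of that.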

1 answer:

Answer 0: (score: 0)

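Your GPU has most likely run out of memory. In essence, your data is larger than the available memory can hold. Try reducing the batch size and see whether that helps.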
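
One way to act on this advice is to feed the model small batches from a generator, so that only one batch at a time is converted to float32. The log lines in the question report failures "on host" from the `cuda_host_bfc` allocator, so host RAM may be as much of a constraint as GPU memory. The sketch below assumes the images can be kept as a uint8 NumPy array (or an `np.memmap`); `batch_generator` and its parameters are hypothetical names, not part of Keras.

```python
import numpy as np


def batch_generator(images, labels, batch_size, shuffle=True, seed=0):
    """Yield (image_batch, label_batch) pairs forever.

    images and labels may be in-memory arrays or np.memmap objects.
    Only the current batch is cast to float32, so the full dataset
    never has to exist in memory as float32.
    """
    n = len(images)
    rng = np.random.RandomState(seed)
    while True:  # Keras-style generators are expected to loop indefinitely
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            # Sorted indices keep reads sequential, which helps with memmaps.
            idx = np.sort(order[start:start + batch_size])
            x = np.asarray(images[idx], dtype=np.float32)
            y = np.asarray(labels[idx])
            yield x, y


if __name__ == "__main__":
    # Tiny stand-in data with the same per-sample shapes as in the question.
    demo_images = np.zeros((10, 224, 224, 3), dtype=np.uint8)
    demo_labels = np.zeros((10, 163), dtype=np.float32)
    gen = batch_generator(demo_images, demo_labels, batch_size=4)
    x, y = next(gen)
    print(x.shape, x.dtype, y.shape)
```

With the Keras 2.x releases current when the question was asked, a generator like this could be passed to `model.fit_generator(gen, steps_per_epoch=..., epochs=...)`, where `model` stands for the compiled network and `steps_per_epoch` is the number of samples divided by the batch size, rounded up. Start with a small batch size such as 16 or 32 and raise it only while training still fits in memory.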