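I'm running a CNN on a rig with a GTX 970 with 4 GB of VRAM. However, when my code reaches tf.initialize_all_variables(), it says it cannot allocate enough memory. These are the exact lines: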
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 625.0KiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:899] Internal: Dst tensor is not initialized.
E tensorflow/core/common_runtime/executor.cc:334] Executor failed to create kernel. Internal: Dst tensor is not initialized.
[[Node: zeros_30 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [160000] values: 0 0 0...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
As you can see, it says it cannot allocate 625.0 KiB, which the 970's 4 GB should be able to handle.
In case it helps, here is the full log:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.3165
pciBusID 0000:01:00.0
Total memory: 3.94GiB
Free memory: 3.52GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (256): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (512): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1024): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (2048): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (4096): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (8192): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (16384): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (32768): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (65536): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (131072): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (262144): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (524288): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1048576): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (2097152): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (4194304): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (8388608): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (16777216): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (33554432): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (67108864): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (134217728): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (268435456): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:656] Bin for 625.0KiB was 512.0KiB, Chunk State:
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40000 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40100 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40200 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40500 of size 8192
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e42500 of size 16384
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e46500 of size 640000
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2900 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2a00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2b00 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2f00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee3000 of size 51200
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705eef800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705eef900 of size 73728
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f01900 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f01a00 of size 73728
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f13a00 of size 18432
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f18200 of size 15884288
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x706e3e200 of size 8192
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x706e40200 of size 33554432
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x708e40200 of size 16384
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x708e44200 of size 3410411008
I tensorflow/core/common_runtime/bfc_allocator.cc:689] Summary of in-use Chunks by size:
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 10 Chunks of size 256 totalling 2.5KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 1024 totalling 1.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 8192 totalling 16.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 16384 totalling 32.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 18432 totalling 18.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 51200 totalling 50.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 73728 totalling 144.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 640000 totalling 625.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 15884288 totalling 15.15MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 33554432 totalling 32.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 3410411008 totalling 3.18GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] Sum Total of in-use chunks: 3.22GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats:
Limit: 3460759552
InUse: 3460759552
MaxInUse: 3460759552
NumAllocs: 23
MaxAllocSize: 3410411008
W tensorflow/core/common_runtime/bfc_allocator.cc:270] ******************************************************************************xxxxxxxxxxxxxxxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 625.0KiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:899] Internal: Dst tensor is not initialized.
E tensorflow/core/common_runtime/executor.cc:334] Executor failed to create kernel. Internal: Dst tensor is not initialized.
[[Node: zeros_30 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [160000] values: 0 0 0...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
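I've been trying to solve this for a while, and I have already reduced the size of my CNN for my own peace of mind.
Also, I am running my monitors off the 970... do I need to plug them into the motherboard instead to make full use of the 970?
Thanks!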
Answer 0 (score: 0)
From your log:
Limit: 3460759552
InUse: 3460759552
MaxInUse: 3460759552
NumAllocs: 23
MaxAllocSize: 3410411008
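You have maxed out your GPU's memory: the allocator's limit of 3460759552 bytes (about 3.22 GiB) is entirely in use, and a single allocation of 3410411008 bytes (about 3.18 GiB) accounts for almost all of it. Your model is too large to be processed on that device.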
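To make the arithmetic behind those numbers concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The constants are copied from the log; the helper names `tensor_bytes` and `to_gib` are made up for illustration. Treating the largest chunk as float32 data is only an assumption, since the log does not say what that chunk holds.

```python
# Allocator statistics copied from the log in the question.
LIMIT_BYTES = 3460759552
IN_USE_BYTES = 3460759552
MAX_ALLOC_BYTES = 3410411008

FLOAT32_BYTES = 4  # DT_FLOAT is a 32-bit float


def tensor_bytes(shape, itemsize=FLOAT32_BYTES):
    """Return the bytes needed by a dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * itemsize


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# The failing node zeros_30 has shape [160000] and dtype DT_FLOAT.
failed_request = tensor_bytes([160000])
print(failed_request, failed_request / 1024)  # 640000 bytes = 625.0 KiB

# Nothing is left under the allocator's limit, so even 625 KiB fails.
free_bytes = LIMIT_BYTES - IN_USE_BYTES
print(free_bytes)  # 0

# Share of the limit taken by the single largest allocation.
largest_share = MAX_ALLOC_BYTES / LIMIT_BYTES
print(round(to_gib(MAX_ALLOC_BYTES), 2), round(largest_share, 3))

# ASSUMPTION: if that chunk held float32 values, this is how many
# elements it would contain. The log does not confirm the dtype.
float32_elements = MAX_ALLOC_BYTES // FLOAT32_BYTES
print(float32_elements)
```

So the 625.0 KiB request is not what is too big. It is simply the first request made after one very large allocation had already filled the limit. Multiplying out the shapes of your variables and input batches with `tensor_bytes` should show which one comes close to 3.18 GiB.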