
Time: 2016-12-22 08:42:51

Tags: tensorflow




net_output = create_graph()

sess1 = tf.Session()

batch_size = 64
sess1.run(net_output, {net_input_img: np.random.rand(batch_size, 256, 256, 3)})


W tensorflow/core/common_runtime/bfc_allocator.cc:217] Ran out of memory trying to allocate 1.51GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

Although batch_size 64 seems to be too large (see the warning above), sess1 still appears to run with it.


# assume 5 networks have the same structure
sess2 = tf.Session()
sess3 = tf.Session()
sess4 = tf.Session()
sess5 = tf.Session()


batch_size = 64
sess1.run(net_output, {net_input_img: np.random.rand(batch_size, 256, 256, 3)})


ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape...



sess1.run(net_output, {net_input_img: np.random.rand(batch_size, 256, 256, 3)})



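  • Does Session.close release GPU memory? Why can sess1.run no longer run a forward pass after sess[2-5] have been started and closed?

  • Is there a better way to deploy multiple networks on a server with a GPU?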


gpu_options = tf.ConfigProto(gpu_options=tf.GPUOptions(
sess = tf.Session(config=gpu_options)
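
The call above is cut off in the source, so the options the author actually passed are unknown. As a minimal sketch, assuming the intent was to cap how much GPU memory TensorFlow reserves, the TF 1.x options `per_process_gpu_memory_fraction` and `allow_growth` could be set as below. The helper names `memory_fraction_for` and `make_session`, the even split, and the 10% headroom are illustrative assumptions and do not come from the original post.

```python
def memory_fraction_for(num_processes, headroom=0.1):
    """Split the GPU evenly across processes, leaving some headroom free.

    Hypothetical helper: the even split and the default headroom are assumptions.
    """
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    return (1.0 - headroom) / num_processes


def make_session(fraction):
    """Create a TF 1.x session that reserves at most `fraction` of GPU memory."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 1.x

    options = tf.GPUOptions(
        per_process_gpu_memory_fraction=fraction,  # upper bound on reserved memory
        allow_growth=True,  # grow allocations on demand instead of up front
    )
    return tf.Session(config=tf.ConfigProto(gpu_options=options))
```

As its name says, the fraction applies per process, so a setup like this is most meaningful when each network runs in its own process, for example `make_session(memory_fraction_for(5))` in each of five worker processes.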




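import tensorflow as tf
import tensorflow.contrib.slim as slim

def create_graph(x):
    dummy = tf.zeros([100, 100, 100, 5])  # unused in the source as shown
    # Three stacked 5x5 conv layers with 87 output channels each.
    return slim.repeat(x, 3, slim.conv2d, 87, [5, 5])

batch_size = 64
x = tf.zeros([batch_size, 256, 256, 3])
output = create_graph(x)

sess1 = tf.Session()
sess1.run(tf.global_variables_initializer())  # reconstructed: the conv weights need initializing

num_other_sessions = 50
other_sessions = []
for _ in range(num_other_sessions):
    sess = tf.Session()
    # The next two lines are missing from the source and are reconstructed from context.
    sess.run(tf.global_variables_initializer())
    other_sessions.append(sess)

try:  # reconstructed: the source keeps only the except line
    sess1.run(output)
except Exception as e:
    print(e)

for sess in other_sessions:
    sess.close()  # reconstructed: the loop body is missing in the source

# If I run the following two lines, the bottom sess1.run(output) could be run without error.
# del sess
# del other_sessions

try:  # reconstructed, as above
    sess1.run(output)
except Exception as e:
    print(e)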


W tensorflow/core/common_runtime/bfc_allocator.cc:217] Ran out of memory trying to allocate 124.62MiB.



I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 980 Ti
major: 5 minor: 2 memoryClockRate (GHz) 1.228
pciBusID 0000:03:00.0
Total memory: 5.93GiB
Free memory: 5.84GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:03:00.0)
W tensorflow/core/common_runtime/bfc_allocator.cc:217] Ran out of memory trying to allocate 124.62MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:03:00.0)


I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (256):   Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (512):   Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1024):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (2048):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (4096):  Total Chunks: 1, Chunks in use: 0 7.0KiB allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (8192):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (16384):     Total Chunks: 1, Chunks in use: 0 25.5KiB allocated for chunks. 25.5KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (32768):     Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (65536):     Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (131072):    Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (262144):    Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (524288):    Total Chunks: 1, Chunks in use: 0 586.2KiB allocated for chunks. 384.0KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1048576):   Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (2097152):   Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (4194304):   Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (8388608):   Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (16777216):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (33554432):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (67108864):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (134217728):     Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (268435456):     Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:660] Bin for 739.2KiB was 512.0KiB, Chunk State: 
I tensorflow/core/common_runtime/bfc_allocator.cc:666]   Size: 586.2KiB | Requested Size: 384.0KiB | in_use: 0, prev:   Size: 25.5KiB | Requested Size: 25.5KiB | in_use: 1, next:   Size: 739.2KiB | Requested Size: 739.2KiB | in_use: 1
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780000 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780600 of size 512
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780900 of size 512
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780b00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780c00 of size 512
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780e00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309780f00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309781000 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309781500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309781600 of size 512
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1309781800 of size 256


I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x130eb8c100 of size 7168
I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1310cf5a00 of size 26112
I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1310d0f200 of size 600320
I tensorflow/core/common_runtime/bfc_allocator.cc:693]      Summary of in-use Chunks by size: 
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 255 Chunks of size 256 totalling 63.8KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 303 Chunks of size 512 totalling 151.5KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 3 Chunks of size 768 totalling 2.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 48 Chunks of size 1280 totalling 60.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 3 Chunks of size 1536 totalling 4.5KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 49 Chunks of size 26112 totalling 1.22MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 49920 totalling 48.8KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 50944 totalling 49.8KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 98 Chunks of size 756992 totalling 70.75MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 783104 totalling 2.99MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 50331648 totalling 48.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 1459617792 totalling 2.72GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 2904102656 totalling 2.70GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:700] Sum Total of in-use chunks: 5.54GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:702] Stats: 
Limit:                  5953290240
InUse:                  5952656640
MaxInUse:               5953264128
NumAllocs:                    1259
MaxAllocSize:           2904102656
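
The "Summary of in-use Chunks by size" lines can be cross-checked against the `InUse` figure in the Stats block. The sketch below copies those summary lines (log prefixes dropped) and multiplies each chunk count by its size; the names `SUMMARY`, `IN_USE`, and `total_in_use` are illustrative.

```python
import re

# Chunk summary copied from the allocator log above, without the log prefixes.
SUMMARY = """\
255 Chunks of size 256 totalling 63.8KiB
303 Chunks of size 512 totalling 151.5KiB
3 Chunks of size 768 totalling 2.2KiB
48 Chunks of size 1280 totalling 60.0KiB
3 Chunks of size 1536 totalling 4.5KiB
49 Chunks of size 26112 totalling 1.22MiB
1 Chunks of size 49920 totalling 48.8KiB
1 Chunks of size 50944 totalling 49.8KiB
98 Chunks of size 756992 totalling 70.75MiB
4 Chunks of size 783104 totalling 2.99MiB
1 Chunks of size 50331648 totalling 48.00MiB
2 Chunks of size 1459617792 totalling 2.72GiB
1 Chunks of size 2904102656 totalling 2.70GiB
"""

IN_USE = 5952656640  # "InUse" value from the Stats block

CHUNK_RE = re.compile(r"(\d+) Chunks of size (\d+)")


def total_in_use(summary):
    """Sum count * size over every 'N Chunks of size S' line."""
    return sum(int(count) * int(size) for count, size in CHUNK_RE.findall(summary))


total = total_in_use(SUMMARY)
print(total == IN_USE)           # do the chunk lines account for InUse?
print(round(total / 2**30, 2))   # total in GiB
```

The products add up to the `InUse` value, so the listed chunks account for everything the allocator holds, and `InUse` sits only about 0.6 MiB below `Limit`.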

W tensorflow/core/common_runtime/bfc_allocator.cc:274] ****************************************************************************xxxxxxxxxxxxxxxxxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 739.2KiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:975] Resource exhausted: OOM when allocating tensor with shape[87,87,5,5]
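
Several chunk sizes in the log can be related back to the repro script with some arithmetic. This is a rough sketch, assuming float32 tensors, 'SAME' padding with stride 1 (the `slim.conv2d` defaults, so the 256x256 spatial size is preserved), and allocations rounded up to multiples of 256 bytes. The helper names `tensor_bytes` and `round_up` are illustrative.

```python
def tensor_bytes(shape, dtype_size=4):
    """Raw size of a dense tensor, assuming float32 (4 bytes per element)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * dtype_size


def round_up(nbytes, multiple=256):
    """Round up to a multiple of 256 bytes (assumed allocation granularity)."""
    return -(-nbytes // multiple) * multiple


batch = 64
input_bytes = tensor_bytes([batch, 256, 256, 3])         # the input x
activation_bytes = tensor_bytes([batch, 256, 256, 87])   # one conv2d output
first_kernel = round_up(tensor_bytes([5, 5, 3, 87]))     # first layer weights
later_kernel = round_up(tensor_bytes([5, 5, 87, 87]))    # second and third layer weights

print(input_bytes, activation_bytes, first_kernel, later_kernel)
```

Under these assumptions the results line up with the 50331648, 1459617792, 26112, and 756992 byte chunks in the summary, and the unrounded size of the later kernels (756900 bytes, about 739.2KiB) matches the failed allocation for shape[87,87,5,5]. The 49 and 98 kernel-sized chunks suggest that weights belonging to many sessions are resident in the same allocator at once.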

1 answer:

Answer 0 (score: 1)

