The following question is not about how to configure the fraction of GPU memory used.
CPU:
FixedLengthRecordReaderV2 allocation_description { requested_bytes: 64 allocated_bytes: 64 allocator_name: "cpu" allocation_id: 107996 }
GPU:
Reshape/shape tensor { dtype: DT_INT32 shape { dim { size: 1 } } allocation_description { requested_bytes: 4 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 329 ptr: 1112161657600 } }
unknown tensor { dtype: DT_UINT8 shape { dim { size: 3073 } } allocation_description { requested_bytes: 3073 allocated_bytes: 3328 allocator_name: "gpu_bfc" allocation_id: 152161 has_single_reference: true ptr: 1108327235584 } }
Reshape/shape tensor { dtype: DT_INT32 shape { dim { size: 1 } } allocation_description { requested_bytes: 4 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 329 ptr: 1112161657600 } }
DecodeRaw tensor { dtype: DT_UINT8 shape { dim { size: 3073 } } allocation_description { requested_bytes: 3073 allocated_bytes: 4864 allocator_name: "cuda_host_bfc" allocation_id: 35574 has_single_reference: true ptr: 1112190177280 } }
transpose/perm tensor { dtype: DT_INT32 shape { dim { size: 3 } } allocation_description { requested_bytes: 12 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 331 ptr: 1112161658112 } }
stack tensor { dtype: DT_INT32 shape { dim { size: 3 } } allocation_description { requested_bytes: 12 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 332 ptr: 1112161658368 } }
stack tensor { dtype: DT_INT32 shape { dim { size: 3 } } allocation_description { requested_bytes: 12 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 332 ptr: 1112161658368 } }
stack tensor { dtype: DT_INT32 shape { dim { size: 3 } } allocation_description { requested_bytes: 12 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 332 ptr: 1112161658368 } }
1. Why does TensorFlow allocate more memory on the GPU than was requested?
2. Is there any function to determine the amount of memory actually allocated?
For the first question, my guess is that over-allocating reduces the frequency of allocations. But I don't understand why this mechanism is used for GPU memory while the CPU memory allocator does without it.
I'm more interested in the second question.
Does anyone know the answer? Any information would help.
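For question 2, one workaround is to total the `allocation_description` entries from the log output itself. The sketch below is illustrative only (the regex and helper are not a TensorFlow API); it assumes the log lines have the `requested_bytes ... allocated_bytes ... allocator_name` layout shown above:

```python
import re

# Tally requested vs. allocated bytes per allocator from TensorFlow's
# allocation_description log lines.
PATTERN = re.compile(
    r'requested_bytes:\s*(\d+)\s+allocated_bytes:\s*(\d+)\s+'
    r'allocator_name:\s*"([^"]+)"'
)

def tally(log_text):
    """Return {allocator_name: (total_requested, total_allocated)}."""
    totals = {}
    for requested, allocated, name in PATTERN.findall(log_text):
        req, alloc = totals.get(name, (0, 0))
        totals[name] = (req + int(requested), alloc + int(allocated))
    return totals

sample = ('allocation_description { requested_bytes: 4 allocated_bytes: 256 '
          'allocator_name: "cuda_host_bfc" allocation_id: 329 }')
print(tally(sample))  # {'cuda_host_bfc': (4, 256)}
```

At runtime, TF 1.x also exposed memory-usage ops under `tf.contrib.memory_stats` (e.g. `BytesInUse`), which report the allocator's live byte count rather than per-tensor figures.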
Answer 0 (score: 1):
This is probably due to memory alignment: you cannot get a block smaller than 256 bytes, and larger requests are rounded up to a multiple of 256 bytes. (That does not explain `requested_bytes: 3073 allocated_bytes: 4864`, though.)
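The rounding the answer describes can be sketched as follows. This is a minimal model, assuming a fixed 256-byte alignment as seen in the log above; it reproduces the `4 → 256` and `3073 → 3328` cases but, as the answer notes, not the `3073 → 4864` one (TF's BFC allocator also groups chunks into power-of-two-ish size bins, which may account for that gap):

```python
ALIGNMENT = 256  # minimum granularity observed in the allocation log

def aligned_size(requested):
    # Round the request up to the next multiple of ALIGNMENT.
    return ((requested + ALIGNMENT - 1) // ALIGNMENT) * ALIGNMENT

print(aligned_size(4))     # 256  (matches Reshape/shape above)
print(aligned_size(12))    # 256  (matches stack above)
print(aligned_size(3073))  # 3328 (matches the gpu_bfc tensor above)
```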