Finding the computation bottleneck

Time: 2017-02-17 15:30:22

Tags: python tensorflow

I have a ResNet-based network that is really slow: it takes 2 minutes to process a single image.

Here is what I have tried so far to find the bottleneck.

I tried to use timeline to find the computation bottleneck, but chrome://tracing does not display anything. (The only change is ^_^ => filename.)

The timeline file does contain data; its size is 885.9MB.

    { # for example
        "name": "map/while/LoopCond",
        "pid": 1,
        "ts": 1487341119596183,
        "cat": "DataFlow",
        "tid": 0,
        "ph": "t",
        "id": 31618
    },
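
Since a file of this size may simply be too large for chrome://tracing to display, one option is to summarize it offline. The following is a minimal sketch, not a TensorFlow API: it assumes the file is in the standard Chrome trace JSON format (either an object with a top-level "traceEvents" list or a bare list of events) and that op executions are recorded as complete events ("ph": "X") with a "dur" field in microseconds. The function names summarize_trace and load_events are hypothetical.

```python
import json
from collections import defaultdict


def summarize_trace(events, top=10):
    """Return up to `top` (name, total_us, count) tuples, largest total duration first."""
    totals = defaultdict(lambda: [0, 0])  # name -> [total duration in us, event count]
    for event in events:
        # Only complete events ("ph": "X") carry a duration. Flow events such as
        # the "ph": "t" DataFlow record shown above have none and are skipped.
        if event.get("ph") != "X":
            continue
        # Assumption: use the node name from args["name"] when present,
        # otherwise fall back to the event's own "name".
        args = event.get("args") or {}
        name = args.get("name") or event.get("name", "?")
        entry = totals[name]
        entry[0] += event.get("dur", 0)
        entry[1] += 1
    ranked = sorted(
        ((name, total, count) for name, (total, count) in totals.items()),
        key=lambda row: row[1],
        reverse=True,
    )
    return ranked[:top]


def load_events(path):
    """Load a Chrome trace file, accepting both the object form and the bare-list form."""
    with open(path) as f:
        data = json.load(f)  # reads the whole file into memory at once
    return data["traceEvents"] if isinstance(data, dict) else data
```

Calling summarize_trace(load_events("timeline.json")) (the path is a placeholder) would list the names with the largest accumulated duration. Note that json.load needs considerably more RAM than the file size, so an 885.9MB trace may require several GB.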

When I save the timeline file, the console prints the following message. Does it matter?

tcmalloc: large alloc 1233903616 bytes == 0x7fa896f76000 @  0x7faad02b78bf 0x7faacff0b175 0x7faacff0b490 0x7faacff36328 0x7faacfed2dc3 0x7faacff846c7 0x7faacff86323 0x7faacff871ce 0x7faacff861f6 0x7faacff871ce 0x7faacff861f6 0x7faacff871ce 0x7faacff861f6 0x7faacff86323 0x7faacff871ce 0x7faacff861f6 0x7faacff871ce 0x7faacff872e2 0x7faacffa7960 0x7faacffa7b3f 0x7faacffbd484 0x7faacf1b3830 0x400649
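By the way, does this kind of message slow the process down? It is printed on every iteration, whereas Raising pool_size_limit_ is printed only on the first iteration.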

I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1677080 get requests, put_count=1802207 evicted_count=128000 eviction_rate=0.071024 and unsatisfied allocation rate=0.001728
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1808091 get requests, put_count=1942982 evicted_count=138000 eviction_rate=0.0710248 and unsatisfied allocation rate=0.00173332
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1897335 get requests, put_count=2042025 evicted_count=148000 eviction_rate=0.0724771 and unsatisfied allocation rate=0.00175773
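
To see whether these numbers change from one message to the next, the log lines can be parsed and compared. This is a rough sketch that assumes the message format is exactly as in the three lines above; parse_pool_allocator is a hypothetical helper, not part of TensorFlow. In the lines shown, the logged eviction_rate agrees with evicted_count / put_count.

```python
import re

# Pattern written against the PoolAllocator lines quoted above (an assumption
# about the format, not an official specification).
POOL_RE = re.compile(
    r"After (?P<gets>\d+) get requests, put_count=(?P<puts>\d+) "
    r"evicted_count=(?P<evicted>\d+) eviction_rate=(?P<rate>[\d.eE+-]+) "
    r"and unsatisfied allocation rate=(?P<unsat>[\d.eE+-]+)"
)


def parse_pool_allocator(line):
    """Extract the counters from one PoolAllocator log line, or return None if no match."""
    match = POOL_RE.search(line)
    if match is None:
        return None
    puts = int(match.group("puts"))
    evicted = int(match.group("evicted"))
    return {
        "gets": int(match.group("gets")),
        "puts": puts,
        "evicted": evicted,
        "logged_eviction_rate": float(match.group("rate")),
        # Recomputed from the counters, to compare against the logged value.
        "computed_eviction_rate": evicted / puts,
        "unsatisfied_rate": float(match.group("unsat")),
    }


line = (
    "I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: "
    "After 1677080 get requests, put_count=1802207 evicted_count=128000 "
    "eviction_rate=0.071024 and unsatisfied allocation rate=0.001728"
)
stats = parse_pool_allocator(line)
print(stats["logged_eviction_rate"], round(stats["computed_eviction_rate"], 6))
```

Applied to all three lines, the eviction rate stays around 7% and the unsatisfied allocation rate around 0.17%.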

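At first I thought the PoolAllocator messages were caused by limited GPU memory. After setting config.gpu_options.allow_growth = True and checking with nvidia-smi, I found that GPU memory usage is far from full: 5833MiB / 8110MiB. Is that right?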
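
For reference, the nvidia-smi figure above can be turned into a fraction with a few lines of Python. This is only an illustrative sketch; memory_fraction is a hypothetical helper that parses a "used / total" string in the form quoted above.

```python
import re


def memory_fraction(usage):
    """Parse a 'usedMiB / totalMiB' string and return (used, total, used / total)."""
    match = re.search(r"(\d+)\s*MiB\s*/\s*(\d+)\s*MiB", usage)
    if match is None:
        raise ValueError("unrecognized memory usage string: %r" % usage)
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, used / total


used, total, fraction = memory_fraction("5833MiB / 8110MiB")
print(f"{fraction:.1%} used, {total - used} MiB free")  # prints "71.9% used, 2277 MiB free"
```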

Following this question, I added os.environ['TF_CUDNN_USE_AUTOTUNE'] = '0', but it did not help :(

0 answers:

No answers yet