Question

I have a ResNet-based network that is really slow: it takes 2 minutes to run a single image.

Here is what I have tried so far to find the bottleneck.

I tried using timeline to find the computational bottleneck, but chrome://tracing shows nothing. (The only change is ^_^ => filename.)

The timeline file does contain data; its size is 885.9 MB. For example:

    { # for example
        "name": "map/while/LoopCond",
        "pid": 1,
        "ts": 1487341119596183,
        "cat": "DataFlow",
        "tid": 0,
        "ph": "t",
        "id": 31618
    },
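Since chrome://tracing may simply be unable to load a file this large, one workaround is to summarize the trace offline. The following is a minimal sketch, assuming the file follows the Chrome trace event format with a top-level `"traceEvents"` list in which complete events are marked `"ph": "X"` and carry a `"dur"` field in microseconds. The function names `summarize_events` and `summarize_trace` are hypothetical helpers, not part of TensorFlow.

```python
import json
from collections import defaultdict


def summarize_events(events, top=20):
    """Return the `top` event names ranked by total duration (microseconds)."""
    totals = defaultdict(int)
    for event in events:
        # Assumption: "X" marks a complete event with a "dur" field.
        # Flow events such as "ph": "t" have no duration and are skipped.
        if event.get("ph") == "X":
            totals[event.get("name", "<unnamed>")] += event.get("dur", 0)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


def summarize_trace(path, top=20):
    """Load a trace file and summarize it.

    json.load reads the whole file into memory, so a trace of several
    hundred MB needs a correspondingly large amount of RAM.
    """
    with open(path) as f:
        trace = json.load(f)
    # Assumption: events live under the "traceEvents" key.
    return summarize_events(trace.get("traceEvents", []), top=top)
```

Calling `summarize_trace("timeline.json")` (the file name here is a placeholder) would return a list of `(name, total_microseconds)` pairs, largest first, which is often enough to spot the dominant ops without opening the trace viewer.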

When I save the timeline file, the console prints the following message. Does it matter?

tcmalloc: large alloc 1233903616 bytes == 0x7fa896f76000 @  0x7faad02b78bf 0x7faacff0b175 0x7faacff0b490 0x7faacff36328 0x7faacfed2dc3 0x7faacff846c7 0x7faacff86323 0x7faacff871ce 0x7faacff861f6 0x7faacff871ce 0x7faacff861f6 0x7faacff871ce 0x7faacff861f6 0x7faacff86323 0x7faacff871ce 0x7faacff861f6 0x7faacff871ce 0x7faacff872e2 0x7faacffa7960 0x7faacffa7b3f 0x7faacffbd484 0x7faacf1b3830 0x400649

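By the way, does this kind of message slow the process down? It is printed on every iteration, whereas "Raising pool_size_limit_" is printed only on the first iteration.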

I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1677080 get requests, put_count=1802207 evicted_count=128000 eviction_rate=0.071024 and unsatisfied allocation rate=0.001728
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1808091 get requests, put_count=1942982 evicted_count=138000 eviction_rate=0.0710248 and unsatisfied allocation rate=0.00173332
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 1897335 get requests, put_count=2042025 evicted_count=148000 eviction_rate=0.0724771 and unsatisfied allocation rate=0.00175773
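To track how these numbers evolve across iterations, the log lines can be parsed. This is a minimal sketch, assuming the line layout shown above; `parse_pool_line` is a hypothetical helper. In the three lines above, the logged `eviction_rate` matches `evicted_count / put_count`, which the sketch recomputes as a sanity check.

```python
import re

# Pattern based on the PoolAllocator lines quoted above.
LINE_RE = re.compile(
    r"After (?P<gets>\d+) get requests, put_count=(?P<puts>\d+) "
    r"evicted_count=(?P<evicted>\d+) eviction_rate=(?P<rate>[\d.]+)"
)


def parse_pool_line(line):
    """Extract the counters from one PoolAllocator log line, or return None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    puts = int(match.group("puts"))
    evicted = int(match.group("evicted"))
    return {
        "gets": int(match.group("gets")),
        "puts": puts,
        "evicted": evicted,
        "logged_rate": float(match.group("rate")),
        # Recomputed from the counters; agrees with the logged value above.
        "computed_rate": evicted / puts if puts else 0.0,
    }
```

Applied to the lines above, the eviction rate stays at roughly 7% from one message to the next.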

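At first I thought the PoolAllocator messages were caused by limited GPU memory. After setting config.gpu_options.allow_growth = True and checking with nvidia-smi, I found that GPU memory usage is well below the limit (5833MiB / 8110MiB), so that does not seem to be the cause. Is that right?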

Following this question, I added os.environ['TF_CUDNN_USE_AUTOTUNE'] = '0', but it did not help :(

How can I find the computational bottleneck?

0 Answers: