Confusing perf stat results: 32 threads vs. 2 threads

Time: 2016-10-28 02:28:21

Tags: perf

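In the perf stat results, the 32-thread run reports 25,806,004,593 cycles and an elapsed time of 0.527408365 seconds, but the 2-thread run reports only 4,109,847,315 cycles with an elapsed time of 1.075951973 seconds.

Why does the faster run have so many more cycles? Aren't cycles clock cycles?

32 threads: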

      13514.208793 task-clock                #   25.624 CPUs utilized            ( +-  0.48% )
             2,440 context-switches          #    0.000 M/sec                    ( +-  0.85% )
                38 CPU-migrations            #    0.000 M/sec                    ( +-  2.63% )
           343,594 page-faults               #    0.025 M/sec                    ( +-  0.00% )
    25,806,004,593 cycles                    #    1.910 GHz                      ( +-  0.00% ) [40.31%]
    22,793,327,794 stalled-cycles-frontend   #   88.33% frontend cycles idle     ( +-  0.38% ) [40.16%]
     4,841,643,161 stalled-cycles-backend    #   18.76% backend  cycles idle     ( +-  2.60% ) [40.38%]
     5,661,365,666 instructions              #    0.22  insns per cycle        
                                             #    4.03  stalled cycles per insn  ( +-  0.29% ) [50.40%]
       572,831,916 branches                  #   42.387 M/sec                    ( +-  0.22% ) [50.30%]
           377,057 branch-misses             #    0.07% of all branches          ( +-  9.16% ) [50.52%]
     2,954,860,090 L1-dcache-loads           #  218.648 M/sec                    ( +-  0.44% ) [50.53%]
        29,016,874 L1-dcache-load-misses     #    0.98% of all L1-dcache hits    ( +-  1.90% ) [50.63%]
         5,855,196 LLC-loads                 #    0.433 M/sec                    ( +- 12.92% ) [40.55%]
         3,771,211 LLC-load-misses           #   64.41% of all LL-cache hits     ( +- 14.39% ) [40.55%]

       0.527408365 seconds time elapsed                                          ( +-  0.80% )

2 threads:

       2084.894924 task-clock                #    1.938 CPUs utilized            ( +-  0.03% )
               183 context-switches          #    0.000 M/sec                    ( +-  1.42% )
                 2 CPU-migrations            #    0.000 M/sec                    ( +- 14.29% )
           343,508 page-faults               #    0.165 M/sec                    ( +-  0.00% )
     4,109,847,315 cycles                    #    1.971 GHz                      ( +-  0.10% ) [39.77%]
     1,039,138,688 stalled-cycles-frontend   #   25.28% frontend cycles idle     ( +-  0.95% ) [40.29%]
       399,472,486 stalled-cycles-backend    #    9.72% backend  cycles idle     ( +-  3.78% ) [40.71%]
     5,555,427,341 instructions              #    1.35  insns per cycle        
                                             #    0.19  stalled cycles per insn  ( +-  0.17% ) [50.80%]
       553,688,310 branches                  #  265.571 M/sec                    ( +-  0.19% ) [50.96%]
            45,997 branch-misses             #    0.01% of all branches          ( +- 11.66% ) [50.94%]
     2,940,653,305 L1-dcache-loads           # 1410.456 M/sec                    ( +-  0.16% ) [50.54%]
        25,312,789 L1-dcache-load-misses     #    0.86% of all L1-dcache hits    ( +-  0.09% ) [50.02%]
         2,266,586 LLC-loads                 #    1.087 M/sec                    ( +-  1.49% ) [39.57%]
         1,290,962 LLC-load-misses           #   56.96% of all LL-cache hits     ( +-  3.84% ) [39.49%]

       1.075951973 seconds time elapsed                                          ( +-  0.14% )
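
One way to read the two outputs side by side is to recompute the derived columns from the raw counters. The sketch below assumes (as the numbers themselves suggest) that `cycles` and `task-clock` are totals summed over every CPU the process ran on, while `seconds time elapsed` is wall-clock time. The function name `derive` and the dictionary layout are made up for illustration; only the figures come from the output above. Note also that the bracketed percentages such as `[40.31%]` indicate the counters were multiplexed, so the counts are scaled estimates.

```python
# Raw figures copied from the two perf stat runs above.
runs = {
    "32 threads": {"cycles": 25_806_004_593, "task_clock_ms": 13514.208793,
                   "elapsed_s": 0.527408365, "instructions": 5_661_365_666},
    "2 threads":  {"cycles": 4_109_847_315, "task_clock_ms": 2084.894924,
                   "elapsed_s": 1.075951973, "instructions": 5_555_427_341},
}

def derive(run):
    """Recompute perf's derived columns (hypothetical helper, not part of perf)."""
    task_clock_s = run["task_clock_ms"] / 1000.0
    return {
        # cycles divided by CPU time actually spent running, not by wall-clock time
        "ghz": run["cycles"] / task_clock_s / 1e9,
        # total CPU time over wall-clock time = average number of busy CPUs
        "cpus_utilized": task_clock_s / run["elapsed_s"],
        # instructions retired per cycle, summed over all threads
        "ipc": run["instructions"] / run["cycles"],
        # rough share of the summed cycles attributable to one busy CPU
        "cycles_per_cpu": run["cycles"] / (task_clock_s / run["elapsed_s"]),
    }

if __name__ == "__main__":
    for name, run in runs.items():
        d = derive(run)
        print(f"{name}: {d['ghz']:.3f} GHz, {d['cpus_utilized']:.3f} CPUs, "
              f"{d['ipc']:.2f} IPC, {d['cycles_per_cpu']:.3e} cycles per busy CPU")
```

Under that assumption the clock rate and CPU utilization agree with perf's own comment columns to within rounding, which would mean the 32-thread run is not clocked faster: it spends about 13.5 CPU-seconds inside 0.53 wall-clock seconds, and the cycle total grows with CPU time rather than with elapsed time. Both runs retire roughly the same number of instructions, so the extra cycles in the 32-thread run show up as a much lower instructions-per-cycle figure, consistent with the 88.33% frontend-stalled cycles reported there.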

0 answers:

No answers yet.