Question

I used the "perf stat" command to collect statistics for some events:

[root@root test]# perf stat -a -e "r81d0","r82d0" -v ./a
r81d0: 71800964 1269047979 1269006431
r82d0: 26655201 1284214869 1284214869

 Performance counter stats for './a':

        71,800,964 r81d0                                                        [100.00%]
        26,655,201 r82d0

       0.036892349 seconds time elapsed

(1) I know that 71800964 is the count for "r81d0", but what do 1269047979 and 1269006431 mean?
(2) What does "[100.00%]" mean?

I have tried "perf stat --help", but it does not explain these values.

Answer 1

[root@root test]# perf stat -a -e "r81d0","r82d0" -v ./a
r81d0: 71800964 1269047979 1269006431
r82d0: 26655201 1284214869 1284214869

This is the output of the verbose option, as defined in the kernel's tools/perf/builtin-stat.c file:

/*
 * Read out the results of a single counter:
 * aggregate counts across CPUs in system-wide mode
 */
static int read_counter_aggr(struct perf_evsel *counter)

        if (verbose) {
                fprintf(output, "%s: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
                        perf_evsel__name(counter), count[0], count[1], count[2]);
        }

The counts come from struct perf_counts_values, defined at http://lxr.free-electrons.com/source/tools/perf/util/evsel.h?v=3.18#L12, which holds an array of three uint64_t values named val, ena and run.
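
One way to picture why count[0], count[1] and count[2] correspond to val, ena and run is a union in which the named fields and a three-element array share storage. The sketch below is only an illustration under that assumption: the type name counts_sketch and the helper counts_from_verbose are made up here, so check the linked header for the real definition.

```c
#include <stdint.h>

/* Illustrative stand-in for struct perf_counts_values (not the kernel source). */
struct counts_sketch {
	union {
		struct {
			uint64_t val; /* raw event count */
			uint64_t ena; /* time the event was enabled */
			uint64_t run; /* time the event was actually counting */
		};
		uint64_t values[3]; /* the same three fields, addressable by index */
	};
};

/* Hypothetical helper: fill the sketch from the three numbers on a verbose line. */
struct counts_sketch counts_from_verbose(uint64_t c0, uint64_t c1, uint64_t c2)
{
	struct counts_sketch c;

	c.values[0] = c0;
	c.values[1] = c1;
	c.values[2] = c2;
	return c;
}
```

With the r81d0 line from the question, counts_from_verbose(71800964, 1269047979, 1269006431) gives val = 71800964, ena = 1269047979 and run = 1269006431.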

The three count values are filled in by the kernel and read from the fd that was opened with the perf_event_open() system call. The relevant part of man perf_event_open (http://man7.org/linux/man-pages/man2/perf_event_open.2.html):

   read_format
          This field specifies the format of the data returned by
          read(2) on a perf_event_open() file descriptor.

          PERF_FORMAT_TOTAL_TIME_ENABLED
                 Adds the 64-bit time_enabled field.  This can be used
                 to calculate estimated totals if the PMU is
                 overcommitted and multiplexing is happening.

          PERF_FORMAT_TOTAL_TIME_RUNNING
                 Adds the 64-bit time_running field.  This can be used
                 to calculate estimated totals if the PMU is
                 overcommitted and multiplexing is happening.  ...

perf stat enables both TIME flags if scale is true:

        if (scale)
                attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
                                    PERF_FORMAT_TOTAL_TIME_RUNNING;
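
To see where the three numbers come from without going through perf stat, here is a minimal sketch of opening one raw counter with both TIME flags and reading value, time_enabled and time_running back in the order the man page gives for a non-group read. The names open_raw_counter, read_counter and struct counter_reading are my own, and the sketch counts only the calling thread (pid = 0, cpu = -1), whereas perf stat -a opens one counter per CPU.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical holder for one read(2) result. */
struct counter_reading {
	uint64_t val; /* event count */
	uint64_t ena; /* time_enabled */
	uint64_t run; /* time_running */
};

/* Open a raw hardware event for the calling thread; returns an fd or -1. */
int open_raw_counter(uint64_t raw_config)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_RAW;
	attr.size = sizeof(attr);
	attr.config = raw_config; /* e.g. 0x81d0 for the event "r81d0" */
	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
			   PERF_FORMAT_TOTAL_TIME_RUNNING;

	/* pid = 0, cpu = -1, no group leader, no flags */
	return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Read the three 64-bit values from the fd; returns 0 on success, -1 otherwise. */
int read_counter(int fd, struct counter_reading *out)
{
	uint64_t buf[3];

	if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
		return -1;
	out->val = buf[0];
	out->ena = buf[1];
	out->run = buf[2];
	return 0;
}
```

Opening the counter may fail with -1 if the process lacks permission or the hardware does not support the event, so a caller should check the fd before calling read_counter.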

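So the first number is the raw event count; the second (ena) is proportional to the total time the event was enabled, and the last (run) is proportional to the time the event was actually being counted on the hardware. This is needed when you ask perf for statistics on a large number of events that cannot all be monitored at once (hardware usually has up to 5-7 performance monitors). In that case the in-kernel perf code runs a subset of the requested events for part of the execution and switches subsets several times. With the ena and run values, perf can estimate how inaccurate the event monitoring was when multiplexing took place.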
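
The man page text quoted earlier says the two time fields "can be used to calculate estimated totals". A minimal sketch of that extrapolation, assuming the event rate was the same while the counter was not scheduled, scales the raw count by ena / run. The function name estimate_total is hypothetical and the rounding choice is mine.

```c
#include <stdint.h>

/*
 * Extrapolate a raw count to the whole enabled period.
 * Assumes the event occurred at the same rate while the counter
 * was multiplexed out, which is only an approximation.
 */
uint64_t estimate_total(uint64_t val, uint64_t ena, uint64_t run)
{
	if (run == 0)
		return 0; /* never scheduled: nothing to extrapolate from */
	if (run == ena)
		return val; /* counted the whole time: the raw count is exact */
	/* Use double so that val * ena cannot overflow 64 bits. */
	return (uint64_t)((double)val * (double)ena / (double)run + 0.5);
}
```

For r82d0 in the question, ena equals run, so the estimate is just the raw count 26655201. A counter that ran for only 53 of 100 time units and saw 530 events would be extrapolated to 1000.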

 Performance counter stats for './a':

    71,800,964 r81d0                                            [100.00%]
    26,655,201 r82d0

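In your case both events were mapped at the same time, with no need for multiplexing, and your ena and run values are close to each other. The print_aggr function prints their ratio: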

                                val += counter->counts->cpu[cpu].val;
                                ena += counter->counts->cpu[cpu].ena;
                                run += counter->counts->cpu[cpu].run;

print_noise produces output when the -r N option is used to rerun the task N times and gather statistics (man: --repeat=<n>: repeat command and print average + stddev (max: 100)):

                                print_noise(counter, 1.0);

And here is the code that prints [100.00%]:

                                if (run != ena)
                                        fprintf(output, "  (%.2f%%)",
                                                100.0 * run / ena);

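It prints nothing when the run and ena times are equal, which is the case for your r82d0 event. Your r81d0 event has slightly different run and ena values, so 100% is printed on its line.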
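
To check this against the numbers in the question, here is a small sketch that mirrors the run != ena test and the "%.2f" format from the excerpt. The function names running_pct and format_running_pct are made up for illustration; this is not the perf source.

```c
#include <stdint.h>
#include <stdio.h>

/* Share of the enabled time during which the event was actually counted. */
double running_pct(uint64_t ena, uint64_t run)
{
	if (ena == 0)
		return 0.0; /* avoid dividing by zero for a never-enabled event */
	return 100.0 * (double)run / (double)ena;
}

/*
 * Write "(NN.NN%)" into buf when run differs from ena.
 * Returns 1 if a percentage was written, 0 if nothing would be printed.
 */
int format_running_pct(uint64_t ena, uint64_t run, char *buf, size_t len)
{
	if (len > 0)
		buf[0] = '\0';
	if (ena == 0 || run == ena)
		return 0;
	snprintf(buf, len, "(%.2f%%)", running_pct(ena, run));
	return 1;
}
```

For r81d0 the ratio is 1269006431 / 1269047979, roughly 99.997%, which "%.2f" rounds to 100.00; that is why the line shows 100% even though the two times are not identical. For r82d0 the two times are equal and nothing is written.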

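I know that perf stat -d can be inaccurate because it asks for too many events; the multiplexing figure will then not be 100% but, for example, 53%. That means "this event was counted during only 53% of the program's running time, in some random parts of it". If your program has several independent phases of computation, events with a low run/ena ratio will be less accurate.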
