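I am using Linux perf to analyze two versions of a large application. One of the versions shows a reproducible performance degradation. The affected run takes about 10 minutes to complete, and running
perf stat
shows a large difference in the number of context switches: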
2759681,344820 task-clock (msec) # 4,089 CPUs utilized
1.976.068 context-switches # 0,716 K/sec
288.370 cpu-migrations # 0,104 K/sec
1.065.076 page-faults # 0,386 K/sec
9.600.316.147.196 cycles # 3,479 GHz
9.608.308.311.681 instructions # 1,00 insn per cycle
1.847.613.212.847 branches # 669,502 M/sec
29.342.163.081 branch-misses # 1,59% of all branches
674,891697479 seconds time elapsed
versus
3045676,296012 task-clock (msec) # 4,093 CPUs utilized
22.156.426 context-switches # 0,007 M/sec
385.364 cpu-migrations # 0,127 K/sec
1.066.383 page-faults # 0,350 K/sec
10.505.321.454.387 cycles # 3,449 GHz
9.723.994.869.100 instructions # 0,93 insn per cycle
1.869.145.049.594 branches # 613,704 M/sec
30.241.815.060 branch-misses # 1,62% of all branches
744,170941002 seconds time elapsed
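To put numbers on the difference, here is a small sketch of the arithmetic, using only the counter values from the two perf stat runs above. The labels `first` and `second` and the helper names are mine, chosen for illustration. The per-second rate divides by task-clock (CPU time), which is what perf's own K/sec and M/sec columns appear to use.

```python
# Counter values copied from the two perf stat outputs above.
# "first" and "second" are illustrative labels for the two runs.
runs = {
    "first": {
        "task_clock_ms": 2759681.344820,
        "context_switches": 1_976_068,
        "cpu_migrations": 288_370,
        "cycles": 9_600_316_147_196,
        "instructions": 9_608_308_311_681,
        "elapsed_s": 674.891697479,
    },
    "second": {
        "task_clock_ms": 3045676.296012,
        "context_switches": 22_156_426,
        "cpu_migrations": 385_364,
        "cycles": 10_505_321_454_387,
        "instructions": 9_723_994_869_100,
        "elapsed_s": 744.170941002,
    },
}


def rate_per_cpu_second(run, counter):
    """Events per second of CPU time (task-clock), like perf's rate column."""
    return run[counter] / (run["task_clock_ms"] / 1000.0)


def ratio(counter):
    """How many times larger the counter is in the second run."""
    return runs["second"][counter] / runs["first"][counter]


for name, run in runs.items():
    rate = rate_per_cpu_second(run, "context_switches")
    print(f"{name}: {rate:,.0f} context switches per CPU-second")

for counter in ("context_switches", "cpu_migrations", "cycles",
                "instructions", "elapsed_s"):
    print(f"{counter}: second/first = {ratio(counter):.2f}x")
```

Context switches grow by a factor of about 11, while instructions, cycles and elapsed time change by roughly 1% to 10%. That is why the context-switch counter stands out.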
Running
perf record -e context-switches -ag -T
shows the following symbols:
Children Self Samples Command Shared Object Symbol
+ 44,06% 44,06% 170846 swapper [kernel.kallsyms] [k] schedule_idle
+ 33,07% 33,07% 127004 Thread (pooled) [kernel.kallsyms] [k] schedule
versus
Children Self Samples Command Shared Object Symbol
+ 49,02% 49,02% 958827 swapper [kernel.kallsyms] [k] schedule_idle
+ 43,96% 43,96% 855603 Thread (pooled) [kernel.kallsyms] [k] schedule
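So the difference in the number of samples is almost an order of magnitude. My question is: how can I investigate this further, given that I have access to the source code of both versions, but it is large and I do not know it well?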
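As a quick check of that estimate, the sample counts from the two perf report listings can be compared directly. This is only arithmetic on the numbers shown above; the helper name is illustrative.

```python
# Sample counts copied from the two perf report listings above:
# (first run, second run) for each command/symbol pair.
samples = {
    "swapper / schedule_idle": (170_846, 958_827),
    "Thread (pooled) / schedule": (127_004, 855_603),
}


def sample_ratio(first, second):
    """How many times more samples the second run recorded."""
    return second / first


for symbol, (first, second) in samples.items():
    print(f"{symbol}: {sample_ratio(first, second):.1f}x more samples")
```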
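The problem turned out to be locking. I found it by running gdb, interrupting the program during execution, catching system calls and printing the backtrace: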
(gdb) catch syscall
(gdb) bt
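The manual GDB steps above could probably be scripted. The following is a minimal sketch, assuming gdb is installed and allowed to attach to the process. The function names, the default of catching only `futex` (taken from the perf call chain) and the `stops` parameter are illustrative choices, not part of the session above.

```python
import subprocess


def gdb_backtrace_argv(pid, syscall="futex", stops=1):
    """Build a gdb batch command line: attach to pid, catch one syscall,
    then continue and print a backtrace `stops` times."""
    argv = ["gdb", "-batch", "-p", str(pid),
            "-ex", f"catch syscall {syscall}"]
    for _ in range(stops):
        # Each "continue" runs until the next catchpoint hit; "bt" prints
        # the backtrace of the thread that stopped there.
        argv += ["-ex", "continue", "-ex", "bt"]
    return argv


def collect_backtraces(pid, syscall="futex", stops=1):
    """Run gdb in batch mode and return whatever it printed (assumes that
    gdb is on PATH and may attach to the target process)."""
    result = subprocess.run(gdb_backtrace_argv(pid, syscall, stops),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # 1234 is a placeholder PID; this only prints the command, it does not attach.
    print(" ".join(gdb_backtrace_argv(1234, stops=3)))
```

Note that a syscall catchpoint stops on both entry to and return from the call, so some of the backtraces come in pairs. Attaching also pauses the application while gdb works, which can perturb the timing being measured.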
The information reported by perf is
Children Self Command Shared Object Symbol
- 49,02% 49,02% swapper [kernel.kallsyms] [k] schedule_idle
- secondary_startup_64
- 42,82% start_secondary
cpu_startup_entry
do_idle
schedule_idle
+ 6,20% x86_64_start_kernel
- 43,96% 43,96% Thread (pooled) [kernel.kallsyms] [k] schedule
- 43,32% syscall
- 43,32% entry_SYSCALL_64_after_hwframe
do_syscall_64
sys_futex
do_futex
futex_wait
futex_wait_queue_me
schedule
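This is not very useful, because it does not show who made the system call. The GDB procedure worked in my case, but it can be tedious. Do you know of any tracing tool, or any options, that would be useful in this situation? Brendan Gregg's blog (http://www.brendangregg.com/blog/2015-07-08/choosing-a-linux-tracer.html) lists a range of tools, but I do not have much experience with them.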