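The question "cpu cache performance. store misses vs load misses" has no answer on where to find documentation for the events listed by perf.
I could not find it through man perf or perf help list.
I have read the Intel 64 and AMD64 event documentation, where events are described in a format like this: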
Last Level Cache References — Event select 2EH, Umask 4FH
So where is the documentation?
Edit: to be clear, I want documentation for the events listed by perf list.
Answer 0 (score: 1)
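The source code of the perf subsystem in the Linux kernel documents the predefined list of perf events such as branches, cycles and LLC-load-misses. That list is only partially mapped onto the actual hardware events of the different CPU models and microarchitectures. If your CPU is Intel, it may be more useful to use ocperf.py (and toplev.py) from andikleen's pmu-tools, which provide the event names used in Intel's documentation. ocperf is not official, but it was written by an Intel employee and uses the official lists from https://download.01.org/perfmon/ (the readme at https://download.01.org/perfmon/readme.txt says: "This package contains performance monitoring event lists for Intel processors").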
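For x86 and x86_64, perf maps these (ancient) predefined/generic names in the arch/x86/events directory. For example, for all Intel Core microarchitectures, check arch/x86/events/intel/core.c and search its code for the microarchitecture name (Core, Core2, NHM = Nehalem, WSM = Westmere, SNB = SandyBridge, IVB = IvyBridge, HSW = HaSWell, BDW = BroaDWell, SKL = SKyLake, SLM = SiLverMont, etc.). For Skylake there is the following structure, and we can see that the PREFETCH counters are not mapped for the caches ("not supported"):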
static __initconst const u64 skl_hw_cache_event_ids
 [ C(L1D ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_INST_RETIRED.ALL_LOADS */
        [ C(RESULT_MISS) ] = 0x151, /* L1D.REPLACEMENT */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x82d0, /* MEM_INST_RETIRED.ALL_STORES */
        [ C(RESULT_MISS) ] = 0x0,
 ...
 [ C(LL ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS) ] = 0x1b7, /* OFFCORE_RESPONSE */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS) ] = 0x1b7, /* OFFCORE_RESPONSE */
    },
There is also an extra structure that defines additional flags/masks for events such as OFFCORE_RESPONSE:
static __initconst const u64 skl_hw_cache_extra_regs
 [ C(LL ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                               SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS) ] = SKL_DEMAND_READ|
                             SKL_L3_MISS|SKL_ANY_SNOOP|
                             SKL_SUPPLIER_NONE,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                               SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS) ] = SKL_DEMAND_WRITE|
                             SKL_L3_MISS|SKL_ANY_SNOOP|
                             SKL_SUPPLIER_NONE,
 [ C(NODE) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                               SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS) ] = SKL_DEMAND_READ|
                             SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                               SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS) ] = SKL_DEMAND_WRITE|
                             SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,