Question

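I tried running the cachegrind tool from valgrind and got the output shown below.

Background:

We usually develop scientific programs, where it does not really matter whether a program runs for 1 second or 3 seconds. In my current project, however, the runtime is much longer, so I thought it might be a good idea to look at these tools. Normally I only use callgrind to profile my code.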

--14387-- warning: L3 cache found, using its data for the LL simulation.
==14387== brk segment overflow in thread #1: can't grow to 0x4a44000
==14387== (see section Limitations in user manual)
==14387== NOTE: further instances of this message will not be shown
==14387== 
==14387== I   refs:      3,642,827,372
==14387== I1  misses:        8,022,355
==14387== LLi misses:           13,650
==14387== I1  miss rate:          0.22%
==14387== LLi miss rate:          0.00%
==14387== 
==14387== D   refs:      1,343,730,074  (903,280,204 rd   + 440,449,870 wr)
==14387== D1  misses:       37,368,579  ( 34,410,595 rd   +   2,957,984 wr)
==14387== LLd misses:          361,568  (    187,256 rd   +     174,312 wr)
==14387== D1  miss rate:           2.8% (        3.8%     +         0.7%  )
==14387== LLd miss rate:           0.0% (        0.0%     +         0.0%  )
==14387== 
==14387== LL refs:          45,390,934  ( 42,432,950 rd   +   2,957,984 wr)
==14387== LL misses:           375,218  (    200,906 rd   +     174,312 wr)
==14387== LL miss rate:            0.0% (        0.0%     +         0.0%  )
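
For reference, the percentages in the summary can be recomputed from the raw counts. The sketch below is only an illustration: the variable names and the `pct` helper are my own, and it assumes (as I understand the Cachegrind manual) that the reported LL miss rate is taken relative to all memory accesses rather than to the L1 misses, which would explain why it prints as 0.0%.

```python
# Raw counts copied from the cachegrind summary above.
i_refs, i1_misses, lli_misses = 3_642_827_372, 8_022_355, 13_650
d_refs, d1_misses, lld_misses = 1_343_730_074, 37_368_579, 361_568


def pct(part, whole):
    """Return part / whole as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


# L1 miss rates: misses divided by references of the same kind.
i1_rate = pct(i1_misses, i_refs)
d1_rate = pct(d1_misses, d_refs)

# Every L1 miss becomes a reference to the last-level (LL) cache.
ll_refs = i1_misses + d1_misses
ll_misses = lli_misses + lld_misses

# Assumption: cachegrind divides LL misses by ALL accesses (I + D).
ll_rate_total = pct(ll_misses, i_refs + d_refs)

# Alternative view: LL misses relative to accesses that reached LL.
ll_rate_of_l1_misses = pct(ll_misses, ll_refs)

print(f"I1 miss rate:            {i1_rate:.2f}%")
print(f"D1 miss rate:            {d1_rate:.1f}%")
print(f"LL refs:                 {ll_refs:,}")
print(f"LL misses:               {ll_misses:,}")
print(f"LL miss rate (of all):   {ll_rate_total:.4f}%")
print(f"LL miss rate (of LL refs): {ll_rate_of_l1_misses:.2f}%")
```

With these counts, the LL refs and LL misses lines match the summary exactly, and the LL misses come to a bit under 1% of the accesses that actually reached the last-level cache.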

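I understand the basic theory behind cache hierarchies, cache lines, and the causes of misses. My problem is that I have no real-world experience with this topic, so I have no idea how common it is to miss the L3 cache or the L1 cache.

Is there some kind of "rule of thumb" that tells me when I need to worry about this and pay more attention to memory alignment and similar things?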

valgrind cachegrind - what is a good result?
