I am using timer:tc/3 to test the performance of a function in a tight loop (say, 5000 iterations):
{Duration_us, _Result} = timer:tc(M, F, [A])
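This returns the duration (in microseconds) and the result of the function. For the sake of argument, say the duration is N microseconds.
I then take a simple average over the results of the iterations. If I place a timer:sleep(1) call before the timer:tc/3 call, the average duration over all iterations is ALWAYS greater than the average without the sleep: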
timer:sleep(1),
timer:tc(M, F, [A]).
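This doesn't make much sense to me, since the timer:tc/3 call should be atomic and should not care about anything that happened before it.
Can anyone explain this strange behaviour? Is it somehow related to scheduling and reductions?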
Answer 0 (score: 1)
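Measuring performance is a complicated task, especially on new hardware and modern operating systems. Many things can skew your results. First, you are not alone: when you measure on a desktop or laptop, other processes, including system ones, can interfere with your measurement. Second, there is the hardware itself. Modern CPUs have many features that control performance and power consumption. They can boost performance for a short time before they overheat, and they can boost performance when other cores on the same chip, or the other hyper-thread on the same core, have no work to do. On the other hand, they can enter a power-saving mode when there is not enough work, and the CPU then does not react fast enough to a sudden change in load. It is hard to tell whether this applies in your case, but it is naive to think that previous work, or the lack of it, cannot affect your measurement. You should always measure for long enough in a steady state (a few seconds at least) and remove as many other factors that could affect the measurement as possible. (And don't forget GC in Erlang.)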
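The steady-state advice above could be sketched roughly as follows. This is only an illustration, not a definitive benchmark harness: the module name steady_bench, the warm-up count of 1000 calls, and the choice of summary statistics are all assumptions made for the example.

```erlang
-module(steady_bench).
-export([run/4]).

%% run(M, F, Args, MinSeconds): warm up, then time M:F(Args) repeatedly
%% for at least MinSeconds of wall-clock time and summarise the samples.
%% The warm-up count (1000) is an arbitrary assumption.
run(M, F, Args, MinSeconds) when MinSeconds > 0 ->
    %% Warm-up: give code loading, CPU frequency scaling and caches
    %% a chance to settle before any sample is recorded.
    _ = [apply(M, F, Args) || _ <- lists:seq(1, 1000)],
    %% Start the timed phase right after a collection of this process.
    erlang:garbage_collect(),
    Deadline = erlang:monotonic_time(millisecond) + MinSeconds * 1000,
    Sorted = lists:sort(collect(M, F, Args, Deadline, [])),
    Len = length(Sorted),
    #{count  => Len,
      min    => hd(Sorted),
      median => lists:nth((Len + 1) div 2, Sorted),
      max    => lists:last(Sorted),
      mean   => lists:sum(Sorted) / Len}.

%% Keep sampling until the deadline has passed; always takes one sample.
collect(M, F, Args, Deadline, Acc) ->
    {Us, _Result} = timer:tc(M, F, Args),
    case erlang:monotonic_time(millisecond) >= Deadline of
        true  -> [Us | Acc];
        false -> collect(M, F, Args, Deadline, [Us | Acc])
    end.
```

For example, steady_bench:run(lists, seq, [1, 100], 3) samples for about three seconds. Reporting the median and the min/max next to the mean makes it easier to see whether a few outliers, rather than a general slowdown, are moving the average.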
Answer 1 (score: 1)
Do you mean something like this:
4> foo:foo(10000).
where:
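-module(foo).
-export([foo/1, baz/1]).

%% Collect N timings and return them together with their mean.
foo(N) ->
    TL = bar(N),
    {TL, sum(TL) / N}.

%% Time baz(1000) N times, sleeping 1 ms before each measurement.
bar(0) -> [];
bar(N) ->
    timer:sleep(1),
    {D, _} = timer:tc(?MODULE, baz, [1000]),
    [D | bar(N - 1)].

%% Trivial countdown used as the workload being timed.
baz(0) -> ok;
baz(N) -> baz(N - 1).

sum([]) -> 0;
sum([H | T]) -> H + sum(T).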
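I tried this, and it's interesting. With the sleep statement, the mean time returned by timer:tc/3 is 19 to 22 microseconds; with the sleep commented out, the average drops to 4 to 6 microseconds. Quite dramatic!
I notice there are artefacts in the timings, so sequences like this (these numbers are the individual microsecond timings returned by timer:tc/3) are not uncommon: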
---- snip ----
5,5,5,6,5,5,5,6,5,5,5,6,5,5,5,5,4,5,5,5,5,5,4,5,5,5,5,6,5,5,
5,6,5,5,5,5,5,6,5,5,5,5,5,6,5,5,5,6,5,5,5,5,5,5,5,5,5,5,4,5,
5,5,5,6,5,5,5,6,5,5,7,8,7,8,5,6,5,5,5,6,5,5,5,5,4,5,5,5,5,
14,4,5,5,4,5,5,4,5,4,5,5,5,4,5,5,4,5,5,4,5,4,5,5,5,4,5,5,4,
5,5,4,5,4,5,5,4,4,5,5,4,5,5,4,4,4,4,4,5,4,5,5,4,5,5,5,4,5,5,
4,5,5,4,5,4,5,5,5,4,5,5,4,5,5,4,5,4,5,4,5,4,5,5,4,4,4,4,5,4,
5,5,54,22,26,21,22,22,24,24,32,31,36,31,33,27,25,21,22,21,
24,21,22,22,24,21,22,21,24,21,22,22,24,21,22,21,24,21,22,21,
23,27,22,21,24,21,22,21,24,22,22,21,23,22,22,21,24,22,22,21,
24,21,22,22,24,22,22,21,24,22,22,22,24,22,22,22,24,22,22,22,
24,22,22,22,24,22,22,21,24,22,22,21,24,21,22,22,24,22,22,21,
24,21,23,21,24,22,23,21,24,21,22,22,24,21,22,22,24,21,22,22,
24,22,23,21,24,21,23,21,23,21,21,21,23,21,25,22,24,21,22,21,
24,21,22,21,24,22,21,24,22,22,21,24,22,23,21,23,21,22,21,23,
21,22,21,23,21,23,21,24,22,22,22,24,22,22,41,36,30,33,30,35,
21,23,21,25,21,23,21,24,22,22,21,23,21,22,21,24,22,22,22,24,
22,22,21,24,22,22,22,24,22,22,21,24,22,22,21,24,22,22,21,24,
22,22,21,24,21,22,22,27,22,23,21,23,21,21,21,23,21,21,21,24,
21,22,21,24,21,22,22,24,22,22,22,24,21,22,22,24,21,22,21,24,
21,23,21,23,21,22,21,23,21,23,22,24,22,22,21,24,21,22,22,24,
21,23,21,24,21,22,22,24,21,22,22,24,21,22,21,24,21,22,22,24,
22,22,22,24,22,22,21,24,22,21,21,24,21,22,22,24,21,22,22,24,
24,23,21,24,21,22,24,21,22,21,23,21,22,21,24,21,22,21,32,31,
32,21,25,21,22,22,24,46,5,5,5,5,5,4,5,5,5,5,6,5,5,5,5,5,5,4,
6,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,4,5,4,5,5,5,5,6,5,5,5,5,5,
5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,4,6,4,6,5,5,5,5,5,5,4,6,5,5,5,
5,4,5,5,5,5,5,5,6,5,5,5,5,4,5,5,5,5,5,5,6,5,5,5,5,5,5,5,6,5,
5,5,5,4,5,5,6,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,6,5,5,5,5,5,5,5,
6,5,5,5,5,4,5,4,5,5,5,5,6,5,5,5,5,5,5,4,5,4,5,5,5,5,5,6,5,5,
5,5,4,5,4,5,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,
---- snip ----
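I assume this is the effect you are referring to, though when you say ALWAYS > N, is it always, or just mostly? It's not always the case for me, anyway.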
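The extract of results above was taken without the sleep. Typically, without the sleep, timer:tc/3 returns low times like 4 or 5 most of the time, but sometimes large times like 22; with the sleep in place it usually returns large times like 22, with occasional batches of low times.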
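To make those batches easier to see than in a raw dump of numbers, the samples can be classified as low or high and then run-length encoded. The sketch below is only an illustration: the module name timing_runs is made up for the example, and the threshold is an assumption you would pick to sit between the two clusters (for instance 10 for the data above).

```erlang
-module(timing_runs).
-export([runs/2]).

%% runs(Samples, Threshold) -> [{low | high, RunLength}]
%% Classify each timing as low (below Threshold) or high, then collapse
%% consecutive samples of the same class into a single run.
runs(Samples, Threshold) ->
    Classes = [classify(S, Threshold) || S <- Samples],
    lists:reverse(rle(Classes, [])).

classify(S, Threshold) when S < Threshold -> low;
classify(_S, _Threshold) -> high.

%% Run-length encode; the accumulator holds the runs in reverse order.
rle([], Acc) -> Acc;
rle([C | T], [{C, N} | Acc]) -> rle(T, [{C, N + 1} | Acc]);
rle([C | T], Acc) -> rle(T, [{C, 1} | Acc]).
```

For example, timing_runs:runs([5,4,5,22,24,5], 10) returns [{low,3},{high,2},{low,1}]. Applied to the list returned by foo:foo/1, long alternating runs would confirm that the slow and fast timings arrive in batches rather than at random.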
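It's certainly not obvious why this would happen, since a sleep really just means a yield. I wonder whether all this comes down to the CPU cache. After all, especially on a machine that is not busy, one might expect the case without the sleep to execute most of the code in one go, without being moved to another core and without the core doing much else, thus making the most of the caches... but when you sleep, and thus yield, and come back later, the chance of cache hits might be considerably lower.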
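One way to probe this guess is to compare three kinds of pause before each measurement: none, timer:sleep(1), and a busy-wait of about 1 ms that never gives up the CPU voluntarily. The sketch below is an experiment under stated assumptions, not a proof: the module name pause_bench and the helper spin_ms/1 are invented for the example, and the VM may still preempt the spinning process once it has used up its reductions.

```erlang
-module(pause_bench).
-export([compare/1, baz/1]).

%% compare(N) -> [{none | sleep | spin, MeanMicroseconds}]
%% Time baz(1000) N times after each kind of pause.
compare(N) when N > 0 ->
    Pauses = [{none,  fun() -> ok end},
              {sleep, fun() -> timer:sleep(1) end},
              {spin,  fun() -> spin_ms(1) end}],
    [{Label, mean(samples(N, Pause))} || {Label, Pause} <- Pauses].

samples(0, _Pause) -> [];
samples(N, Pause) ->
    Pause(),
    {Us, _} = timer:tc(?MODULE, baz, [1000]),
    [Us | samples(N - 1, Pause)].

%% Busy-wait for roughly Ms milliseconds without sleeping.
spin_ms(Ms) ->
    spin_until(erlang:monotonic_time(microsecond) + Ms * 1000).

spin_until(T) ->
    case erlang:monotonic_time(microsecond) < T of
        true  -> spin_until(T);
        false -> ok
    end.

%% Same trivial workload as in the foo module.
baz(0) -> ok;
baz(N) -> baz(N - 1).

mean(L) -> lists:sum(L) / length(L).
```

If pause_bench:compare(5000) shows the spin case close to the none case while only the sleep case is slow, that would suggest the slowdown comes from the core going idle (cold caches, a scheduler thread being woken up, or a CPU power state) rather than from the mere passage of time. It would not tell these causes apart.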