Question

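Disclaimer: this is a micro-benchmark. If the topic bothers you, please don't comment that "premature optimization is evil".

The examples are built in Release mode targeting x64 with .NET 4.5, Visual Studio 2012 and F# 3.0, and run on Windows 7 x64.

After profiling, I narrowed down a bottleneck in one of my applications, which leads me to this question:

Observations

If there is no loop inside the for in loop or Seq.iter, both clearly run at similar speed. (update2 vs update4)

If there is a loop inside the for in loop or Seq.iter, Seq.iter seems to be 2x as fast as for in. (update vs update3) Strange? (When run in fsi, the two are similar.)

If it targets AnyCPU and runs on x64, there is no time difference. So the question becomes: why is Seq.iter (update3) 2x as fast as for in (update) when the target is x64?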
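Since the observation depends on whether the code is compiled by the x64 or the x86 JIT, it can help to confirm the bitness of the process that is actually running. The following is a minimal sketch using `Environment.Is64BitProcess` and `IntPtr.Size` from the .NET base class library; the name `is64` is only illustrative. An AnyCPU build on a 64-bit OS may still run as a 32-bit process depending on project settings (for example a "Prefer 32-bit" option), which is an assumption worth checking rather than something the question states.

```fsharp
open System

// True when the current process runs as 64-bit, i.e. it is compiled by the x64 JIT.
let is64 = Environment.Is64BitProcess

// IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit one.
printfn "64-bit process: %b, pointer size: %d bytes" is64 IntPtr.Size
```

Printing this at the start of each benchmark run makes it clear which JIT produced the timings being compared.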

Time taken:

update:  00:00:11.4250483 // 2x as much as update3, why?
update2: 00:00:01.4447233
update3: 00:00:06.0863791
update4: 00:00:01.4939535

Source code:

open System.Diagnostics
open System

[<EntryPoint>]
let main argv =
    let pool = seq {1 .. 1000000}
    let ret = Array.zeroCreate 100

    let update pool =
        for x in pool do
            for y in 1 .. 200 do
                ret.[2] <- x + y

    let update2 pool =
        for x in pool do
            //for y in 1 .. 100 do
                ret.[2] <- x

    let update3 pool =
        pool
        |> Seq.iter (fun x ->
            for y in 1 .. 200 do
                ret.[2] <- x + y)

    let update4 pool =
        pool
        |> Seq.iter (fun x ->
            //for y in 1 .. 100 do
                ret.[2] <- x)

    let test n =
        let run =
            match n with
            | 1 -> update
            | 2 -> update2
            | 3 -> update3
            | 4 -> update4
        for i in 1 .. 50 do
            run pool

    let sw = new Stopwatch()
    sw.Start()
    test(1)
    sw.Stop()
    Console.WriteLine(sw.Elapsed);

    sw.Restart()
    test(2)
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    sw.Restart()
    test(3)
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    sw.Restart()
    test(4)
    sw.Stop()
    Console.WriteLine(sw.Elapsed)
    0 // return an integer exit code

Answer 1

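This is not a complete answer, but I hope it helps you get further.

I can reproduce the behavior with the same configuration. Here is a simpler example for profiling: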

open System

let test1() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}    
    for x in pool do
        for _ in 1..50 do
            for y in 1..200 do
                ret.[2] <- x + y

let test2() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}    
    Seq.iter (fun x -> 
        for _ in 1..50 do
            for y in 1..200 do
                ret.[2] <- x + y) pool

let time f =
    let sw = new Diagnostics.Stopwatch()
    sw.Start()
    let result = f() 
    sw.Stop()
    Console.WriteLine(sw.Elapsed)
    result

[<EntryPoint>]
let main argv =
    time test1
    time test2
    0

In this example, Seq.iter and for x in pool are each executed only once, yet there is still a 2x time difference between test1 and test2:

00:00:06.9264843
00:00:03.6834886

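Their IL is very similar, so compiler optimization is not the issue. It seems that the x64 JIT fails to optimize test1, although it manages to do so for test2. Interestingly, if I refactor the nested for loops in test1 into a function, the JIT optimization succeeds again: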

let body (ret: _ []) x =
    for _ in 1..50 do
        for y in 1..200 do
            ret.[2] <- x + y

let test3() =
    let ret = Array.zeroCreate 100
    let pool = {1..1000000}    
    for x in pool do
        body ret x

// 00:00:03.7012302

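When I disable JIT optimization using the technique described here, the execution times of these functions are comparable.

Why the x64 JIT fails on this particular example, I don't know. You can disassemble optimized jitted code to compare the ASM instructions line by line. Perhaps someone with good ASM knowledge can spot the differences between them.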

Answer 2

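When I run the experiment on my machine (using F# 3.0 in VS 2012 in Release mode), I don't get the timings you describe. Do you consistently get the same numbers when you run it repeatedly?

I tried about 4 times and always got very similar numbers. The Seq.iter version tends to be slightly faster, but this is probably not statistically significant. Something like this (using Stopwatch):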

test(1) = 15321ms
test(2) = 5149ms
test(3) = 14290ms
test(4) = 4999ms

I am running the tests on an Intel Core2 Duo (2.26 GHz) laptop with 64-bit Windows 7.
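To judge whether a difference like the one above is consistent across runs, one option is to time each variant several times after a warm-up call and compare the spread. The following is a minimal sketch, not the harness either poster used; `timeRepeated`, `summarize` and `workload` are hypothetical names, and the workload is a scaled-down stand-in for `update` from the question.

```fsharp
open System.Diagnostics

/// Calls `f` once as a warm-up (so JIT compilation is not measured),
/// then times `runs` further calls and returns each elapsed time in milliseconds.
let timeRepeated (runs: int) (f: unit -> unit) : float[] =
    f ()
    Array.init runs (fun _ ->
        let sw = Stopwatch.StartNew()
        f ()
        sw.Stop()
        sw.Elapsed.TotalMilliseconds)

/// Summarises the samples as (min, mean, max).
let summarize (samples: float[]) =
    Array.min samples, Array.average samples, Array.max samples

// Hypothetical workload: a smaller version of `update` from the question.
let workload () =
    let ret : int[] = Array.zeroCreate 100
    for x in seq { 1 .. 100000 } do
        for y in 1 .. 200 do
            ret.[2] <- x + y

let samples = timeRepeated 5 workload
let lo, mean, hi = summarize samples
printfn "min %.1f ms, mean %.1f ms, max %.1f ms" lo mean hi
```

If the min and max of one variant overlap with those of the other, the difference is probably noise; a stable 2x gap across all samples would point to a real effect such as the JIT behavior discussed in Answer 1.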

Why is Seq.iter 2x faster than a for loop if the target is x64?
