Question

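Here are two functions: fun1 takes 1 parameter, and fun2 takes the same parameter plus 4 extra unused ones. When I target x64, fun1 takes 4 seconds but fun2 takes less than 1 second. If I target AnyCPU, both take less than 1 second.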

I asked a similar question here: why Seq.iter is 2x faster than for loop if target is for x64?

The code was compiled with .NET 4.5, Visual Studio 2012 and F# 3.0, and run on Windows 7 x64.

open System
open System.Diagnostics

type Position =
    {
        a: int
        b: int
    }

[<EntryPoint>]
let main argv = 

    let fun1 (pos: Position[]) =  //<<<<<<<< here
        let functionB x y z = 4

        Array.fold2 (fun acc x y -> acc + int64 (functionB x x y)) 0L pos pos

    let fun2 (pos: Position[]) u v w x =  //<<<<<<<< here
        let functionB x y z = 4

        Array.fold2 (fun acc x y -> acc + int64 (functionB x x y)) 0L pos pos



    let s = {a=2;b=3}
    let pool = [|s;s;s|]

    let test1 n =
        let mutable x = 0L
        for i in 1 .. n do
            x <- fun1 pool

    let test2 n =
        let mutable x = 0L
        for i in 1 .. n do
            x <- fun2 pool 1 2 3 4

    let sw = new Stopwatch()
    sw.Start()
    test2 10000000
    sw.Stop()
    Console.WriteLine(sw.Elapsed)

    sw.Restart()
    test1 10000000
    sw.Stop()
    Console.WriteLine(sw.Elapsed)


    0 // return an integer exit code

Answer 1

This is not a full answer; it is a first diagnosis of the problem.

I can reproduce the behaviour with the same configuration. If you turn on 64-bit F# Interactive under Tools -> Options -> F# Tools -> F# Interactive, you can observe the same behaviour there.

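Unlike in the other question, the x64 JIT compiler is not the problem here. It turns out that the "Generate tail calls" option in the project properties causes a considerable slowdown of test1 compared to test2. If you turn that option off, the two cases run at similar speed.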

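Alternatively, you can put the inline keyword on fun1 so that no tail call is needed. The two examples then have comparable execution times, whether or not fun2 is inlined.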

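That said, it is odd that adding the tail. opcode to fun1 makes it so much slower than doing the same with fun2. You may want to contact the F# team for further investigation.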
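
The inline suggestion in this answer can be sketched as follows. This is only an illustration that reuses the names from the question (Position, fun1, pool, test1), moved to module level for brevity; test1 is changed to return its accumulator so the result can be printed. It shows where the keyword goes and does not by itself demonstrate any timing difference.

```fsharp
type Position = { a: int; b: int }

// `inline` asks the F# compiler to expand fun1's body at each call site
// instead of emitting a call to a separate fun1 method.
let inline fun1 (pos: Position[]) =
    let functionB x y z = 4
    Array.fold2 (fun acc x y -> acc + int64 (functionB x x y)) 0L pos pos

let s = { a = 2; b = 3 }
let pool = [| s; s; s |]

let test1 n =
    let mutable x = 0L
    for i in 1 .. n do
        // After inlining, the Array.fold2 call sits inside this loop body,
        // followed by the assignment, so it is no longer in tail position.
        x <- fun1 pool
    x

printfn "%d" (test1 10)
```

To compare against the non-inlined version, remove the inline keyword and rebuild with the same "Generate tail calls" setting.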

Answer 2

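The difference is almost certainly a quirk of the JIT compiler, which would also explain the inconsistent results. This is a common problem with microbenchmarks like this one. Run the methods one or more extra times first so that everything gets compiled behind the scenes, then time only the last run. The timings should then be identical.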

Because of this quirk, you can get even stranger results than this.
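
The warm-up advice in this answer can be sketched as a small timing helper. The helper name timeAfterWarmup, the warm-up iteration count and the sampleWork function are hypothetical and not part of the original post; the idea is simply to run the workload once untimed so it gets JIT-compiled, then time a second run.

```fsharp
open System.Diagnostics

// Hypothetical helper: runs `run` once with a small iteration count (untimed)
// so that the code is JIT-compiled, then times a second run with `n` iterations.
let timeAfterWarmup (label: string) (warmupN: int) (n: int) (run: int -> unit) =
    run warmupN                       // warm-up pass, result discarded
    let sw = Stopwatch.StartNew()
    run n                             // timed pass
    sw.Stop()
    printfn "%s: %O" label sw.Elapsed
    sw.Elapsed

// Placeholder workload standing in for test1 / test2 from the question.
let sampleWork (n: int) =
    let mutable x = 0L
    for i in 1 .. n do
        x <- x + int64 i

timeAfterWarmup "sampleWork" 1000 10000000 sampleWork |> ignore
```

With the question's code, test1 and test2 could be passed in place of sampleWork, since both take an iteration count and return unit.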

Targeting x64 can sometimes lead to very bad performance
