Why is there a performance difference between LINQ (C#) and Seq (F#)?

Date: 2017-02-06 18:04:40

Tags: c# performance linq f#

I wrote two very simple test programs, one in C# and one in F#.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace ConsoleApplication2
{
    class Program
    {
        static int Remainder(int num)
        {
            return num % 2;
        }
        static int SumOfremainders(IEnumerable<int> list)
        {
            var sum = 0;
            foreach (var num in list)
            {
                sum += Remainder(num);
            }
            return sum;
        }

        static void Main(string[] args)
        {
            Stopwatch sw = new Stopwatch();
            var nums = Enumerable.Range(1, 10000000);
            sw.Start();
            var a = SumOfremainders(nums);
            sw.Stop();
            Console.WriteLine("Duration " + (sw.ElapsedMilliseconds));
            Console.WriteLine("Sum of remainders: {0}", a);
        }
    }
}


let remainder x = x % 2   

let sumORemainders n = 
   n
   |> Seq.map(fun n-> remainder n)
   |> Seq.sum

let seqb = Seq.init 10000000(fun n->n)
let timer =System.Diagnostics.Stopwatch()
timer.Start()
let a =(sumORemainders seqb )
timer.Stop()
printfn "Elapsed Time: "
System.Console.WriteLine timer.ElapsedMilliseconds
printfn "Sum of squares of 1-100: %d"  a


[<EntryPoint>]
let main argv = 

    0 // return an integer exit code

C#: 71 ms, F#: 1797 ms

I then wrote a second F# version that works the same way as the C# one:
let remainder x = x % 2   

let sumORemainders (input:seq<int>)  =
    let mutable sum = 0
    let en = input.GetEnumerator()
    while (en.MoveNext()) do 
        sum <- sum + remainder en.Current 
    sum


let seqb = Seq.init 10000000(fun n->n)
let timer =System.Diagnostics.Stopwatch()
timer.Start()
let a =(sumORemainders seqb )
timer.Stop()
printfn "Elapsed Time: "
System.Console.WriteLine timer.ElapsedMilliseconds
printfn "Sum of squares of 1-100: %d"  a


[<EntryPoint>]
let main argv = 

    0 // return an integer exit code

But the result did not change significantly (1650 ms).

I do not understand the large speed difference between the two languages.

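The two programs have very similar IL code, both use IEnumerable, and the F# compiler even replaces the function call with the inlined operation.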

I rewrote the C# code based on the F# IL code:

static int SumOfremainders(IEnumerable<int> list)
        {
            var sum = 0;
            IEnumerator<int> e = list.GetEnumerator();
            while (e.MoveNext())
            {
                sum += e.Current % 2;
            }
            return sum;
        }

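The IL code of the two functions is now the same, yet the speed is still very different. Thanks to Foggy Finder for pointing out the difference in the IL: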

Slow code

[CompilationMapping(SourceConstructFlags.Module)]
public static class Program
{
    [Serializable]
    internal class seqb@18 : FSharpFunc<int, int>
    {
        internal seqb@18()
        {
        }

        public override int Invoke(int n)
        {
            return n;
        }
    }

    [CompilationMapping(SourceConstructFlags.Value)]
    public static IEnumerable<int> seqb
    {
        get
        {
            return $Program.seqb@18;
        }
    }

Fast code

[CompilationMapping(SourceConstructFlags.Module)]
public static class Program
{
    [CompilationMapping(SourceConstructFlags.Value)]
        public static int[] seqb
        {
            get
            {
                return $Program.seqb@20;
            }
        }

1 answer:

Answer 0: (score: 9)

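The main reason the OP sees a performance difference is that Seq.init in F# is slow. On every iteration, Seq.upto (which Seq.init uses) allocates a new Lazy<_> object. You can see this in the Seq source. With a low-overhead function such as fun n -> n % 2, the cost of allocating the new Lazy<_> object and of evaluating it (taking and releasing a mutex lock) accounts for a large share of the total time.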
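A minimal sketch of how to check this on your own machine: it runs the same sum over a sequence built with Seq.init and over one built with a sequence expression. The names timeMs and sumOfRemainders are hypothetical helpers introduced for illustration, and whether Seq.init still allocates a Lazy<_> per element depends on the FSharp.Core version, so treat the timings as something to measure rather than assume.

```fsharp
open System.Diagnostics

// Hypothetical helper: runs 'action' once and returns its result and elapsed milliseconds.
let timeMs (action : unit -> int) =
    let sw = Stopwatch.StartNew ()
    let result = action ()
    sw.Stop ()
    result, sw.ElapsedMilliseconds

// Hypothetical helper: sums n % 2 over any seq<int>.
let sumOfRemainders (input : seq<int>) =
    input |> Seq.map (fun n -> n % 2) |> Seq.sum

let count = 10000

// Source built with Seq.init, the construct the answer identifies as slow.
let r1, ms1 = timeMs (fun () -> sumOfRemainders (Seq.init count id))

// Same values produced by a sequence expression, which does not go through Seq.init.
let r2, ms2 = timeMs (fun () -> sumOfRemainders (seq { for n in 0 .. count - 1 do yield n }))

printfn "Seq.init:       result %d in %d ms" r1 ms1
printfn "seq expression: result %d in %d ms" r2 ms2
```

Both variants compute the same result (5000 for count = 10000). Only the way the source sequence is produced differs, which is what isolates the cost of Seq.init.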

The second reason the OP sees a performance difference is that Seq in F# is slow in general. This is being addressed by manofstick in this PR.

With the PR in place, F# Seq performs very well compared to the existing alternatives (some details here).

With all that said, I prepared a comparison of different ways to perform the computation the user posted (apart from the obvious total / 2).
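The "obvious total / 2" shortcut can be sketched as follows, assuming the sum runs over 0 .. count-1 as in the benchmark. Exactly count / 2 of those numbers are odd (integer division), so the sum of n % 2 needs no loop at all. The names closedForm and bruteForce are hypothetical and only serve to compare the two.

```fsharp
// Closed form: the number of odd values in 0 .. count-1 (integer division).
let closedForm (count : int) = count / 2

// Reference loop, equivalent to the imperative variant of the benchmark.
let bruteForce (count : int) =
    let mutable sum = 0
    for n = 0 to count - 1 do
        sum <- sum + n % 2
    sum

// Compare both on a few sizes, including odd and empty ranges.
for count in [ 0; 1; 2; 7; 10000 ] do
    printfn "count %d: closed form %d, loop %d" count (closedForm count) (bruteForce count)
```

If the range starts at 1 instead, as in Enumerable.Range(1, count), the closed form becomes (count + 1) / 2.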

open CsPerfs
open Nessos.Streams
open System.Diagnostics
open System.Linq
open System.Numerics


// now () returns current time in milliseconds since start
let now : unit -> int64 =
  let sw = Stopwatch ()
  sw.Start ()
  fun () -> sw.ElapsedMilliseconds

// time estimates the time 'action' repeated a number of times
let time repeat action =
  let inline cc i       = System.GC.CollectionCount i

  let v                 = action ()

  System.GC.Collect (2, System.GCCollectionMode.Forced, true)

  let bcc0, bcc1, bcc2  = cc 0, cc 1, cc 2
  let b                 = now ()

  for i in 1..repeat do
    action () |> ignore

  let e = now ()
  let ecc0, ecc1, ecc2  = cc 0, cc 1, cc 2

  v, (e - b), ecc0 - bcc0, ecc1 - bcc1, ecc2 - bcc2

[<EntryPoint>]
let main argv = 
  let count = 10000000

  let outers = 
    [|
      1000
    |]

  for outer in outers do
    let inner = count / outer

    let fsImperativeTest () = 
      let mutable sum = 0
      for n = 0 to inner-1 do
        sum <- sum + n % 2
      sum

    let fsLinqTest () = 
      Enumerable.Range(0, inner).Select(fun n -> n % 2).Sum()
    let fsNessosTest () = 
      Stream.initInfinite id
      |> Stream.take inner
      |> Stream.map (fun n -> n % 2)
      |> Stream.sum
    let fsOpTest () = 
      let remainder x = x % 2   
      let sumORemainders (input:seq<int>)  =
          let mutable sum = 0
          use en = input.GetEnumerator()
          while (en.MoveNext()) do 
              sum <- sum + remainder en.Current 
          sum
      let seqb = Seq.init inner id
      sumORemainders seqb
    let fsSseTest () = 
      let inc         = Vector<int> 4   // each of the 4 lanes advances by 4 per iteration
      let one         = Vector<int>.One
      let mutable sum = Vector<int>.Zero
      let mutable n   = Vector<int> [|0..3|]
      for n4 = 0 to inner/4-1 do
        n <- n + inc
        sum <- sum + (n &&& one)
      sum.[0] + sum.[1] + sum.[2] + sum.[3]
    let fsSeqTest () = 
      Seq.init inner id 
        |> Seq.map (fun n -> n % 2)
        |> Seq.sum
    let fsSeqVariantTest () = 
      seq { for n = 0 to inner-1 do yield n }
        |> Seq.map (fun n -> n % 2)
        |> Seq.sum

    let csImperativeTest = 
      let f = Perfs.CsImperativeTest inner
      fun () -> f.Invoke ()
    let csLinqTest = 
      let f = Perfs.CsLinqTest inner
      fun () -> f.Invoke ()
    let csOpTest =
      let f = Perfs.CsOpTest inner
      fun () -> f.Invoke ()

    let tests =
      [|
        "fsImperativeTest"  , fsImperativeTest
        "fsLinqTest"        , fsLinqTest
        "fsNessosTest"      , fsNessosTest
        "fsOpTest"          , fsOpTest
        "fsSeqTest"         , fsSeqTest
        "fsSeqVariantTest"  , fsSeqVariantTest
        "fsSseTest"         , fsSseTest
        "csImperativeTest"  , csImperativeTest
        "csLinqTest"        , csLinqTest
        "csOpTest"          , csOpTest
      |]

    printfn "Test run - total count: %d, outer: %d, inner: %d" count outer inner

    for name, test in tests do
      printfn "Running %s..." name
      let v, ms, cc0, cc1, cc2 = time outer test
      printfn "  it took %d ms - collection count is %d,%d,%d - result is %A" ms cc0 cc1 cc2 v 


  0

The matching C# code:

namespace CsPerfs
{
  using System;
  using System.Collections.Generic;
  using System.Linq;

    public static class Perfs
    {
      static int Remainder(int num)
      {
          return num % 2;
      }

      static int SumOfremainders(IEnumerable<int> list)
      {
          var sum = 0;
          foreach (var num in list)
          {
              sum += Remainder(num);
          }
          return sum;
      }

      public static Func<int> CsOpTest (int count)
      {
        return () => SumOfremainders (Enumerable.Range(1, count));
      }

      public static Func<int> CsImperativeTest (int count)
      {
        return () =>
          {
            var sum = 0;
            for (var n = 0; n < count; ++n)
            {
              sum += n % 2;
            }
            return sum;
          };
      }

      public static Func<int> CsLinqTest (int count)
      {
        return () => Enumerable.Range (0, count).Select (n => n % 2).Sum ();
      }
    }
}

Performance numbers from my machine (Intel Core i5), running on .NET 4.6.1 64-bit:

Test run - total count: 10000000, outer: 1000, inner: 10000
Running fsImperativeTest...
  it took 20 ms - collection count is 0,0,0 - result is 5000
Running fsLinqTest...
  it took 124 ms - collection count is 0,0,0 - result is 5000
Running fsNessosTest...
  it took 56 ms - collection count is 0,0,0 - result is 5000
Running fsOpTest...
  it took 1320 ms - collection count is 661,0,0 - result is 5000
Running fsSeqTest...
  it took 1477 ms - collection count is 661,0,0 - result is 5000
Running fsSeqVariantTest...
  it took 512 ms - collection count is 0,0,0 - result is 5000
Running fsSseTest...
  it took 2 ms - collection count is 0,0,0 - result is 5000
Running csImperativeTest...
  it took 19 ms - collection count is 0,0,0 - result is 5000
Running csLinqTest...
  it took 122 ms - collection count is 0,0,0 - result is 5000
Running csOpTest...
  it took 58 ms - collection count is 0,0,0 - result is 5000
Press any key to continue . . .

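Seq performs worst and also consumes memory (note the generation 0 collection counts). When both the F# and the C# code use LINQ there is, as expected, no real difference. Nessos is a high-performance data pipeline for F# (and C#) and does noticeably better.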

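"Hard-coding" a for loop does better still, and the fastest solution uses SSE through System.Numerics.Vectors. Unfortunately System.Numerics.Vectors does not support %, which makes the comparison somewhat unfair.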
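Because % is not available on vectors, the vectorized test uses a bit mask (n &&& one) in its place. A scalar sketch of why that substitution is valid for the non-negative numbers used here, and where it stops being valid (remainderByMod and remainderByMask are hypothetical names):

```fsharp
// For non-negative integers, the lowest bit equals the remainder modulo 2.
let remainderByMod (n : int) = n % 2
let remainderByMask (n : int) = n &&& 1

// Check the equivalence over the range used by the benchmark's inner loop.
let agree = List.forall (fun n -> remainderByMod n = remainderByMask n) [ 0 .. 9999 ]
printfn "mask and modulo agree on 0..9999: %b" agree

// Negative inputs differ: in .NET the remainder keeps the sign of the dividend.
let modOfMinusOne = -1 % 2      // -1
let maskOfMinusOne = -1 &&& 1   // 1
printfn "for -1: modulo gives %d, mask gives %d" modOfMinusOne maskOfMinusOne
```

So the mask is a drop-in replacement only while the inputs stay non-negative, which holds for this benchmark.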

So the difference is not a language problem but a library problem.