Why is a for loop faster than Seq.find?

Time: 2016-02-11 18:39:39

Tags: arrays algorithm f#

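I have an array sums that holds all possible sums of two values of a function f. This function takes an integer (say between 1 and 200, but the same applies for 1 to 10000) and converts it to a double. I want to store sums as an array because I have not yet figured out how to do the algorithm I need without a loop.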

Here is the code I use to generate sums:

open System

let f n k = exp (double(k)/double(n)) - 1.0


let n = 200
let maxLimit = int(Math.Round(float(n)*1.5))

let FunctionValues = [|1..maxLimit|] |> Array.map (fun k -> f n k)

let sums = FunctionValues |> Array.map (fun i -> Array.map (fun j -> j + i) FunctionValues) |> Array.concat |> Array.sort

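For certain elements of sums, I want to find the integers that, when passed to f and added together, equal that element. I could store the integers alongside the values in sums, but I found that this uses far too much memory.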

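I currently have two algorithms. Algorithm 1 uses a simple loop and mutable ints to store the values I care about. It should not be very efficient, because there is no break statement to stop once all the integers have been found. I tried to implement a more functional Algorithm 2, but I found it to be slower (about 10% slower, or 4200 ms vs 4600 ms with n = 10000), even though Seq is lazy. Why is that?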

Algorithm 1:

let mutable a = 0
let mutable b = 0
let mutable c = 0
let mutable d = 0
for i in 1..maxLimit do
    for j in i..maxLimit do
        if sums.[bestI] = f n i + f n j then
            a <- i
            b <- j
        if sums.[bestMid] = f n i + f n j then
            c <- i
            d <- j

Algorithm 2:

let findNM x = 
    let seq = {1..maxLimit} |> Seq.map (fun k -> (f n k, k))
    let get2nd3rd (a, b, c) = (b, c)
    seq |> Seq.map (fun (i, n) -> Seq.map (fun (j, m) -> (j + i, n, m) ) seq) 
        |> Seq.concat |> Seq.find (fun (i, n, m) -> i = x)
        |>  get2nd3rd

let digitsBestI = findNM sums.[bestI]
let digitsBestMid = findNM sums.[bestMid]

let a = fst digitsBestI
let b = snd digitsBestI
let c = fst digitsBestMid
let d = snd digitsBestMid

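Edit: Note that the array sums has length maxLimit*maxLimit, not length n. bestI and bestMid are indices between 0 and maxLimit*maxLimit. For the purposes of this question they can be any number in that range. Their specific values are not particularly relevant.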

1 answer:

Answer 0: (score: 4)

I extended the OP's code a bit in order to profile it:

open System

let f n k   = exp (double(k)/double(n)) - 1.0

let outer   = 200
let n       = 200
let maxLimit= int(Math.Round(float(n)*1.5))

let FunctionValues = [|1..maxLimit|] |> Array.map (fun k -> f n k)

let random = System.Random 19740531

let sums = FunctionValues |> Array.map (fun i -> Array.map (fun j -> j + i) FunctionValues) |> Array.concat |> Array.sort

let bests = 
  [| for i in [1..outer] -> (random.Next (n, maxLimit*maxLimit), random.Next (n, maxLimit*maxLimit))|]

let stopWatch = 
  let sw = System.Diagnostics.Stopwatch ()
  sw.Start ()
  sw

let timeIt (name : string) (a : int*int -> 'T) : unit = 
  let t = stopWatch.ElapsedMilliseconds
  let v = a (bests.[0])
  for i = 1 to (outer - 1) do
    a bests.[i] |> ignore
  let d = stopWatch.ElapsedMilliseconds - t
  printfn "%s, elapsed %d ms, result %A" name d v

let algo1 (bestI, bestMid) =
  let mutable a = 0
  let mutable b = 0
  let mutable c = 0
  let mutable d = 0
  for i in 1..maxLimit do
    for j in i..maxLimit do
      if sums.[bestI] = f n i + f n j then
        a <- i
        b <- j
      if sums.[bestMid] = f n i + f n j then
        c <- i
        d <- j

  a,b,c,d

let algo2 (bestI, bestMid) =
  let findNM x = 
    let seq = {1..maxLimit} |> Seq.map (fun k -> (f n k, k))
    let get2nd3rd (a, b, c) = (b, c)
    seq |> Seq.map (fun (i, n) -> Seq.map (fun (j, m) -> (j + i, n, m) ) seq) 
        |> Seq.concat |> Seq.find (fun (i, n, m) -> i = x)
        |> get2nd3rd

  let digitsBestI = findNM sums.[bestI]
  let digitsBestMid = findNM sums.[bestMid]

  let a = fst digitsBestI
  let b = snd digitsBestI
  let c = fst digitsBestMid
  let d = snd digitsBestMid

  a,b,c,d

let algo3 (bestI, bestMid) =
  let rec find best i j = 
    if best = f n i + f n j then i, j
    elif i = maxLimit && j = maxLimit then 0, 0
    elif j = maxLimit then find best (i + 1) 1
    else find best i (j + 1)
  let a, b = find sums.[bestI] 1 1
  let c, d = find sums.[bestMid] 1 1
  a, b, c, d

let algo4 (bestI, bestMid) =
  let rec findI bestI mid i j = 
    if bestI = f n i + f n j then 
      let x, y = mid
      i, j, x, y
    elif i = maxLimit && j = maxLimit then 0, 0, 0, 0
    elif j = maxLimit then findI bestI mid (i + 1) 1
    else findI bestI mid i (j + 1)

  let rec findMid ii bestMid i j = 
    if bestMid = f n i + f n j then 
      let x, y = ii
      x, y, i, j
    elif i = maxLimit && j = maxLimit then 0, 0, 0, 0
    elif j = maxLimit then findMid ii bestMid (i + 1) 1
    else findMid ii bestMid i (j + 1)

  let rec find bestI bestMid i j = 
    if bestI = f n i + f n j then findMid (i, j) bestMid i j
    elif bestMid = f n i + f n j then findI bestI (i, j) i j
    elif i = maxLimit && j = maxLimit then 0, 0, 0, 0
    elif j = maxLimit then find bestI bestMid (i + 1) 1
    else find bestI bestMid i (j + 1)

  find sums.[bestI] sums.[bestMid] 1 1

[<EntryPoint>]
let main argv =

  timeIt "algo1" algo1
  timeIt "algo2" algo2
  timeIt "algo3" algo3
  timeIt "algo4" algo4

  0

Test results on my machine:

algo1, elapsed 438 ms, result (162, 268, 13, 135)
algo2, elapsed 1012 ms, result (162, 268, 13, 135)
algo3, elapsed 348 ms, result (162, 268, 13, 135)
algo4, elapsed 322 ms, result (162, 268, 13, 135)

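algo1 uses the naive for loop implementation. algo2 uses a more refined algorithm that relies on Seq.find. I will describe algo3 and algo4 later.

The OP wondered why the naive algo1 performs better even though it does more work than algo2, which is based on the lazy Seq (essentially an IEnumerable<>).

The answer is that the Seq abstraction introduces overhead and prevents useful optimizations.

I usually look at the generated IL code to understand what is going on (there are many good .NET decompilers, such as ILSpy).

Let's look at algo1 (decompiled to C#):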

// Program
public static Tuple<int, int, int, int> algo1(int bestI, int bestMid)
{
  int a = 0;
  int b = 0;
  int c = 0;
  int d = 0;
  int i = 1;
  int maxLimit = Program.maxLimit;
  if (maxLimit >= i)
  {
    do
    {
      int j = i;
      int maxLimit2 = Program.maxLimit;
      if (maxLimit2 >= j)
      {
        do
        {
          if (Program.sums[bestI] == Math.Exp((double)i / (double)200) - 1.0 + (Math.Exp((double)j / (double)200) - 1.0))
          {
            a = i;
            b = j;
          }
          if (Program.sums[bestMid] == Math.Exp((double)i / (double)200) - 1.0 + (Math.Exp((double)j / (double)200) - 1.0))
          {
            c = i;
            d = j;
          }
          j++;
        }
        while (j != maxLimit2 + 1);
      }
      i++;
    }
    while (i != maxLimit + 1);
  }
  return new Tuple<int, int, int, int>(a, b, c, d);
}
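algo1 has been expanded into an efficient while loop. In addition, f is inlined. The JITter can easily create efficient machine code from this.

Decompiling the full structure of algo2 is too much for this post, so I focus on findNM: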

internal static Tuple<int, int> findNM@48(double x)
{
  IEnumerable<Tuple<double, int>> seq = SeqModule.Map<int, Tuple<double, int>>(new Program.seq@49(), Operators.OperatorIntrinsics.RangeInt32(1, 1, Program.maxLimit));
  FSharpTypeFunc get2nd3rd = new Program.get2nd3rd@50-1();
  Tuple<double, int, int> tupledArg = SeqModule.Find<Tuple<double, int, int>>(new Program.findNM@52-1(x), SeqModule.Concat<IEnumerable<Tuple<double, int, int>>, Tuple<double, int, int>>(SeqModule.Map<Tuple<double, int>, IEnumerable<Tuple<double, int, int>>>(new Program.findNM@51-2(seq), seq)));
  FSharpFunc<Tuple<double, int, int>, Tuple<int, int>> fSharpFunc = (FSharpFunc<Tuple<double, int, int>, Tuple<int, int>>)((FSharpTypeFunc)((FSharpTypeFunc)get2nd3rd.Specialize<double>()).Specialize<int>()).Specialize<int>();
  return Program.get2nd3rd@50<double, int, int>(tupledArg);
}

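We see that it requires creating multiple objects that implement IEnumerable<>, as well as function objects that are passed to higher-order functions such as Seq.find. While the JITter could in principle inline the loop, it most likely will not, because of time and memory constraints. This means each call to a function object is a virtual call, and virtual calls are quite expensive (tip: check the machine code). Because a virtual call could do anything, it also prevents optimizations such as the use of SIMD instructions.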
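To make the contrast concrete, here is a minimal sketch of the same search written both ways. The names findWithSeq and findWithLoop and the reduced values of n and maxLimit are assumptions made for illustration only; they are not part of the OP's code or of the benchmark above.

```fsharp
// Hypothetical small-scale version of the search; n and maxLimit are shrunk for illustration.
let f n k = exp (double k / double n) - 1.0

let n = 20
let maxLimit = 30

// Seq-based search: each Seq stage is an IEnumerable<> object, and each lambda
// is compiled to a function object that is invoked through a virtual call.
let findWithSeq (target : double) =
  seq { 1 .. maxLimit }
  |> Seq.collect (fun i -> seq { 1 .. maxLimit } |> Seq.map (fun j -> i, j))
  |> Seq.tryFind (fun (i, j) -> f n i + f n j = target)

// Loop-based search: plain integer counters over the same (i, j) order.
// It keeps only the first match, but it cannot stop iterating early.
let findWithLoop (target : double) =
  let mutable result = None
  for i in 1 .. maxLimit do
    for j in 1 .. maxLimit do
      if Option.isNone result && f n i + f n j = target then
        result <- Some (i, j)
  result

let target = f n 3 + f n 7
printfn "%A %A" (findWithSeq target) (findWithLoop target)
```

Both functions visit the pairs in the same order, so they return the same first match. The difference lies only in how much machinery sits between the loop counter and the comparison.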

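The OP noted that F# loop expressions lack break/continue constructs, which are useful when writing efficient for loops. However, F# does support them implicitly: if you write a tail-recursive function, F# unrolls it into an efficient loop that uses break/continue to exit early.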
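As a minimal sketch of that pattern, the function below searches for the first k whose f value reaches a threshold. The name firstAtLeast and its parameters are hypothetical and chosen only for this illustration.

```fsharp
let f n k = exp (double k / double n) - 1.0

// Hypothetical helper: first k in 1..limit with f 200 k >= threshold, or 0 if there is none.
let firstAtLeast (limit : int) (threshold : double) =
  let rec loop k =
    if k > limit then 0                   // search exhausted: the loop ends normally
    elif f 200 k >= threshold then k      // match found: plays the role of `break`
    else loop (k + 1)                     // tail call: plays the role of `continue`
  loop 1

printfn "%d" (firstAtLeast 300 1.0)
```

Because the recursive call is in tail position, the compiler can turn loop into a plain loop, so no stack frames accumulate however large limit is.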

algo3 is an example of implementing algo2 using tail recursion. The decompiled code looks like this:

internal static Tuple<int, int> find@66(double best, int i, int j)
{
  while (best != Math.Exp((double)i / (double)200) - 1.0 + (Math.Exp((double)j / (double)200) - 1.0))
  {
    if (i == Program.maxLimit && j == Program.maxLimit)
    {
      return new Tuple<int, int>(0, 0);
    }
    if (j == Program.maxLimit)
    {
      double arg_6F_0 = best;
      int arg_6D_0 = i + 1;
      j = 1;
      i = arg_6D_0;
      best = arg_6F_0;
    }
    else
    {
      double arg_7F_0 = best;
      int arg_7D_0 = i;
      j++;
      i = arg_7D_0;
      best = arg_7F_0;
    }
  }
  return new Tuple<int, int>(i, j);
}

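This allows us to write idiomatic functional code and get very good performance while avoiding stack overflows.

Before I realized how well tail recursion is implemented in F#, I tried to write efficient while loops with mutable logic in the while test expression. For the sake of humanity, that code has now been scrapped.

algo4 is an optimized version in that it iterates over the candidate pairs only once for both bestMid and bestI, much like algo1, but algo4 exits early if it can.

Hope this helps.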