Question

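I'm fiddling around with F#, and while using the FSI REPL I noticed a huge efficiency difference between two slightly different implementations of my naive beginner's Sieve of Eratosthenes. The first one has an extra if: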

let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current::pList)
    | 3 -> sieve max (current + 2) (current::pList)
    | n when n < max ->
        if (n % 5 = 0) || (n % 3 = 0) then
            sieve max (current + 2) (current::pList)
        else if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
            sieve max (current + 2) (pList)
        else
            sieve max (current + 2) (current::pList)
    | n when n >= max
        -> pList
    | _
        ->  printfn "Error: list length: %A, current: %A" (List.length pList) current
            [-1]

And the second one does not:

let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current::pList)
    | 3 -> sieve max (current + 2) (current::pList)
    | n when n < max ->
        if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
            sieve max (current + 2) (pList)
        else
            sieve max (current + 2) (current::pList)
    | n when n >= max
        -> pList
    | _
        ->  printfn "Error: list length: %A, current: %A" (List.length pList) current
            [-1]

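The one with the extra if branch is actually slower, even though it looks like it should be faster. I then timed both implementations in the REPL using the following: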

#time
sieve 200000 2 []
#time

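On my machine, the implementation with the extra if statement took about 2 minutes, while the one without it took about 1 minute, every time I ran them. How is this possible? Adding an if statement that handles multiples of 3 or 5 makes it slower than just mapping over the entire list and then checking whether any number in the list of primes is a divisor. How? Is F# simply optimized for processing lists?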

Answer 1

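The extra if in the first sieve, presumably meant as a shortcut, actually changes the behavior. Instead of dropping everything divisible by 3 and 5, it adds those numbers to the list. This is easy to see if you compare the outputs: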

1st sieve: [19; 17; 15; 13; 11; 9; 7; 5; 3; 2]
2nd sieve: [19; 17; 13; 11; 7; 5; 3; 2]

I assume what you wanted is something like this:

if (n % 5 = 0) || (n % 3 = 0) then
    sieve max (current + 2) (pList)

However, in that case it will not include 5 (obviously). So the correct code is

if (n <> 5 && n % 5 = 0) || (n <> 3 && n % 3 = 0) then
    sieve max (current + 2) (pList)

Check the performance of the code above; it should be fine.
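For reference, here is a sketch of the whole function with that condition dropped in. It keeps the question's logic otherwise; writing `elif` and collapsing the last two match cases into a single wildcard are my own simplifications, so treat it as an illustration rather than a tuned implementation:

```fsharp
// Sketch only: the question's sieve with the corrected shortcut condition.
let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current :: pList)
    | 3 -> sieve max (current + 2) (current :: pList)
    | n when n < max ->
        if (n <> 5 && n % 5 = 0) || (n <> 3 && n % 3 = 0) then
            // Multiple of 3 or 5 (but not 3 or 5 itself): skip it.
            sieve max (current + 2) pList
        elif pList |> List.map (fun p -> n % p = 0) |> List.contains true then
            // Divisible by a prime found so far: skip it.
            sieve max (current + 2) pList
        else
            // No known prime divides n: keep it.
            sieve max (current + 2) (current :: pList)
    | _ -> pList // Assumption: everything else means current >= max, so stop.

let primes = sieve 20 2 []
// primes = [19; 17; 13; 11; 7; 5; 3; 2]
```

With this change the output for a limit of 20 matches the 2nd sieve shown above.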

Answer 2

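The extra if performs extra computation but does not break the flow of execution, which happily continues on to the second if. So in effect, you have only added some useless computation to your function. No wonder it takes longer now!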

You must be thinking of something like this:

if (a)
    return b;
if (x)
    return y;
else 
    return z;

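Well, this doesn't work in F# the way it does in C# or Java, or whatever language you're thinking in. F# has no "early return". There are no "statements": everything is an expression, and everything has a result.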

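Adding useless computation like this actually produces a warning. If you pay attention to the warnings, you should notice one saying "this value is discarded" or something along those lines. The compiler is trying to help you by pointing out the useless function call.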

To fix this, replace the second if with elif:

if (n % 5 = 0) || (n % 3 = 0) then
    sieve max (current + 2) (current::pList)
elif (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
    sieve max (current + 2) (pList)
else
    sieve max (current + 2) (current::pList)

This makes the second branch execute only when the first one fails.
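To make the "everything is an expression" point concrete, here is a tiny sketch. The `classify` function is a made-up example, not something from the question; the whole if/elif/else chain evaluates to a single value, and exactly one branch is picked:

```fsharp
// Hypothetical helper, only to show that if/elif/else is one expression.
let classify n =
    if n % 3 = 0 then "multiple of 3"
    elif n % 5 = 0 then "multiple of 5"   // reached only when the first test fails
    else "neither"

let results = [9; 10; 15; 7] |> List.map classify
// results = ["multiple of 3"; "multiple of 5"; "multiple of 3"; "neither"]
```

Note that 15 lands in the first branch: once a condition matches, the later ones are never evaluated.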

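P.S. Come to think of it, an if without an else shouldn't even compile, because its result type can't be determined. I'm not sure what's going on there.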

P.P.S. List.map f |> List.contains true is better expressed as List.exists f. It is both shorter and more efficient.
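A minimal sketch of that last point, with the two divisor checks side by side. The function names are mine, chosen for illustration:

```fsharp
// Builds a full intermediate list of booleans, then searches it.
let hasDivisorMapContains primes n =
    primes |> List.map (fun p -> n % p = 0) |> List.contains true

// Stops at the first matching element and allocates no intermediate list.
let hasDivisorExists primes n =
    primes |> List.exists (fun p -> n % p = 0)

let sameAnswer =
    [5; 7; 9; 25]
    |> List.forall (fun n -> hasDivisorMapContains [5; 3; 2] n = hasDivisorExists [5; 3; 2] n)
// sameAnswer = true
```

Both return the same result; the second simply does less work per candidate.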

Answer 3

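Of course, lists are not necessarily efficient. I once wrote a function that creates an array of booleans in which every prime index is true and every non-prime index is false: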

let sieveArray limit =
    let a = Array.create limit true
    let rec setArray l h s x =
        if l <= h then
            a.[l] <- x
            setArray (l + s) h s x
    a.[0] <- false; a.[1] <- false
    for i = 0 to a.Length - 1 do
        if a.[i]
        then setArray (i + i) (a.Length - 1) i false
    a

To get a list of the actual primes, you can simply map in the indices and filter the resulting array:

sieveArray limit
|> Seq.mapi (fun i x -> i, x)
|> Seq.filter snd
|> Seq.map fst
|> Seq.toList
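Put together, the two snippets above could be wrapped up roughly like this. The `primesBelow` name is my own, and the sketch assumes `limit` is at least 2, since the array is indexed at 0 and 1 unconditionally:

```fsharp
// Same array-based sieve as above; index i is true exactly when i is prime.
let sieveArray limit =
    let a = Array.create limit true
    // Set a.[l], a.[l + s], a.[l + 2s], ... up to index h to the value x.
    let rec setArray l h s x =
        if l <= h then
            a.[l] <- x
            setArray (l + s) h s x
    a.[0] <- false; a.[1] <- false   // assumes limit >= 2
    for i = 0 to a.Length - 1 do
        if a.[i] then setArray (i + i) (a.Length - 1) i false
    a

// Hypothetical wrapper: primes strictly below limit, in ascending order.
let primesBelow limit =
    sieveArray limit
    |> Seq.mapi (fun i x -> i, x)
    |> Seq.filter snd
    |> Seq.map fst
    |> Seq.toList

let small = primesBelow 20
// small = [2; 3; 5; 7; 11; 13; 17; 19]
```

Because the array has `limit` elements indexed from 0, the result covers primes strictly below `limit`.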

F# Sieve of Eratosthenes: .NET optimization
