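Slowdown from removing useless code (Project Euler 23)

Time: 2015-01-15 21:51:18

Tags: haskell optimization

I was trying to optimize my old code for Project Euler #23 and noticed a strange slowdown when I removed useless comparisons from a function that merges lists.

My code: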

import Data.List
import Debug.Trace

limit = 28123

-- sum of all integers from 1 to n
summe :: Int -> Int
summe n = div (n*(n+1)) 2

-- all divisors of x excluding itself
divisors :: Int -> [Int]
divisors x = l1 ++ [x `div` z | z <- l1, z*z /= x, z /= 1]
               where m = floor $ sqrt $ fromIntegral x
                     l1 = [y | y <- [1..m] , mod x y == 0]

-- list of all abundant numbers
liste :: [Int]
liste = [x | x <- [12..limit] , x < sum (divisors x)]

-- nested list with sums of abundant numbers
sumliste :: [[Int]]
sumliste = [[x+y | x <- takeWhile (<=y) liste, x + y <= limit] | y <- liste]

-- reduced list
rsl :: [[Int]] -> [Int]
rsl (hl:[]) = hl
rsl (hl:l) = mergelists hl (rsl l)

-- build a sorted union of two sorted lists
mergelists :: [Int] -> [Int] -> [Int]
mergelists [] [] = []
mergelists [] b = b
mergelists a [] = a
mergelists as@(a:at) bs@(b:bt)
--  | a == b = a : mergelists at bt
--  | a > b = b : mergelists as bt
--  | a < b = a : mergelists at bs
  | a == b = if a == hl1
               then trace "1" l1
               else a : l1
  | a > b = if b == hl2
              then trace "2" l2
              else b : l2
  | a < b = if a == hl3
              then trace "3" l3
              else a : l3
                where l1 = mergelists at bt
                      hl1 = if null l1 then a + 1 else head l1
                      l2 = mergelists as bt
                      hl2 = head l2
                      l3 = mergelists at bs
                      hl3 = head l3

-- build the sum of target numbers by subtracting sum of nontarget numbers from all numbers
main = print $ (summe limit) - (sum $ rsl sumliste)

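My problem is the function mergelists. Its body contains some useless if clauses (the trace output never appears, so those branches are never taken), and it could be refactored to the three commented-out lines. Doing so, however, increases the execution time from 3.4 s to 5.8 s, which I cannot understand.

Why is the shorter code slower?

1 answer:

Answer 0 (score: 2)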

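As Thomas M. DuBuisson suggested, the problem has to do with a lack of strictness. The following code is a slight modification of the code you commented out. It uses the $! operator to ensure that the mergelists call is evaluated before the list cell is constructed.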

mergelists :: [Int] -> [Int] -> [Int]
mergelists [] [] = []
mergelists [] b  = b
mergelists a  [] = a
mergelists as@(a:at) bs@(b:bt)
  | a == b = (a :) $! mergelists at bt
  | a >  b = (b :) $! mergelists as bt
  | a <  b = (a :) $! mergelists at bs

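The $! operator ensures that whenever the result of (_ :) $! mergelists _ _ is evaluated, mergelists _ _ is evaluated first (to weak head normal form, that is, to its outermost constructor). Because the function is recursive, this means that evaluating the result of mergelists forces the entire merged list.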
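The difference can be observed directly. The following is a minimal sketch, not part of the original answer: mergeLazy and mergeStrict are renamed copies of the two versions, while forcesWithoutError and poisoned are hypothetical helpers introduced only for this illustration. The tail of poisoned raises an error as soon as anything inspects it, so the error shows which version walks the whole list.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Lazy merge: the three commented-out guards from the question.
mergeLazy :: [Int] -> [Int] -> [Int]
mergeLazy [] b = b
mergeLazy a [] = a
mergeLazy as@(a:at) bs@(b:bt)
  | a == b    = a : mergeLazy at bt
  | a >  b    = b : mergeLazy as bt
  | otherwise = a : mergeLazy at bs

-- Strict merge: the same guards, with the recursive call forced by ($!).
mergeStrict :: [Int] -> [Int] -> [Int]
mergeStrict [] b = b
mergeStrict a [] = a
mergeStrict as@(a:at) bs@(b:bt)
  | a == b    = (a :) $! mergeStrict at bt
  | a >  b    = (b :) $! mergeStrict as bt
  | otherwise = (a :) $! mergeStrict at bs

-- Hypothetical input: two good elements, then a tail that fails when inspected.
poisoned :: [Int]
poisoned = 1 : 3 : error "tail was forced"

-- Hypothetical helper: evaluate only the outermost constructor of a list
-- and report whether that succeeded without raising an ErrorCall.
forcesWithoutError :: [Int] -> IO Bool
forcesWithoutError xs = do
  r <- try (evaluate xs) :: IO (Either ErrorCall [Int])
  return (either (const False) (const True) r)
```

In GHCi, forcesWithoutError (mergeLazy poisoned [2, 4]) yields True, because only the first cons cell is built and the rest stays a thunk. forcesWithoutError (mergeStrict poisoned [2, 4]) yields False, because producing even the first cell requires the recursion to run all the way down to the failing tail.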

In the slow version,

mergelists as@(a:at) bs@(b:bt)
  | a == b = a : mergelists at bt
  | a >  b = b : mergelists as bt
  | a <  b = a : mergelists at bs

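you can inspect the first element of the result without evaluating the rest of the list. The call to mergelists in the tail of the list is stored as an unevaluated thunk. This has several implications:

  • If you only need a small part of the merged list, or if the inputs are infinitely long, this is a good thing.
  • On the other hand, if the lists are not that large and/or you eventually need all of the elements, the thunks add extra overhead. It also means the garbage collector cannot free any of the input, because the thunks hold references to it.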
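The first point can be sketched as follows. This is an illustration rather than code from the original post: mergeLazy is a renamed copy of the lazy version, and firstFive and smallMultiples are hypothetical names. Both inputs are infinite, yet a finite prefix of the merge is still available.

```haskell
-- Lazy merge: each recursive call stays an unevaluated thunk until demanded.
mergeLazy :: [Int] -> [Int] -> [Int]
mergeLazy [] b = b
mergeLazy a [] = a
mergeLazy as@(a:at) bs@(b:bt)
  | a == b    = a : mergeLazy at bt
  | a >  b    = b : mergeLazy as bt
  | otherwise = a : mergeLazy at bs

-- Merge the odd and the even positive integers, keeping only five elements.
firstFive :: [Int]
firstFive = take 5 (mergeLazy [1, 3 ..] [2, 4 ..])
-- firstFive == [1,2,3,4,5]

-- Sorted union of the multiples of 2 and of 3, cut off below 20.
smallMultiples :: [Int]
smallMultiples = takeWhile (< 20) (mergeLazy [2, 4 ..] [3, 6 ..])
-- smallMultiples == [2,3,4,6,8,9,10,12,14,15,16,18]
```

With the $! version the same expressions would never produce a result, because the first cons cell is only returned after the whole (here endless) merge has been evaluated.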

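I don't fully understand why it is slower for your particular problem, though. Perhaps someone more experienced can shed some light on this.

I have noticed that at -O0 the "slow version" is actually the fastest of the three approaches, so I suspect that GHC is able to take advantage of the strictness and generate better-optimized code.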