Hints for Project Euler #14 in Haskell?

Question

I'm attempting Project Euler problem 14, and I wanted to know whether I could compute it quickly in Haskell. I tried this naive approach:




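import Data.List
import Data.Function

-- One Collatz step: halve even numbers, map odd n to 3n+1.
collatz n | even n    = n `quot` 2
          | otherwise = 3*n+1

-- The chain starting at n, up to (but not including) 1.
colSeq = takeWhile (/= 1) . (iterate collatz)

main = print $ maximumBy (compare `on` (length . colSeq)) [1..999999]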

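But this takes far too long.

I also tried using a reverse Collatz relation, and keeping the lengths in a map to eliminate redundant calculations, but that didn't work either. I don't want the solution, but does anyone have some mathematical literature or a programming technique that will make this quicker, or do I just have to leave it running overnight?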

Answer 1

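Your program is not as slow as you might think...

First of all, your program runs fine and finishes in under two minutes if you compile with -O2 and increase the stack size (I used +RTS -K100m, but your system might vary):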

$ .\collatz.exe +RTS -K100m -s
  65,565,993,768 bytes allocated in the heap
  16,662,910,752 bytes copied during GC
      77,042,796 bytes maximum residency (1129 sample(s))
       5,199,140 bytes maximum slop
             184 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     124724 colls,     0 par   18.41s   18.19s     0.0001s    0.0032s
  Gen  1      1129 colls,     0 par   16.67s   16.34s     0.0145s    0.1158s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   39.98s  ( 41.17s elapsed)
  GC      time   35.08s  ( 34.52s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   75.06s  ( 75.69s elapsed)

  %GC     time      46.7%  (45.6% elapsed)

  Alloc rate    1,639,790,387 bytes per MUT second

  Productivity  53.3% of total user, 52.8% of total elapsed

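... but that's still slow

A productivity of about 50% means that the GC is using half of the time we spend staring at the screen, waiting for our result. In our case we create a lot of garbage by iterating the sequence for every value.

Improvements

The Collatz sequence is a recursive sequence. Therefore, we should define it as a recursive sequence instead of an iterated one and see what happens.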

colSeq 1 = [1]
colSeq n 
  | even n    = n : colSeq (n `div` 2) 
  | otherwise = n : colSeq (3 * n + 1)

Lists are a fundamental type in Haskell, so GHC should have some nifty optimizations for them (-O2). So let's try this one:

Results

$ .\collatz_rec.exe +RTS -s
  37,491,417,368 bytes allocated in the heap
       4,288,084 bytes copied during GC
          41,860 bytes maximum residency (2 sample(s))
          19,580 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     72068 colls,     0 par    0.22s    0.22s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   32.89s  ( 33.12s elapsed)
  GC      time    0.22s  (  0.22s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   33.11s  ( 33.33s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,139,881,573 bytes per MUT second

  Productivity  99.3% of total user, 98.7% of total elapsed

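Note that we're now up to 99% productivity, in roughly 80% of the MUT time of the original version. With this small change alone, the total runtime dropped from 75 s to 33 s.

Wait, there's more!

There's one thing that's rather strange. Why are we calculating the length of both 1024 and 512? After all, the latter cannot produce a longer Collatz sequence.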
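To make that concrete, here is a small check (only a sketch, reusing the recursive colSeq from above with an explicit Integer type, which is an assumption) showing that the chain of 512 is just the tail of the chain of 1024:

```haskell
import Data.List (isSuffixOf)

-- Recursive Collatz chain, including the start value and the final 1.
colSeq :: Integer -> [Integer]
colSeq 1 = [1]
colSeq n
  | even n    = n : colSeq (n `div` 2)
  | otherwise = n : colSeq (3 * n + 1)

main :: IO ()
main = do
  -- The chain of 512 is contained at the end of the chain of 1024.
  print (colSeq 512 `isSuffixOf` colSeq 1024)          -- True
  -- So the chain of 1024 is exactly one element longer.
  print (length (colSeq 1024), length (colSeq 512))    -- (11,10)
```

Any number that already appeared inside an earlier chain can therefore never be the start of the longest one.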

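Improvements

However, in this case we must see the problem as one big task rather than as a map. We need to keep track of the values we have already calculated, and we want to remove the values we have already visited.

We use Data.Set for this: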

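import qualified Data.Set as S

-- Repeatedly take the smallest remaining start value, record its chain
-- length, and drop every value on that chain from the remaining candidates.
problem_14 :: S.Set Integer -> [(Integer, Integer)]
problem_14 s
  | S.null s  = []
  | otherwise = (c, fromIntegral $ length csq) : problem_14 rest
  where (c, rest') = S.deleteFindMin s
        csq        = colSeq c
        rest       = rest' `S.difference` S.fromList csq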

And we use problem_14 like this:

main = print $ maximumBy (compare `on` snd) $ problem_14 $ S.fromList [1..999999]

Results

$ .\collatz_set.exe +RTS -s
  18,405,282,060 bytes allocated in the heap
   1,645,842,328 bytes copied during GC
      27,446,972 bytes maximum residency (40 sample(s))
         373,056 bytes maximum slop
              79 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     35193 colls,     0 par    2.17s    2.03s     0.0001s    0.0002s
  Gen  1        40 colls,     0 par    0.84s    0.77s     0.0194s    0.0468s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   14.91s  ( 15.17s elapsed)
  GC      time    3.02s  (  2.81s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   17.92s  ( 17.98s elapsed)

  %GC     time      16.8%  (15.6% elapsed)

  Alloc rate    1,234,735,903 bytes per MUT second

  Productivity  83.2% of total user, 82.9% of total elapsed

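We lose some productivity, but that's reasonable. After all, we're now using a Set rather than a list, and 79 MB instead of 1 MB. However, the program now runs in 17 s instead of 34 s, which is only about 25% of the original 75 s.

Using `ST`

Inspiration (C++)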

#include <iostream>
#include <vector>

int main(){
  std::vector<bool> Q(1000000,true);

  unsigned long long max_l = 0, max_c = 1;

  for(unsigned long i = 1; i < Q.size(); ++i){
    if(!Q[i])
      continue;
    unsigned long long c = i, l = 0;
    while(c != 1){
      if(c < Q.size()) Q[c] = false;
      c = c % 2 == 0 ? c / 2 : 3 * c + 1;
      l++;
    }
    if(l > max_l){
      max_l = l;
      max_c = i;
    }
  }
  std::cout << max_c << std::endl;
}

This program runs in 130 ms. Our best version so far needs more than 100 times as long. We can fix that.

Haskell

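import Control.Monad (forM_, when)
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)
import qualified Data.Vector.Unboxed.Mutable as V

problem_14_vector_st :: Int -> (Int, Int)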
problem_14_vector_st limit = 
  runST $ do
    q <- V.replicate (limit+1) True    
    best <- newSTRef (1,1)
    forM_ [1..limit] $ \i -> do
      b <- V.read q i
      when b $ do
        let csq = colSeq $ fromIntegral i
        let l   = fromIntegral $ length csq
        forM_ (map fromIntegral csq) $ \j-> 
          when (j<= limit && j>= 0) $  V.write q j False
        m <- fmap snd $ readSTRef best
        when (l > m) $ writeSTRef best (i,l)
    readSTRef best

Results

$ collatz_vector_st.exe +RTS -s
   2,762,282,216 bytes allocated in the heap
      10,021,016 bytes copied during GC
       1,026,580 bytes maximum residency (2 sample(s))
          21,684 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5286 colls,     0 par    0.02s    0.02s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    3.09s  (  3.08s elapsed)
  GC      time    0.02s  (  0.02s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    3.11s  (  3.11s elapsed)

  %GC     time       0.5%  (0.7% elapsed)

  Alloc rate    892,858,898 bytes per MUT second

  Productivity  99.5% of total user, 99.6% of total elapsed

~3 seconds. Someone else might know more tricks, but that's the most I could squeeze out of Haskell.

Answer 2

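Caching the integer values you've already hit will save you a lot of time. If you start with the number 1234 and find that it takes 273 steps to get to 1, associate the two values: 1234 -> 273

Now if you ever hit 1234 in the middle of another sequence, you don't have to take 273 more steps to find the answer. Just add 273 to your current step count and you know the length of the sequence.

Do this for every number you calculate, even the ones in the middle of a sequence. For example, if you're at 1234 and don't have a value for it yet, take the step (divide by 2), then calculate and cache the value for 617. You cache almost all the important values really quickly this way. There are some really long chains that you'll end up on again and again.

The easiest way to cache all of the values as you go is to write a recursive function. Like this (in pseudocode):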

function collatz(number) {
    if number is 1: return 1
    else if number is in cache: return cached value
    else perform step: newnumber = number / 2 if even, number * 3 + 1 if odd
    steps = collatz(newnumber) + 1 //+1 for the step we just took
    cache steps as the result for number
    return steps
}

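Hopefully Haskell won't have problems with the recursion depths you'll end up with this way. If Haskell doesn't like it, though, you can implement the same thing with an explicit stack; it's just less intuitive.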
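The pseudocode above could be translated to Haskell roughly as follows. This is only a sketch, assuming the cache is a Data.Map that is threaded through the calls explicitly; the names collatzLen and longestBelow are made up for illustration.

```haskell
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Chain length of n (counting n itself and the final 1), together with
-- the cache extended by every value visited on the way.
collatzLen :: M.Map Integer Int -> Integer -> (Int, M.Map Integer Int)
collatzLen cache 1 = (1, cache)
collatzLen cache n =
  case M.lookup n cache of
    Just len -> (len, cache)                      -- cache hit: no more steps
    Nothing  ->
      let next           = if even n then n `quot` 2 else 3 * n + 1
          (len', cache') = collatzLen cache next  -- solve the rest first
          len            = len' + 1               -- +1 for the step just taken
      in (len, M.insert n len cache')

-- Start value below the limit with the longest chain, and that length.
longestBelow :: Integer -> (Integer, Int)
longestBelow limit = fst (foldl' step ((1, 1), M.empty) [2 .. limit - 1])
  where
    step (best@(_, bestLen), cache) n =
      let (len, cache') = collatzLen cache n
      in if len > bestLen then ((n, len), cache') else (best, cache')

main :: IO ()
main = print (longestBelow 1000)
```

The recursion depth is bounded by the length of the longest uncached chain, so it stays small here. Expect noticeable memory use for large limits, since every visited value stays in the map.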

Answer 3

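The primary source of the time and memory problems is that you build the entire Collatz sequences, whereas the task only needs their lengths, and unfortunately laziness doesn't save the day here. A simple solution that computes just the lengths runs in a few seconds: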

simpleCol :: Integer -> Int
simpleCol 1 = 1
simpleCol x | even x = 1 + simpleCol (x `quot` 2)
            | otherwise = 1 + simpleCol (3 * x + 1)

problem14 = maximum $ map simpleCol [1 .. 999999]

It also takes much less memory and doesn't need an enlarged stack:

$> ./simpleCollatz +RTS -s
simpleCollatz +RTS -s
   2,517,321,124 bytes allocated in the heap
         217,468 bytes copied during GC
          41,860 bytes maximum residency (2 sample(s))
          19,580 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      4804 colls,     0 par    0.00s    0.02s     0.0000s    0.0046s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    4.47s  (  4.49s elapsed)
  GC      time    0.00s  (  0.02s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    4.47s  (  4.52s elapsed)

  %GC     time       0.0%  (0.5% elapsed)

  Alloc rate    563,316,615 bytes per MUT second

  Productivity 100.0% of total user, 98.9% of total elapsed
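Note that problem14 as written yields the longest chain length rather than the starting number that produces it. A minimal sketch that tracks both, using a strict accumulator instead of `1 + ...` recursion (the names chainLen and longestStart are hypothetical, not part of the code above):

```haskell
import Data.List (foldl')

-- Chain length of x, counting x itself and the final 1.
chainLen :: Integer -> Int
chainLen = go 1
  where
    go :: Int -> Integer -> Int
    go acc x
      | x == 1    = acc
      -- Forcing acc on each step keeps the counter from piling up thunks.
      | even x    = acc `seq` go (acc + 1) (x `quot` 2)
      | otherwise = acc `seq` go (acc + 1) (3 * x + 1)

-- (longest chain length, start value) over all starts below the limit.
longestStart :: Integer -> (Int, Integer)
longestStart limit = foldl' pick (1, 1) [2 .. limit - 1]
  where
    pick best n =
      let cand = (chainLen n, n)
      in if fst cand > fst best then cand else best

main :: IO ()
main = print (longestStart 1000)
```

For the actual problem the call would be longestStart 1000000; no timing is claimed for this variant.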

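To illustrate the proposed solution of using a cache, there is a nifty technique called memoization. Arguably the easiest way to use it is to install the memoize package: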

import Data.Function.Memoize

memoCol :: Integer -> Int
memoCol = memoFix mc where
    mc _ 1 = 1
    mc f x | even x = 1 + f (x `quot` 2)
           | otherwise = 1 + f (3 * x + 1)

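This brings down both the runtime and the memory usage, but it also leans heavily on the GC to maintain the cached values (note the 616 MB in use and the 37.8% GC time below):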

$> ./memoCollatz +RTS -s
memoCollatz +RTS -s
   1,577,954,668 bytes allocated in the heap
   1,056,591,780 bytes copied during GC
     303,942,300 bytes maximum residency (12 sample(s))
         341,468 bytes maximum slop
             616 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3003 colls,     0 par    1.11s    1.19s     0.0004s    0.0010s
  Gen  1        12 colls,     0 par    3.48s    3.65s     0.3043s    1.7065s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    7.55s  (  7.50s elapsed)
  GC      time    4.59s  (  4.84s elapsed)
  EXIT    time    0.00s  (  0.05s elapsed)
  Total   time   12.14s  ( 12.39s elapsed)

  %GC     time      37.8%  (39.1% elapsed)

  Alloc rate    209,087,160 bytes per MUT second

  Productivity  62.2% of total user, 60.9% of total elapsed

Answer 4

Make sure you use Integer instead of Int, because an Int32 overflow will cause problems with the recursion.

collatz :: Integer -> Integer
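To see why this matters, one can compute the largest value a chain reaches and compare it with the Int32 range. This is only an illustrative sketch; the helper names trajectory, peak and overflowsInt32 are assumptions, and 113383 is used as an example start value.

```haskell
import Data.Int (Int32)

-- Full Collatz chain using arbitrary-precision Integer, so nothing overflows.
trajectory :: Integer -> [Integer]
trajectory 1 = [1]
trajectory n = n : trajectory (if even n then n `quot` 2 else 3 * n + 1)

-- Largest value reached anywhere on the chain.
peak :: Integer -> Integer
peak = maximum . trajectory

-- True if some value on the chain would not fit into a signed 32-bit Int.
overflowsInt32 :: Integer -> Bool
overflowsInt32 n = peak n > fromIntegral (maxBound :: Int32)

main :: IO ()
main = do
  print (peak 27)                -- 9232
  print (overflowsInt32 27)
  print (overflowsInt32 113383)
```

A start value well below one million can climb far above the start itself, which is why a fixed 32-bit type is risky here.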
