Question

I have the following functions:

millionsOfCombinations = [[a, b, c, d] | 
  a <- filter (...some filter...) someListOfAs, 
  b <- (...some other filter...) someListOfBs, 
  c <- someListOfCs, d <- someListOfDs]

aLotOfCombinationsOfCombinations = [[comb1, comb2, comb3] | 
  comb1 <- millionsOfCombinations, 
  comb2 <- millionsOfCombinations,
  comb3 <- someList,
  ...around 10 function calls to find if
    [comb1, comb2, comb3] is actually useful]

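Evaluating millionsOfCombinations takes 40 seconds on a very fast workstation. Evaluating aLotOfCombinationsOfCombinations !! 0 took 2 days :-(

How can I speed this code up? So far I have had two ideas. The first is to use a profiler. I tried running myapp +RTS -sstderr after compiling with GHC, but I got a blank screen and I do not want to wait days for it to finish.

The second idea is to cache millionsOfCombinations somehow. Do I understand correctly that millionsOfCombinations is evaluated again for every value in aLotOfCombinationsOfCombinations? If so, how can I cache the result? Obviously I have only just started learning Haskell. I know there is a way to do call caching with a monad, but I do not understand those things yet.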

Answer 1

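Use the -fforce-recomp, -O2 and -fllvm flags

If you aren't already, be sure to use the flags above. I normally wouldn't mention it, but I've recently seen several questions from people who didn't know that aggressive optimization is not on by default.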

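Profile your code

The -sstderr flag isn't really profiling. When people say profiling, they usually mean heap profiling or time profiling via the -prof and -auto-all flags (recent GHC releases spell the latter -fprof-auto).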
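As a rough illustration of what time profiling can look at, here is a minimal sketch with manually placed cost centres. The functions isUseful and countUseful are hypothetical stand-ins for the question's "around 10 function calls", not the asker's code; the SCC pragmas simply give each one a named entry in the profile.

```haskell
-- Hypothetical stand-in for the expensive "is this combination useful?" check.
-- The SCC pragma gives the expression its own named cost centre.
isUseful :: Int -> Int -> Int -> Bool
isUseful a b c = {-# SCC "isUseful" #-} (a * a + b * b == c * c)

-- Counts the triples up to n that pass the check. Kept small on purpose,
-- so a profiled run finishes in seconds rather than days.
countUseful :: Int -> Int
countUseful n = {-# SCC "countUseful" #-}
  length [ () | a <- [1 .. n], b <- [a .. n], c <- [b .. n], isUseful a b c ]
```

Assuming the file gets a main such as main = print (countUseful 300), building it with ghc -O2 -prof -fprof-auto -rtsopts and running the program with +RTS -p writes a .prof report with time and allocation per cost centre. Profiling a scaled-down input like this avoids waiting for the full run.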

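Avoid expensive primitives

If you need the whole list in memory (i.e. it won't be optimized away), consider unboxed vectors. If Int will do in place of Integer, consider that too (but Integer is a sensible default when you don't know!). Use the worker/wrapper transformation where appropriate. If you lean heavily on Data.Map, try Data.HashMap from the unordered-containers library. This list could go on, but since you don't yet have an intuition for where your computation time goes, profiling should come first!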
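To make two of these suggestions concrete, here is a minimal sketch of one simple reading of the worker/wrapper idea, using Int rather than Integer. The function sumOfSquares is a hypothetical example, and the choice of Int assumes the values fit in a machine word.

```haskell
-- Wrapper: keeps the simple list-based interface that callers see.
sumOfSquares :: [Int] -> Int
sumOfSquares = go 0
  where
    -- Worker: a tail-recursive loop with an accumulator. Forcing the
    -- accumulator with seq at each step keeps it evaluated, so no chain
    -- of unevaluated thunks builds up while the list is consumed.
    go :: Int -> [Int] -> Int
    go acc []       = acc
    go acc (x : xs) =
      let acc' = acc + x * x
      in acc' `seq` go acc' xs
```

In GHCi, sumOfSquares [1, 2, 3] evaluates to 14. Whether a rewrite like this pays off in the real program is something only a profile can tell.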

Answer 2

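I don't think there is a way. Note that the time needed to generate the result grows with every list you add to the comprehension. You end up with about 1000000³ combinations to check, which really does take a lot of time. Caching the list is possible, but it is unlikely to change anything, because new elements can be generated almost instantly. The only way out is to change the algorithm.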
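A back-of-the-envelope calculation shows why caching alone cannot help. The figure of 10^9 checks per second is an assumption chosen to be generous, not a measurement of the asker's code.

```haskell
-- Roughly 1000000^3 candidate combinations, as estimated above.
candidates :: Integer
candidates = (10 ^ 6) ^ 3

-- Assumption: one billion usefulness checks per second (optimistic).
checksPerSecond :: Integer
checksPerSecond = 10 ^ 9

-- Total running time in whole seconds at that rate.
secondsNeeded :: Integer
secondsNeeded = candidates `div` checksPerSecond

-- The same figure in whole years of 365 days.
yearsNeeded :: Integer
yearsNeeded = secondsNeeded `div` (60 * 60 * 24 * 365)
```

Under these assumptions yearsNeeded comes out at 31, so even a very fast check leaves the full search far out of reach.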

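If millionsOfCombinations is a constant (rather than a function that takes arguments), it is cached automatically. Otherwise, use a where clause to turn it into a constant: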

aLotOfCombinationsOfCombinations = [[comb1, comb2, comb3] | 
  comb1 <- millionsOfCombinations, 
  comb2 <- millionsOfCombinations,
  comb3 <- someList,
  ...around 10 function calls to find if
    [comb1, comb2, comb3] is actually useful] where

  millionsOfCombinations = makeCombination xyz
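The same pattern in a small, complete form might look like the sketch below. Here makeCombination and the sum-based filter are hypothetical stand-ins for the asker's functions; the point is only that both generators draw from the single list bound in the where clause.

```haskell
-- Hypothetical stand-in for the function that builds the combinations.
makeCombination :: Int -> [[Int]]
makeCombination n = [ [a, b] | a <- [1 .. n], even a, b <- [1 .. n], odd b ]

-- Pairs of combinations that pass a (hypothetical) usefulness test.
usefulPairs :: Int -> [([Int], [Int])]
usefulPairs n =
  [ (comb1, comb2)
  | comb1 <- combos
  , comb2 <- combos
  , sum comb1 == sum comb2
  ]
  where
    -- Bound once per call to usefulPairs. Both generators refer to this
    -- one list value, so it is built at most once and then re-traversed.
    combos = makeCombination n
```

The trade-off is memory: because the list is shared, it stays alive in full for as long as the comprehension is still being consumed.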

Haskell - simple way to cache a function call
