Question

I would like to learn about writing memory-efficient Haskell code. One thing I ran into is that there is no easy way (that I could find) to make Python-style list generators/iterators.

A small example:

Find the sum of the integers from 1 to 100000000 without using the closed-form formula.

Python can do this quickly and with minimal memory use: `sum(xrange(100000000))`. In Haskell, the analogue would be `sum [1..100000000]`. However, this uses up a lot of memory. I thought using `foldl` instead would be fine, but even that uses a lot of memory and is slower than Python. Any suggestions?

Answer 1

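TL;DR - I think the culprit in this case is GHC defaulting to `Integer`.

Admittedly I know little about Python, but my first guess is that Python switches to a "bigint" only when necessary, so all of its calculations here are done with 64-bit integers, like `Int` on my machine.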

A first check with

$> ghci
GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
Prelude> maxBound :: Int
9223372036854775807

shows that the result of the sum (5000000050000000) is smaller than that number, so we do not have to worry about `Int` overflow.
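The same comparison can be written down as a small program. This is only a sketch: the names `expectedSum` and `fitsInInt` are made up for illustration, and the closed-form formula is used purely as a sanity check on the expected value, not as the summation under discussion.

```haskell
-- Sanity check: does the expected result fit into a machine Int?
-- The arithmetic is done in Integer so the check itself cannot overflow.
expectedSum :: Integer
expectedSum = n * (n + 1) `div` 2
  where
    n = 100000000

-- True when Int is 64 bits wide (an assumption about the platform).
fitsInInt :: Bool
fitsInInt = expectedSum <= toInteger (maxBound :: Int)

main :: IO ()
main = do
  print expectedSum -- 5000000050000000
  print fitsInInt
```

On a platform where `Int` is only 32 bits wide, `fitsInInt` would be `False` and the `Int` variant of the sum would overflow.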

I guess your example programs look roughly like this

`sum.py`

print(sum(xrange(100000000)))

`sum.hs`

main :: IO ()
main = print $ sum [1..100000000]

To make things explicit, I added the type annotation (100000000 :: Integer) and compiled with

$ > stack build --ghc-options="-O2 -with-rtsopts=-sstderr"

and then ran your example,

$ > stack exec -- time sum
5000000050000000
   3,200,051,872 bytes allocated in the heap
         208,896 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6102 colls,     0 par    0.013s   0.012s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    1.725s  (  1.724s elapsed)
  GC      time    0.013s  (  0.012s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.739s  (  1.736s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,855,603,449 bytes per MUT second

  Productivity  99.3% of total user, 99.4% of total elapsed

1.72user 0.00system 0:01.73elapsed 99%CPU (0avgtext+0avgdata 4112maxresident)k

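which does indeed reproduce the ~3 GB of memory consumption (3,200,051,872 bytes allocated in the heap over the whole run).

Changing the annotation to (100000000 :: Int) changes the behavior drastically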

$ > stack build
$ > stack exec -- time sum
5000000050000000
          51,872 bytes allocated in the heap
           3,408 bytes copied during GC
          44,312 bytes maximum residency (1 sample(s))
          17,128 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.034s  (  0.034s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.036s  (  0.035s elapsed)

  %GC     time       0.2%  (0.2% elapsed)

  Alloc rate    1,514,680 bytes per MUT second

  Productivity  99.4% of total user, 102.3% of total elapsed

0.03user 0.00system 0:00.03elapsed 91%CPU (0avgtext+0avgdata 3496maxresident)k
0inputs+0outputs (0major+176minor)pagefaults 0swaps
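For reference, the two annotated variants compared above might look roughly like the sketch below. The function names `sumInteger`, `sumInt` and `sumIntStrict` are hypothetical and do not come from the original programs. `sumIntStrict` uses `foldl'` from `Data.List`, which is a swapped-in alternative that the measurements above do not cover; it forces the accumulator at every step instead of relying on the optimizer to do so.

```haskell
import Data.List (foldl')

-- The computation at Integer: arbitrary precision, every value is boxed.
sumInteger :: Integer -> Integer
sumInteger n = sum [1 .. n]

-- The same computation at Int: fixed-width machine integers.
sumInt :: Int -> Int
sumInt n = sum [1 .. n]

-- Explicitly strict left fold (not part of the measurements above).
sumIntStrict :: Int -> Int
sumIntStrict n = foldl' (+) 0 [1 .. n]

main :: IO ()
main = do
  print (sumInt 100000000)
  print (sumIntStrict 100000000)
```

Only the type changes between `sumInteger` and `sumInt`; the list expression is identical, which is why the annotation alone is enough to switch between the two behaviors.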

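For the interested

The behavior of the Haskell version does not change much if you use libraries such as conduit or vector (both boxed and unboxed).

Example programs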

sumC.hs

import Data.Conduit
import qualified Data.Conduit.List as CL

main :: IO ()
main = do res <- CL.enumFromTo 1 100000000 $$ CL.fold (+) (0 :: Int)
          print res

sumV.hs

import qualified Data.Vector.Unboxed as V
{-import qualified Data.Vector as V-}

main :: IO ()
main = print $ V.sum $ V.enumFromTo (1::Int) 100000000

Interestingly, the version using

main = print $ V.sum $ V.enumFromN (1::Int) 100000000

is worse than the above, even though the documentation says otherwise.

enumFromN :: (Unbox a, Num a) => a -> Int -> Vector a
O(n) Yield a vector of the given length containing the values x, x+1 etc. This operation is usually more efficient than enumFromTo.

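Update

@Carsten's comment made me curious, so I took a closer look at the sources for Integer; well, integer-simple to be precise, because other versions of Integer exist: integer-gmp and integer-gmp2, which use libgmp.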

data Integer = Positive !Positive | Negative !Positive | Naught

-------------------------------------------------------------------
-- The hard work is done on positive numbers

-- Least significant bit is first

-- Positive's have the property that they contain at least one Bit,
-- and their last Bit is One.
type Positive = Digits
type Positives = List Positive

data Digits = Some !Digit !Digits
            | None
type Digit = Word#

data List a = Nil | Cons a (List a)

So when using Integer there is quite a bit of memory overhead compared to Int, or rather unboxed Int#. I guess this should be optimized away (though I have not confirmed that).

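So an Integer is (if I calculated correctly)

1 x Word for the sum-type tag (here Positive)
n x (Word + Word) for the Some and Digit parts
1 x Word for the last None

That is a memory overhead of (2 + floor(log_10(n))) for each Integer in that calculation, plus a bit more for the accumulator.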
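The tally above can be turned into a small estimator. This is a rough sketch and not a measurement of the real heap layout: `digitCount` and `wordEstimate` are hypothetical names, and the code assumes that each Digit holds one 64-bit machine word, so digits are counted in base 2^64.

```haskell
-- Number of machine-word digits needed for the magnitude of n.
-- Assumption: one Digit is a 64-bit word, i.e. digits are in base 2^64.
digitCount :: Integer -> Int
digitCount n = length (takeWhile (> 0) (iterate (`div` base) (abs n)))
  where
    base = 2 ^ (64 :: Int)

-- Word estimate following the tally above:
-- one tag word, two words per digit (Some + Digit), one word for None.
wordEstimate :: Integer -> Int
wordEstimate 0 = 1 -- Naught carries no digits, only the tag
wordEstimate n = 1 + 2 * digitCount n + 1

main :: IO ()
main = print (wordEstimate 5000000050000000) -- 4 under these assumptions
```

Under these assumptions every intermediate value in the sum fits in a single digit, so each Integer costs about four words where an unboxed Int# would need one.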
