Question

Here is the full repository. It is a very simple test that inserts 50000 random Things into a database using the postgresql-simple database bindings. It uses MonadRandom and can generate the Things lazily.

Here is case 1, the specific snippet of code that uses the Thing generator:

insertThings c = do
  ts <- genThings
  withTransaction c $ do
    executeMany c "insert into things (a, b, c) values (?, ?, ?)" $ map (\(Thing ta tb tc) -> (ta, tb, tc)) $ take 50000 ts

Here is case 2, which just dumps the Things to stdout:

main = do
  ts <- genThings
  mapM print $ take 50000 ts

In the first case I get very bad GC times:

cabal-dev/bin/posttest +RTS -s       
   1,750,661,104 bytes allocated in the heap
     619,896,664 bytes copied during GC
      92,560,976 bytes maximum residency (10 sample(s))
         990,512 bytes maximum slop
             239 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3323 colls,     0 par   11.01s   11.46s     0.0034s    0.0076s
  Gen  1        10 colls,     0 par    0.74s    0.77s     0.0769s    0.2920s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.97s  (  3.86s elapsed)
  GC      time   11.75s  ( 12.23s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   14.72s  ( 16.09s elapsed)

  %GC     time      79.8%  (76.0% elapsed)

  Alloc rate    588,550,530 bytes per MUT second

  Productivity  20.2% of total user, 18.5% of total elapsed

While in the second case the times are great:

cabal-dev/bin/dumptest +RTS -s > out
   1,492,068,768 bytes allocated in the heap
       7,941,456 bytes copied during GC
       2,054,008 bytes maximum residency (3 sample(s))
          70,656 bytes maximum slop
               6 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2888 colls,     0 par    0.13s    0.16s     0.0001s    0.0089s
  Gen  1         3 colls,     0 par    0.01s    0.01s     0.0020s    0.0043s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.00s  (  2.37s elapsed)
  GC      time    0.14s  (  0.16s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    2.14s  (  2.53s elapsed)

  %GC     time       6.5%  (6.4% elapsed)

  Alloc rate    744,750,084 bytes per MUT second

  Productivity  93.5% of total user, 79.0% of total elapsed

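I tried heap profiling, but couldn't make sense of the results. It looks like all 50000 Things are first built in memory, then converted to ByteStrings by the query, and only then are those strings sent to the database. But why does this happen? How do I find the guilty code?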

GHC version is 7.4.2.

All libraries and the package itself are compiled with -O2 (built by cabal-dev in a sandbox).

Answer 1

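I've looked at the profile using formatMany with 50k Things. Memory grows steadily and then drops quickly. The maximum memory used is slightly over 40 MB. The main cost centres are buildQuery and escapeStringConn, followed by toRow. Half of the data is ARR_WORDS (byte strings), Actions and lists.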

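formatMany pretty much builds one long ByteString out of the nested lists of Actions. The Actions are converted to ByteString Builders, which retain the underlying ByteStrings until they are used to produce the final long strict ByteString. These ByteStrings therefore have a long lifetime: they stay alive until the final ByteString has been constructed.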
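The retention described above can be illustrated with a small sketch. It does not reproduce postgresql-simple's internals: `renderRow`, `renderRows` and `demo` are hypothetical names, and it assumes a recent bytestring that ships `Data.ByteString.Builder`, which may differ from whatever builder the library used with GHC 7.4.2. The point is only that every input ByteString remains referenced by the combined Builder until the single strict result is produced at the very end.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder (Builder, byteString, char7, toLazyByteString)
import Data.List (intersperse)

-- Hypothetical stand-in for rendering one row as "(a,b,c)".
-- Each field stays referenced by the Builder until the Builder is run.
renderRow :: [B.ByteString] -> Builder
renderRow fields =
  char7 '(' <> mconcat (intersperse (char7 ',') (map byteString fields)) <> char7 ')'

-- Concatenate all rows into ONE strict ByteString. A strict ByteString is a
-- single contiguous buffer, so it only exists once every row has been
-- rendered; nothing can be handed to the consumer incrementally.
renderRows :: [[B.ByteString]] -> B.ByteString
renderRows rows =
  BL.toStrict (toLazyByteString (mconcat (intersperse (char7 ',') (map renderRow rows))))

-- Tiny example input, for illustration only.
demo :: B.ByteString
demo = renderRows [[BC.pack "1", BC.pack "x"], [BC.pack "2", BC.pack "y"]]
```

With 50000 rows, the inputs, the intermediate chunks and the final copy can all be live at about the same time, which would be consistent with the residency seen in the profile.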

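The strings need to be escaped using libPQ, so any non-Plain Action ByteString is passed to libPQ and replaced with a new Action in escapeStringConn and friends, which adds more garbage. If you replace the Text in Thing with another Int, GC time drops from 75% to 45%.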

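I tried to reduce the use of temporary lists in formatMany and buildQuery by replacing mapM with a foldM over a Builder. It didn't help much, and it slightly increased code complexity.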

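TLDR - Builders can't be consumed lazily, because all of them are needed to produce the final strict ByteString (pretty much an array of bytes). If memory is a problem, split the executeMany into chunks within the same transaction.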
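A minimal sketch of that chunking suggestion follows. The helpers `chunksOf` and `insertInChunks` are hypothetical names introduced here, not part of postgresql-simple, and the chunk size of 1000 in the usage comment is an arbitrary assumption to be tuned. The batch action is passed in as a parameter so the helper itself does not depend on the database library.

```haskell
-- Split a list into pieces of at most n elements.
-- Lazy in its input, so a lazily generated list is consumed chunk by chunk.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = error "chunksOf: chunk size must be positive"
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf n t

-- Run the given batch action once per chunk, in order.
insertInChunks :: Monad m => Int -> ([row] -> m ()) -> [row] -> m ()
insertInChunks n insertBatch rows = mapM_ insertBatch (chunksOf n rows)

-- Possible use with the question's code (assumes Control.Monad (void),
-- since executeMany returns the affected row count):
--
--   insertThings c = do
--     ts <- genThings
--     withTransaction c $
--       insertInChunks 1000
--         (\rows -> void (executeMany c "insert into things (a, b, c) values (?, ?, ?)" rows))
--         (map (\(Thing ta tb tc) -> (ta, tb, tc)) (take 50000 ts))
```

Each executeMany call then builds a query string for only one chunk, so the Builders and escaped strings of earlier chunks can be collected before later ones are created. All chunks still run inside the single withTransaction, so the insert remains atomic.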

Why does this code consume so much heap?
