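I am looking for a data type that is efficient in both space and time, can hold a 384-bit vector, and supports efficient XOR and "bit count" (number of bits set to 1) operations.
Below is my demo program. The operations I need are all in the SOQuestionOps type class, which I have implemented for Natural and for Data.Vector.Unboxed.Bit. The latter in particular looks perfect, because it has a zipWords operation that should let me do the "bit count" and XOR operations word by word instead of bit by bit. It also claims to store the bits packed (8 bits per byte).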
{-# LANGUAGE FlexibleInstances #-}

import Data.Bits
import Data.List (foldl')
import Numeric.Natural
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed.Bit as BV

class SOQuestionOps a where
    soqoXOR :: a -> a -> a
    soqoBitCount :: a -> Int
    soqoFromList :: [Bool] -> a

alternating :: Int -> [Bool]
alternating n =
    let c = n `mod` 2 == 0
    in if n == 0
          then []
          else c : alternating (n-1)

instance SOQuestionOps Natural where
    soqoXOR = xor
    soqoBitCount = popCount
    soqoFromList v =
        let oneIdxs = map snd $ filter fst (zip v [0..])
        in foldl' (\acc n -> acc `setBit` n) 0 oneIdxs

instance SOQuestionOps (BV.Vector BV.Bit) where
    soqoXOR = BV.zipWords xor
    soqoBitCount = BV.countBits
    soqoFromList v = BV.fromList (map BV.fromBool v)

main =
    let initialVec :: BV.Vector BV.Bit
        initialVec = soqoFromList $ alternating 384
        lotsOfVecs = V.replicate 10000000 (soqoFromList $ take 384 $ repeat True)
        xorFolded = V.foldl' soqoXOR initialVec lotsOfVecs
        sumBitCounts = V.foldl' (\n v -> n + soqoBitCount v) 0 lotsOfVecs
    in putStrLn $ "folded bit count: " ++ show (soqoBitCount xorFolded) ++ ", sum: " ++ show sumBitCounts
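So let's calculate the best-case numbers: lotsOfVecs shouldn't need to allocate much, since it is just the same vector repeated 10,000,000 times. The foldl' obviously creates one new vector per fold step, so it should create 10,000,000 bit vectors. The bit counting shouldn't create anything but 10,000,000 Ints. So in the best case, my program should use very little (and constant) memory, and the total allocations should be roughly 10,000,000 * sizeof(bit vector) + 10,000,000 * sizeof(int) = 520,000,000 bytes.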
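The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the 48 bytes per vector is the raw payload (384 / 8), and the 4 bytes per Int is the size implied by the 520,000,000 total, not a measured value. All names in the snippet are made up for illustration.

```haskell
-- Back-of-the-envelope allocation estimate. The byte sizes are assumptions
-- read off the 520,000,000 figure, not measurements.
vectorBytes, intBytes, iterations :: Integer
vectorBytes = 384 `div` 8   -- 48 bytes of raw payload per 384-bit vector
intBytes    = 4             -- assumed Int size implied by the stated total
iterations  = 10000000      -- number of fold steps

expectedAllocation :: Integer
expectedAllocation = iterations * vectorBytes + iterations * intBytes

main :: IO ()
main = print expectedAllocation  -- prints 520000000
```

On a 64-bit GHC an Int is 8 bytes and every heap object carries at least one header word, so a realistic lower bound is somewhat higher than this figure.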
OK, let's run the program for Natural: set initialVec :: Natural and compile with
ghc --make -rtsopts -O3 MemStuff.hs
Result (this is with GHC 7.10.1):
$ ./MemStuff +RTS -sstderr
folded bit count: 192, sum: 3840000000
1,280,306,112 bytes allocated in the heap
201,720 bytes copied during GC
80,106,856 bytes maximum residency (2 sample(s))
662,168 bytes maximum slop
78 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 2321 colls, 0 par 0.056s 0.059s 0.0000s 0.0530s
Gen 1 2 colls, 0 par 0.065s 0.069s 0.0346s 0.0674s
INIT time 0.000s ( 0.000s elapsed)
MUT time 0.579s ( 0.608s elapsed)
GC time 0.122s ( 0.128s elapsed)
EXIT time 0.000s ( 0.002s elapsed)
Total time 0.702s ( 0.738s elapsed)
%GC time 17.3% (17.3% elapsed)
Alloc rate 2,209,576,763 bytes per MUT second
Productivity 82.7% of total user, 78.7% of total elapsed
real 0m0.754s
user 0m0.704s
sys 0m0.037s
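That is 1,280,306,112 bytes allocated in the heap, which is in the ballpark (roughly 2x) of the expected figure. By the way, on GHC 7.8 this allocates 353,480,272,096 bytes and runs for absolute ages, because popCount is not very efficient on Naturals in GHC 7.8.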
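EDIT: I changed the code slightly. In the original version, every other vector in the fold was 0, which gave the Natural version much better allocation figures. I changed it so that the vector alternates between different representations (each with many bits set), and now we see 2x the expected allocations. That is another downside of Natural (and Integer): the allocation rate depends on the values.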
But maybe we can do better, so let's try the densely packed Data.Vector.Unboxed.Bit:
That means initialVec :: BV.Vector BV.Bit, then recompiling and rerunning with the same options.
$ time ./MemStuff +RTS -sstderr
folded bit count: 192, sum: 1920000000
75,120,306,536 bytes allocated in the heap
54,914,640 bytes copied during GC
80,107,368 bytes maximum residency (2 sample(s))
664,128 bytes maximum slop
78 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 145985 colls, 0 par 0.543s 0.627s 0.0000s 0.0577s
Gen 1 2 colls, 0 par 0.065s 0.070s 0.0351s 0.0686s
INIT time 0.000s ( 0.000s elapsed)
MUT time 27.679s ( 28.228s elapsed)
GC time 0.608s ( 0.698s elapsed)
EXIT time 0.000s ( 0.002s elapsed)
Total time 28.288s ( 28.928s elapsed)
%GC time 2.1% (2.4% elapsed)
Alloc rate 2,714,015,097 bytes per MUT second
Productivity 97.8% of total user, 95.7% of total elapsed
real 0m28.944s
user 0m28.290s
sys 0m0.456s
That is very much slower, with roughly 100 times the allocations :(.
OK, then let's recompile and profile both runs (ghc --make -rtsopts -O3 -prof -auto-all -caf-all -fforce-recomp MemStuff.hs):
Natural version:
COST CENTRE MODULE %time %alloc
main.xorFolded Main 51.7 76.0
main.sumBitCounts.\ Main 25.4 16.0
main.sumBitCounts Main 12.1 0.0
main.lotsOfVecs Main 10.4 8.0
Data.Vector.Unboxed.Bit version:
COST CENTRE MODULE %time %alloc
soqoXOR Main 96.7 99.3
main.sumBitCounts.\ Main 1.9 0.2
Is Natural really the best option for a fixed-size bit vector? And what about GHC 7.8? Is there anything better that can implement my SOQuestionOps type class?
Answer 0 (score: 1)
Have a look at the Data.LargeWord module in the Crypto package:
http://hackage.haskell.org/package/Crypto-4.2.5.1/docs/Data-LargeWord.html
It provides Bits instances for large words of various sizes, e.g. 96 through 256 bits.
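For comparison, here is a minimal sketch of a fixed-size alternative that needs only base and does not use the Crypto package. Bits384 and packWord are hypothetical names invented for this example; the class is copied from the question. The idea is to store the 384 bits as six strict 64-bit words, so the size never depends on the value and XOR and bit count work one word at a time. It has not been benchmarked against the versions in the question.

```haskell
import Data.Bits (popCount, setBit, xor)
import Data.List (foldl')
import Data.Word (Word64)

-- Hypothetical type (not from any library): 384 bits as six strict,
-- unpacked 64-bit words, so the representation has a fixed size.
data Bits384 = Bits384
  {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64
  {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64
  deriving (Eq, Show)

-- Same class as in the question.
class SOQuestionOps a where
  soqoXOR      :: a -> a -> a
  soqoBitCount :: a -> Int
  soqoFromList :: [Bool] -> a

instance SOQuestionOps Bits384 where
  -- XOR the six words pairwise.
  soqoXOR (Bits384 a0 a1 a2 a3 a4 a5) (Bits384 b0 b1 b2 b3 b4 b5) =
    Bits384 (xor a0 b0) (xor a1 b1) (xor a2 b2)
            (xor a3 b3) (xor a4 b4) (xor a5 b5)

  -- Sum the per-word population counts.
  soqoBitCount (Bits384 a b c d e f) =
    popCount a + popCount b + popCount c
      + popCount d + popCount e + popCount f

  -- List element i becomes bit (i mod 64) of word (i div 64).
  -- Elements past index 383 are ignored; missing elements count as 0.
  soqoFromList bs = Bits384 (w 0) (w 1) (w 2) (w 3) (w 4) (w 5)
    where
      w :: Int -> Word64
      w k = packWord (take 64 (drop (64 * k) bs))

-- Pack up to 64 Bools into a Word64, first element in the lowest bit.
packWord :: [Bool] -> Word64
packWord = foldl' step 0 . zip [0 ..]
  where
    step acc (i, True)  = setBit acc i
    step acc (_, False) = acc

main :: IO ()
main = do
  let a = soqoFromList (take 384 (cycle [False, True])) :: Bits384
      b = soqoFromList (replicate 384 True) :: Bits384
  print (soqoBitCount a)              -- prints 192
  print (soqoBitCount (soqoXOR a b))  -- prints 192
```

Each soqoXOR still allocates one new Bits384 value, but that value always has the same size, unlike Natural, whose allocation depends on the number stored.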