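I'm starting to learn Haskell and remote delta compression. My first step is to implement rsync's rolling checksum in Haskell. Is a block in those formulas equal to X(i)? If so, I'm confused: how would an array of Word8 be converted into such a big block, say a Word32768? I mean, what if X(i) is a list of Word8s? How would arithmetic be done on it, as on an unsigned int? Also, my current implementation slides only 1 byte (one Word8) at a time.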
Answer 0 (score: 2)
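A ByteString is easy to turn into a [Word8] with unpack, which should be enough to carry out this algorithm (though not necessarily in the most efficient way).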
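Why do you need to convert Word8 values into a Word32768? Why would you need a number with 2^15 bits? That would be hard to represent, but you can use a list or array of Word8 instead, which is easy to represent in memory and is equivalent.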
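As a minimal sketch of that idea (the names blockBytes and byteSum are hypothetical, not part of the question or of any library), the bytes can be unpacked and widened with fromIntegral before any arithmetic, so that sums do not wrap at 255:

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Unpack a strict ByteString into its individual bytes.
blockBytes :: BS.ByteString -> [Word8]
blockBytes = BS.unpack

-- Widen each byte to Int before summing; summing Word8 values directly
-- would wrap around modulo 256.
byteSum :: BS.ByteString -> Int
byteSum = sum . map fromIntegral . blockBytes
```

For example, byteSum applied to a ByteString holding the two bytes 200 and 200 gives 400, whereas summing the same bytes as Word8 wraps to 144.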
For the arithmetic itself, functions such as map, zipWith, fold, and scan are very useful. For example, to perform the first step of the algorithm:
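import qualified Data.ByteString as BS

-- a(k, l): sum of the bytes X_k .. X_l (inclusive), modulo 2^16.
a :: Int -> Int -> BS.ByteString -> Int
a k l x
  = (`mod` m)
  $ sum
  $ map fromIntegral
  $ take (l - k + 1)   -- the range k..l is inclusive, so it holds l - k + 1 bytes
  $ drop k
  $ BS.unpack x
  where m = 2 ^ 16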
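Implementing the function b is only slightly harder: compute the sequence of weights l - i + 1 for i = k to l, then combine it with the converted bytes using zipWith (*) before summing. After that, implementing s is straightforward, although it can certainly be done more efficiently if you factor out the steps that a and b share, such as map fromIntegral and the drop/take that selects the window.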
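A rough sketch of that description follows. It assumes the usual rsync definitions, b(k, l) as the sum of (l - i + 1) * X_i modulo 2^16 and s(k, l) = a(k, l) + 2^16 * b(k, l), works on a plain [Word8] rather than a ByteString, and assumes 0 <= k <= l with l inside the list. The names window, sumA, sumB, and checksumS are hypothetical.

```haskell
import Data.Word (Word8)

-- Modulus M = 2^16.
m :: Int
m = 2 ^ (16 :: Int)

-- Shared step: the bytes X_k .. X_l (inclusive, 0-based), widened to Int.
window :: Int -> Int -> [Word8] -> [Int]
window k l = map fromIntegral . take (l - k + 1) . drop k

-- a(k, l): plain sum of the window, modulo M.
sumA :: Int -> Int -> [Word8] -> Int
sumA k l xs = sum (window k l xs) `mod` m

-- b(k, l): each byte X_i is weighted by l - i + 1, so the first byte of
-- the window gets the largest weight and the last byte gets weight 1.
sumB :: Int -> Int -> [Word8] -> Int
sumB k l xs = sum (zipWith (*) weights (window k l xs)) `mod` m
  where weights = [l - i + 1 | i <- [k .. l]]

-- s(k, l): a in the low 16 bits, b in the high 16 bits.
checksumS :: Int -> Int -> [Word8] -> Int
checksumS k l xs = sumA k l xs + m * sumB k l xs
```

Factoring window out means both sums share the drop/take/convert pipeline instead of repeating it.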
Answer 1 (score: 2)
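In the equations at the link provided, a block is not equal to X(i). The notion of a block mainly relates to Data Deduplication. In addition, a rolling checksum can be used to create blocks, identify block boundaries, and so on.
My current implementation of rsync's rolling checksum is below. Next I will implement a cyclic polynomial rolling checksum, and then read up on Data Deduplication.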
import qualified Data.ByteString.Lazy as B
import qualified Data.ByteString.Lazy.Char8 as B8
import Data.Word
import Data.Bits
import Data.Int

-- Word16 arithmetic wraps around, which gives the "mod 2^16" for free.
type CheckSumPartial = Word16
type CheckSumA = CheckSumPartial
type CheckSumB = CheckSumPartial
type WindowSize = Int64
type CheckSum = Word32
type Byte = Word8

main :: IO ()
main = do
  let str = B8.pack "abcdef"
  let s1 = roll 3 str
  let s2 = withoutRoll 3 str
  print s1
  print s2
-- Checksums of every window of size w, updating a and b incrementally.
roll :: WindowSize -> B.ByteString -> [CheckSum]
roll w str =
  let
    (a, b, s) = newABS w str
    h = B.head str
    t = B.tail str
  in if fromIntegral (B.length t) < w
       then [s]
       else s : rollNext w t h a b

-- Reference version: recomputes every window from scratch.
withoutRoll :: WindowSize -> B.ByteString -> [CheckSum]
withoutRoll w str =
  let
    (_, _, s) = newABS w str
    t = B.tail str
  in if fromIntegral (B.length t) < w
       then [s]
       else s : withoutRoll w t
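-- a: plain sum of the first w bytes.
newA :: WindowSize -> B.ByteString -> CheckSumA
newA w str =
  let block = B.take w str
  in B.foldr aSum (0 :: CheckSumA) block
  where
    aSum x acc = acc + (fromIntegral x :: CheckSumA)

-- b: weighted sum of the first w bytes.
newB :: WindowSize -> B.ByteString -> CheckSumB
newB w str =
  let block = B.take w str
  in fst $ B.foldr bSum (0 :: CheckSumB, 1 :: WindowSize) block
  where
    -- foldr visits the last byte first, so the weight starts at 1 there and
    -- grows toward the head, which gets weight w in a full window. This is
    -- the weighting that rollB assumes when it subtracts w * prevHead.
    bSum x (acc, l) = (acc + fromIntegral l * (fromIntegral x :: CheckSumB), l + 1)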
-- Slide a by one byte: drop the old head, add the new last byte.
rollA :: CheckSumA -> Byte -> Byte -> CheckSumA
rollA prevA prevHead curLast = prevA - fromIntegral prevHead + fromIntegral curLast

-- Slide b by one byte: remove the old head (weight w), then add the new a.
rollB :: CheckSumA -> Byte -> WindowSize -> CheckSumB -> CheckSumB
rollB curA prevHead w prevB = prevB - fromIntegral w * fromIntegral prevHead + curA

-- Pack a into the low 16 bits and b into the high 16 bits.
calculateS :: CheckSumA -> CheckSumB -> CheckSum
calculateS a b = (fromIntegral a :: Word32) .|. shift (fromIntegral b :: Word32) 16
rollNext :: WindowSize -> B.ByteString -> Byte -> CheckSumA -> CheckSumB -> [CheckSum]
rollNext w str prevHead prevA prevB =
  let
    curBlock = B.take (fromIntegral w) str
    curLast = B.last curBlock
    h = B.head str
    t = B.tail str
    a = rollA prevA prevHead curLast
    b = rollB a prevHead w prevB
    s = calculateS a b
  in if fromIntegral (B.length t) < w
       then [s]
       else s : rollNext w t h a b

newABS :: WindowSize -> B.ByteString -> (CheckSumA, CheckSumB, CheckSum)
newABS w str =
  let a = newA w str
      b = newB w str
      s = calculateS a b
  in (a, b, s)
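One way to sanity-check a rolling implementation like this is to compare it against direct recomputation on input whose bytes are not evenly spaced (with evenly spaced bytes such as "abcdef", a wrongly weighted b can agree by accident). The following is a self-contained sketch on plain lists, not the code above; the names directAB, rollAB, windows, directAll, and rollingAll are hypothetical, and it assumes the first byte of a window carries weight w.

```haskell
import Data.Word (Word8, Word16)

-- Direct sums over one window; Word16 arithmetic wraps modulo 2^16.
directAB :: [Word8] -> (Word16, Word16)
directAB ws = (sum xs, sum (zipWith (*) weights xs))
  where
    xs      = map fromIntegral ws
    n       = length ws
    -- weights n, n - 1, .., 1: first byte heaviest, last byte weight 1
    weights = map fromIntegral [n, n - 1 .. 1]

-- One rolling step for window size w: drop `old`, take in `new`.
rollAB :: Int -> (Word16, Word16) -> Word8 -> Word8 -> (Word16, Word16)
rollAB w (a, b) old new = (a', b')
  where
    a' = a - fromIntegral old + fromIntegral new
    b' = b - fromIntegral w * fromIntegral old + a'

-- Every full window of size w, in order.
windows :: Int -> [Word8] -> [[Word8]]
windows w xs
  | w < 1 || length xs < w = []
  | otherwise              = take w xs : windows w (drop 1 xs)

-- Reference: recompute each window from scratch.
directAll :: Int -> [Word8] -> [(Word16, Word16)]
directAll w = map directAB . windows w

-- Rolling: compute the first window directly, then update step by step,
-- pairing each outgoing byte with the byte that enters w positions later.
rollingAll :: Int -> [Word8] -> [(Word16, Word16)]
rollingAll w xs
  | w < 1 || length xs < w = []
  | otherwise = scanl step (directAB (take w xs)) (zip xs (drop w xs))
  where step ab (old, new) = rollAB w ab old new
```

If the recurrence and the initial weights agree, rollingAll w xs and directAll w xs should be equal for any input and any w >= 1.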