Question

I am compressing strings with zlib, using level 9 together with a dictionary. Here is the script:

import zlib

__doc__ = """
rly this small py em ell thon... !? ha he hahaha hehehe h4ck3r wassup how ar u let's go lets go common c'mon...!!! ad
"""


class Compressor(object):
    def __init__(self, seed):
        # Prime a level-9 compressor with the seed text, then snapshot it.
        c = zlib.compressobj(9)
        d_seed = c.compress(seed)
        d_seed += c.flush(zlib.Z_SYNC_FLUSH)
        self.c_context = c.copy()

        # Prime a decompressor with the same seed data and snapshot it too.
        d = zlib.decompressobj()
        d.decompress(d_seed)
        while d.unconsumed_tail:
            d.decompress(d.unconsumed_tail)
        self.d_context = d.copy()

    def compress(self, text):
        c = self.c_context.copy()
        t = c.compress(text)
        t2 = c.flush(zlib.Z_FINISH)
        return t + t2

    def decompress(self, ctext):
        d = self.d_context.copy()
        t = d.decompress(ctext)
        while d.unconsumed_tail:
            t += d.decompress(d.unconsumed_tail)
        return t


c = Compressor(__doc__)

# Define the input before compressing it (the posted script used it first).
string = "I installed python, and it's very nice and easy to use!"
compressed_string = c.compress(string)
print compressed_string

compressed = c.compress(string)
print c.decompress(compressed)

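My question is: how can I strip the extra 'cuffs' from the resulting compressed_string, such as the header and the last 4 Adler bytes, and then add them back when decompressing? Please correct me if I have any of the definitions wrong.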

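For example, 12h3d78e23gdh278qs98qwjsj89qs1234 (where 12 is the hypothetical 2-byte header and 1234 is the hypothetical Adler part at the end of the string) would become h3d78e23gdh278qs98qwjsj89qs, and would then be rebuilt when zlib has to decompress it, with something like:

to_decompress = '12' + compressed + '1234'
c.decompress(to_decompress)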

Answer 1

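You have already effectively stripped the header, since it sits in the first two bytes of d_seed. You don't need to remove the Adler-32 check, and you shouldn't, because as long as you reconstruct the compressed stream correctly, the Adler-32 provides an integrity check.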
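To make the layout concrete, here is a minimal Python 3 sketch (bytes rather than Python 2 strings) of the same priming trick as in the question. The `seed` and `message` values are placeholders chosen for illustration. It checks that the zlib header lands at the start of `d_seed`, and that the last four bytes of the per-message output are an Adler-32 computed over the seed plus the message.

```python
import zlib

# Placeholder inputs, not taken from the question verbatim.
seed = b"ha he hahaha hehehe wassup how ar u let's go lets go common c'mon"
message = b"I installed python, and it's very nice and easy to use!"

# Prime a level-9 compressor with the seed, as the question does.
primer = zlib.compressobj(9)
d_seed = primer.compress(seed) + primer.flush(zlib.Z_SYNC_FLUSH)

# Compress one message from a copy of the primed state.
c = primer.copy()
compressed = c.compress(message) + c.flush(zlib.Z_FINISH)

# The 2-byte zlib header is at the front of d_seed, not in `compressed`.
header = d_seed[:2]
assert header[0] & 0x0F == 8                     # compression method 8 = deflate
assert int.from_bytes(header, "big") % 31 == 0   # zlib header check value

# The trailer is a big-endian Adler-32 over ALL uncompressed input so far,
# which here means the seed followed by the message.
trailer = int.from_bytes(compressed[-4:], "big")
assert trailer == zlib.adler32(seed + message)
```

So the only remaining overhead per message is the 4-byte trailer.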

You can simply prepend d_seed to the transmitted data and then decompress the result normally.
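A rough sketch of that suggestion in Python 3 follows. The names `seed`, `message`, `transmitted` and `full_stream` are illustrative only. Note that the decompressed output begins with the seed text itself, so the receiver has to drop that prefix.

```python
import zlib

# Placeholder inputs for illustration.
seed = b"ha he hahaha hehehe wassup how ar u let's go lets go common c'mon"
message = b"I installed python, and it's very nice and easy to use!"

# Sender: prime with the seed, then compress the message from a copy.
primer = zlib.compressobj(9)
d_seed = primer.compress(seed) + primer.flush(zlib.Z_SYNC_FLUSH)
c = primer.copy()
transmitted = c.compress(message) + c.flush(zlib.Z_FINISH)

# Receiver: prepend d_seed to rebuild a complete zlib stream
# (header, seed blocks, message blocks, Adler-32 trailer).
full_stream = d_seed + transmitted
plain = zlib.decompress(full_stream)

# The seed comes out first; strip it to recover the message.
recovered = plain[len(seed):]
assert recovered == message
```

Because the rebuilt stream is complete, an ordinary one-shot `zlib.decompress` call is enough, and the Adler-32 check still runs.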

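If you are using Python 3.3 or later, a better approach is to let zlib do the dictionary handling for you. The last argument of zlib.compressobj() and the last argument of zlib.decompressobj() (zdict in both cases) can be a dictionary. You then also get an integrity check on the supplied dictionary, which helps verify that the decompression side is given the same dictionary that was used on the compression side. The result is a standard zlib stream that uses a dictionary, so it is more portable and more easily recognized.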
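A minimal sketch of the built-in dictionary support, assuming Python 3.3+ and placeholder `dictionary` and `message` values. The last part illustrates the dictionary check: a decompressor given a different dictionary is expected to fail with `zlib.error` instead of returning wrong data.

```python
import zlib

# Placeholder dictionary and message for illustration.
dictionary = b"ha he hahaha hehehe wassup how ar u let's go lets go common c'mon"
message = b"I installed python, and it's very nice and easy to use!"

# Compress with a preset dictionary; zlib records a checksum of it in the stream.
c = zlib.compressobj(level=9, zdict=dictionary)
compressed = c.compress(message) + c.flush()

# Decompress with the same dictionary. No seed prefix appears in the output.
d = zlib.decompressobj(zdict=dictionary)
recovered = d.decompress(compressed) + d.flush()
assert recovered == message

# A mismatched dictionary should be rejected rather than silently accepted.
try:
    bad = zlib.decompressobj(zdict=b"a different dictionary")
    bad.decompress(compressed)
    wrong_dict_rejected = False
except zlib.error:
    wrong_dict_rejected = True
```

Unlike the manual priming approach, nothing has to be prepended before decompression and nothing has to be stripped from the output.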

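If you really want to squeeze out a measly six bytes (not recommended, since you lose the integrity check), or really only four bytes since you have already stripped the header, then use -15 as the wbits argument. That suppresses the zlib header and trailer. (This also requires 3.3.)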
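For comparison, a small Python 3 sketch of raw deflate with wbits set to -15, using a placeholder `message`. It compresses the same data with and without the zlib wrapper and checks that the difference is exactly the 2-byte header plus the 4-byte Adler-32 trailer.

```python
import zlib

message = b"I installed python, and it's very nice and easy to use!"  # placeholder

# Standard zlib stream: 2-byte header + deflate data + 4-byte Adler-32.
wrapped_c = zlib.compressobj(9, zlib.DEFLATED, 15)
wrapped = wrapped_c.compress(message) + wrapped_c.flush()

# Raw deflate: a negative wbits suppresses the header and the trailer.
raw_c = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = raw_c.compress(message) + raw_c.flush()

# The receiver must also be told that the stream is raw.
raw_d = zlib.decompressobj(-15)
recovered = raw_d.decompress(raw) + raw_d.flush()
assert recovered == message

# The saving is the six bytes of wrapper, and with it the integrity check.
assert len(wrapped) - len(raw) == 6
```

With a raw stream, corrupted input may decompress to garbage without any error, which is the trade-off the answer warns about.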

How to remove the header from a zlib-compressed string?
