I have a file that contains many zlib streams, one after another. The file is laid out like this:
+-------------------------+
|+-----------------------+|
||    CMF+FLG (78DA)     ||
|+-----------------------+|
|+-----------------------+|
||...compressed data 1...||
|+-----------------------+|
|+-----------------------+|
||        ADLER32        ||
|+-----------------------+|
|                         |
|+-----------------------+|
||    CMF+FLG (78DA)     ||
|+-----------------------+|
|+-----------------------+|
||...compressed data 2...||
|+-----------------------+|
|+-----------------------+|
||        ADLER32        ||
|+-----------------------+|
|                         |
|+-----------------------+|
||    CMF+FLG (78DA)     ||
|+-----------------------+|
|+-----------------------+|
||...compressed data 3...||
|+-----------------------+|
|+-----------------------+|
||        ADLER32        ||
|+-----------------------+|
|                         |
|.........................|
|                         |
|+-----------------------+|
||    CMF+FLG (78DA)     ||
|+-----------------------+|
|+-----------------------+|
||...compressed data n...||
|+-----------------------+|
|+-----------------------+|
||        ADLER32        ||
|+-----------------------+|
+-------------------------+
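For reference, the layout of a single stream can be checked with a short sketch. It assumes the streams were written at compression level 9 (the level that yields the 78 DA header bytes), and `payload` is made-up sample data, not anything from the real file.

```python
import struct
import zlib

payload = b"example payload " * 4  # hypothetical sample data

# Assumption: level 9, which produces the 78 DA header shown in the diagram.
stream = zlib.compress(payload, 9)

header = stream[:2]    # CMF + FLG bytes
trailer = stream[-4:]  # Adler-32 of the uncompressed data, big-endian

print(header.hex())                                         # 78da
print(trailer == struct.pack(">I", zlib.adler32(payload)))  # True
```

Each stream therefore carries its own header and checksum, and nothing in the file records where one stream ends and the next begins.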
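I need to iterate over all of these streams and extract each one. I tried the following code, but it only extracts the first stream from the file.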
import sys
import zlib

for filename in sys.argv[1:]:
    with open(filename, 'rb') as compressed:
        with open(filename + '-decompressed', 'wb') as expanded:
            # Only the first zlib stream in the file gets decompressed here.
            data = zlib.decompress(compressed.read())
            expanded.write(data)
Answer 0 (score: 2)
You should be able to do this with a series of decompression objects:
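import zlib

# `filename` is the path of the input file, as in the question's loop.
with open(filename, 'rb') as compressed:
    data = compressed.read()

file_no = 0
while data:
    # Each zlib stream needs a fresh decompression object.
    d = zlib.decompressobj()
    with open('{}_decompressed.{}'.format(filename, file_no), 'wb') as f:
        f.write(d.decompress(data))
    # Bytes after the end of this stream belong to the next one.
    data = d.unused_data
    file_no += 1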
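This takes a file containing multiple concatenated zlib streams and decompresses each one into a separate file, appending "_decompressed.n" to the original file name, where n is the index of the stream.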
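The key is the decompression object's unused_data attribute: it holds whatever bytes followed the end of the compressed stream, so a non-empty value means there is still data left to decompress.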
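If you would rather get the payloads back in memory than write them to files, the same unused_data loop can be wrapped in a generator. This is only a sketch: `iter_zlib_streams` is a name made up for illustration, the sample data is invented, and it assumes the streams sit back to back with no padding or other bytes between them.

```python
import zlib


def iter_zlib_streams(data):
    """Yield the decompressed payload of each concatenated zlib stream."""
    while data:
        d = zlib.decompressobj()
        out = d.decompress(data) + d.flush()
        if not d.eof:
            # The input ended before the stream's end marker was seen.
            raise ValueError("truncated zlib stream")
        yield out
        # Whatever follows this stream is the start of the next one.
        data = d.unused_data


# Made-up sample: three streams concatenated into one blob.
blob = b"".join(zlib.compress(p, 9) for p in (b"first", b"second", b"third"))
parts = list(iter_zlib_streams(blob))
print(parts)  # [b'first', b'second', b'third']
```

Note that this still reads the whole file into memory first, just like the loop above. For very large files you would want to feed the decompression object in chunks instead.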