Python: extracting uncompressed data from a 7z file

Date: 2016-05-25 09:12:54

Tags: python 7zip lzma

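I have several CSV files in a 7z archive; some of them are compressed and some are not. I want to read the CSV files and save their contents in a database. However, whenever py7zlib tries to read data from a CSV file that is actually stored uncompressed, I get the error "data error during decompression".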

import os
import py7zlib

scr = r'Y:\PathtoArchive'
z7file = 'ArchiveName.7z'

with open(os.path.join(scr,z7file),'rb') as f:
    archive = py7zlib.Archive7z(f)

    names = archive.filenames

    for mem in names:

        obj = archive.getmember(mem)
        print(obj.compressed)  # prints None for uncompressed data
        try:
            data = obj.read()
        except Exception as er:
            print(er)         # prints "data error during decompression"
                              # whenever obj.compressed is None

The error is raised here:

File "C:\Anaconda\lib\site-packages\py7zlib.py", line 608, in read
data = getattr(self, decoder)(coder, data, level)
File "C:\Anaconda\lib\site-packages\py7zlib.py", line 671, in _read_lzma
return self._read_from_decompressor(coder, dec, input, level, checkremaining=True, with_cache=True)
File "C:\Anaconda\lib\site-packages\py7zlib.py", line 646, in _read_from_decompressor
tmp = decompressor.decompress(data)
ValueError: data error during decompression

So, how can I extract uncompressed data from a 7z archive?

2 answers:

Answer 0 (score: 2)

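I could not work out what causes the error, but I found a workaround that achieves my actual goal of getting the CSV data out of the 7z archive. 7-Zip ships with a command-line tool. By driving that tool through the subprocess module, I can automatically extract the files I want without any problems.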

import subprocess
import py7zlib 

archiveman = r'c:\Program Files\7-zip\7z' # 7z.exe comes with 7-zip
archivepath = r'C:\Path\to\archive.7z'

with open(archivepath,'rb') as f:
    archive = py7zlib.Archive7z(f)
    names = archive.filenames
    for name in names:
        _ = subprocess.check_output([archiveman, 'e', archivepath, '-o{}'.format(r'C:\Destination\of\copy'), name])

The other commands that can be used with 7z are described in the 7-Zip command-line documentation.
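Since the end goal is to load the CSV contents into a database, the extracted copy on disk may not be needed at all. The following is a minimal Python 3 sketch of the same subprocess approach that asks 7z to write one member to standard output (the `-so` switch) and parses it in memory. The helper names `build_stdout_command` and `read_csv_member` are hypothetical, and the UTF-8 default is an assumption about the CSV files.

```python
import csv
import io
import subprocess

# Path to the 7z executable, as used in the answer above.
SEVEN_ZIP = r"c:\Program Files\7-zip\7z"


def build_stdout_command(seven_zip, archive_path, member):
    """Build the 7z call that extracts one member to stdout (-so)."""
    return [seven_zip, "e", "-so", archive_path, member]


def read_csv_member(archive_path, member, seven_zip=SEVEN_ZIP, encoding="utf-8"):
    """Return the rows of one CSV member without writing it to disk.

    The encoding is an assumption; adjust it to match the actual files.
    """
    raw = subprocess.check_output(
        build_stdout_command(seven_zip, archive_path, member)
    )
    text = raw.decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))
```

Calling `read_csv_member(archivepath, name)` for each name in `archive.filenames` would yield the rows of every member, ready to be inserted into the database.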

Answer 1 (score: 1)

You could try another library, py7zr, which supports compression, decompression, encryption and decryption of 7zip archives. https://pypi.org/project/py7zr
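As a rough illustration of this suggestion, the sketch below uses py7zr (Python 3 only) to extract the CSV members to a directory and then iterates over their rows. The helper names `csv_members` and `extract_csv_rows` are hypothetical. Whether py7zr handles the particular archive from the question has not been verified here.

```python
import csv
import os


def csv_members(names):
    """Keep only member names ending in .csv (case-insensitive)."""
    return [n for n in names if n.lower().endswith(".csv")]


def extract_csv_rows(archive_path, dest_dir):
    """Extract the CSV members to dest_dir and yield (member_name, row) pairs."""
    # Third-party dependency (pip install py7zr); imported here so the
    # pure helper above remains usable without it.
    import py7zr

    with py7zr.SevenZipFile(archive_path, mode="r") as archive:
        targets = csv_members(archive.getnames())
        if targets:
            archive.extract(path=dest_dir, targets=targets)

    for name in targets:
        # File encoding is left at the platform default here (an assumption).
        with open(os.path.join(dest_dir, name), newline="") as handle:
            for row in csv.reader(handle):
                yield name, row
```

For example, `for name, row in extract_csv_rows(r'C:\Path\to\archive.7z', r'C:\Destination\of\copy'): ...` would visit each row, which could then be written to the database.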