Extracting files with specific extensions from many ZIP archives using Python

Time: 2017-09-07 13:04:39

Tags: python

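I'm new to Python and can't seem to get this to work. The code below finds ZIP files and extracts exactly what I want, but only as long as there is a single ZIP file in the folder. The problem seems to be that zipfile.ZipFile reads the return value of the first function as one big string rather than as individual paths, and I've run out of ideas on how to fix it.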

import zipfile
import os
import fnmatch


def archive1():
    myarchive = []
    rootPath= (r'E:\Test\2017')
    pattern = '*.zip'
    for root, dirs, files in os.walk(rootPath):
         for filename in fnmatch.filter(files,pattern):
            zipfile.ZipFile(os.path.join(root, filename))
            myarchive.append(os.path.join(root,filename))
    return str(myarchive).replace('[',"").replace(']',"").replace('"',"").replace("'","")
    #this is here so function returns as string and replace characters so second function reads it as applicable path(s).             

if __name__ == '__main__':
    archive1()

myarchive1 = archive1()

def extractor():
    new_dr = r'E:\Test'
    extensions = ('.txt','.pdf')
    zip_file = zipfile.ZipFile(myarchive1)
    print (zip_file)
    [zip_file.extract(file,new_dr) for file in zip_file.namelist() if file.endswith(extensions)]
    zip_file.close()

if __name__ == '__main__':
    extractor()

I get:

Traceback (most recent call last):
  File "e:\VSC_Folder\totalni_test.py", line 30, in <module>
    extractor()
  File "e:\VSC_Folder\totalni_test.py", line 24, in extractor
    zip_file = zipfile.ZipFile(myarchive1)
  File "C:\Users\Thiothixene\AppData\Local\Programs\Python\Python36-32\lib\zipfile.py", line 1090, in __init__
    self.fp = io.open(file, filemode)
OSError: [Errno 22] Invalid argument: 'E:\\\\Test\\\\2017\\\\test2.zip, E:\\\\Test\\\\2017\\\\ZG.zip'

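1 answer:

Answer 0 (score: 0)

Just pass the zipfile object to extractor as an argument. You shouldn't try to parse file paths out of the string representation of a list; that is most likely what's causing the problem. The traceback shows zipfile.ZipFile receiving both paths joined into a single string, which is not a valid file name. Try something like: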

import zipfile
import os
import fnmatch

def archive1():
    # Walk the tree and hand each ZIP archive to extractor() one at a time.
    root_path = r'E:\Test\2017'
    pattern = '*.zip'
    for root, dirs, files in os.walk(root_path):
        for filename in fnmatch.filter(files, pattern):
            with zipfile.ZipFile(os.path.join(root, filename)) as zf:
                extractor(zf)

def extractor(zip_file):
    # Extract only the members whose names end with a wanted extension.
    new_dr = r'E:\Test'
    extensions = ('.txt', '.pdf')
    for name in zip_file.namelist():
        if name.endswith(extensions):
            zip_file.extract(name, new_dr)

if __name__ == '__main__':
    archive1()
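
If you would rather keep the two steps separate, as in the question, one option is to have the first function return a real list of paths and loop over that list in the second. The following is only a sketch under that assumption: the names find_archives and extract_matching are made up for illustration, and the matching is case-sensitive, just like the code above.

```python
import fnmatch
import os
import zipfile


def find_archives(root_path, pattern='*.zip'):
    """Return the paths of all matching archives under root_path as a list."""
    archives = []
    for root, dirs, files in os.walk(root_path):
        for filename in fnmatch.filter(files, pattern):
            archives.append(os.path.join(root, filename))
    return archives


def extract_matching(archive_paths, dest_dir, extensions=('.txt', '.pdf')):
    """Extract members ending with one of extensions; return their names."""
    extracted = []
    # Open one ZipFile per path instead of passing all paths as one string.
    for archive_path in archive_paths:
        with zipfile.ZipFile(archive_path) as zf:
            for name in zf.namelist():
                if name.endswith(extensions):
                    zf.extract(name, dest_dir)
                    extracted.append(name)
    return extracted


if __name__ == '__main__':
    # Paths taken from the question; adjust them to your own layout.
    found = find_archives(r'E:\Test\2017')
    print(extract_matching(found, r'E:\Test'))
```

Because the paths stay in a list, each one reaches zipfile.ZipFile unchanged, and no str()/replace() workaround is needed.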