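I have a zip folder that contains files and child zip folders. I can read the files in the parent folder, but how do I access the files inside the child zip folders? Here is my code for reading a file in the parent folder: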
from io import BytesIO
import pandas as pd
import requests
import zipfile
url1 = 'https://www.someurl.com/abc.zip'
r = requests.get(url1)
z = zipfile.ZipFile(BytesIO(r.content))
temp = pd.read_csv(z.open('mno.csv'))
My question is: suppose I have a child zip folder
xyz.zip
that contains the file
pqr.csv
How do I read this file?
Answer 0: (score: 1)
Use another BytesIO object to open the contained zip file:
from io import BytesIO
import pandas as pd
import requests
import zipfile
# Read outer zip file
url1 = 'https://www.someurl.com/abc.zip'
r = requests.get(url1)
z = zipfile.ZipFile(BytesIO(r.content))
# let's say the archive contains:
# zippped_folder/pqr.zip (which contains pqr.csv)
# Read contained zip file; BytesIO needs bytes, so read the member's contents first
pqr_zip = zipfile.ZipFile(BytesIO(z.read('zippped_folder/pqr.zip')))
temp = pd.read_csv(pqr_zip.open('pqr.csv'))
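To try this approach without downloading anything, here is a minimal sketch that builds a small nested archive in memory and reads the inner CSV back. The helper names `build_nested_zip` and `read_nested_csv` and the sample data are made up for illustration, and the standard-library `csv` module stands in for pandas so the example has no third-party dependency; `pd.read_csv` should accept the same file object returned by `inner.open(...)`.

```python
import csv
import io
import zipfile


def build_nested_zip() -> bytes:
    """Build an outer zip holding mno.csv and zippped_folder/pqr.zip (hypothetical test data)."""
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w") as inner:
        inner.writestr("pqr.csv", "a,b\n1,2\n3,4\n")
    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w") as outer:
        outer.writestr("mno.csv", "x,y\n5,6\n")
        outer.writestr("zippped_folder/pqr.zip", inner_buf.getvalue())
    return outer_buf.getvalue()


def read_nested_csv(outer_bytes: bytes, inner_zip_name: str, csv_name: str) -> list:
    """Return the rows of csv_name stored in inner_zip_name inside the outer archive."""
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        # read() returns the member's raw bytes, which BytesIO can wrap
        inner_bytes = outer.read(inner_zip_name)
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        with inner.open(csv_name) as raw:
            # ZipFile.open yields a binary stream; csv.reader needs text
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text))


rows = read_nested_csv(build_nested_zip(), "zippped_folder/pqr.zip", "pqr.csv")
print(rows)
```

Reading the inner member fully into memory keeps the code simple; for very large child archives, extracting to a temporary file may be a better fit.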
Answer 1: (score: 1)
After trying a few permutations and combinations, I solved the problem with this code:
# i is the index position of the child zip folder ('xyz.zip' here) in the namelist() list.
# ZipFile needs a file-like object rather than a member name, so read the member's bytes first.
zz = zipfile.ZipFile(BytesIO(z.read(z.namelist()[i])))
temp2 = pd.read_csv(zz.open('pqr.csv'))
# e.g. if the 'xyz.zip' folder were third in the list, the command would be:
zz = zipfile.ZipFile(BytesIO(z.read(z.namelist()[2])))
Alternatively, if the index position is unknown, open the child archive by name:
zz = zipfile.ZipFile(BytesIO(z.read('xyz.zip')))
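When neither the index nor the exact name of the child archive is known ahead of time, one option is to scan `namelist()` for entries ending in `.zip` and open each one in turn. The following is a minimal sketch under that assumption; `list_nested_members` is a hypothetical helper, and the in-memory archive only stands in for the downloaded `abc.zip`.

```python
import io
import zipfile


def list_nested_members(outer: zipfile.ZipFile) -> dict:
    """Map each nested .zip member of `outer` to the file names it contains."""
    contents = {}
    for name in outer.namelist():
        if name.lower().endswith(".zip"):
            # Wrap the member's bytes so ZipFile can seek within them
            with zipfile.ZipFile(io.BytesIO(outer.read(name))) as inner:
                contents[name] = inner.namelist()
    return contents


# Hypothetical in-memory archive standing in for the downloaded abc.zip
inner_buf = io.BytesIO()
with zipfile.ZipFile(inner_buf, "w") as inner:
    inner.writestr("pqr.csv", "a,b\n1,2\n")
outer_buf = io.BytesIO()
with zipfile.ZipFile(outer_buf, "w") as outer:
    outer.writestr("mno.csv", "x,y\n5,6\n")
    outer.writestr("xyz.zip", inner_buf.getvalue())

z = zipfile.ZipFile(io.BytesIO(outer_buf.getvalue()))
nested = list_nested_members(z)
print(nested)
```

This only checks the file extension and looks one level deep; archives nested further down would need the same step applied again.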