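I have many zip files in a directory, and I want to count the files inside each zip file, grouped by file format. For example, for the zip file "nature.zip" I would like the following output: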
file_name file_format
nature jpg 2, png 1
So far I have managed to print the contents, but I don't know how to go further:
from zipfile import ZipFile
import os

directory = r"C:\Users\Lenovo\data_2"
for folder, subfolders, files in os.walk(directory):
    for file in files:
        if file.endswith(".zip"):
            # open the zip file in read mode; join with `folder` rather than
            # `directory` so archives found in subfolders resolve correctly
            with ZipFile(os.path.join(folder, file), 'r') as archive:
                # print a listing of the zip file's contents
                archive.printdir()
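Many thanks.
Answer 0 (score: 2)
Here is an example. It groups the files inside each zip by extension in a dictionary and prints the result. Adapt it to your situation as needed.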
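# Filegroup.py
from zipfile import ZipFile
from glob import glob

# single-string print calls behave the same under Python 2 and Python 3
print("file_name\tfile_format")
for zips in glob('*.zip'):
    with ZipFile(zips) as archive:
        files = archive.namelist()
        filecounts = {}
        for file in files:
            # the text after the last dot is treated as the extension
            ext = file.split('.')[-1]
            if ext in filecounts:
                filecounts[ext] += 1
            else:
                filecounts[ext] = 1
        # one line per archive; the order of the groups follows dict ordering
        print(archive.filename + '\t\t' + ', '.join(' '.join(map(str, elem)) for elem in filecounts.items()))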
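Note that `split('.')[-1]` counts a name without a dot (such as `README`) under its whole name, and it also counts directory entries like `docs/`. If that matters for your archives, the counting step could be tightened along these lines. This is only a sketch, assuming Python 3; `count_extensions` and `format_counts` are hypothetical helper names of my own, not part of `zipfile`, and the `"(none)"` label is an arbitrary choice.

```python
import posixpath
from collections import Counter


def count_extensions(names):
    """Count zip member names by lowercase extension (hypothetical helper)."""
    counts = Counter()
    for name in names:
        if name.endswith("/"):
            continue  # directory entries in a zip end with a slash
        # zip member names use forward slashes, hence posixpath
        ext = posixpath.splitext(name)[1].lstrip(".").lower()
        counts[ext or "(none)"] += 1  # "(none)" is an assumed label
    return counts


def format_counts(counts):
    """Render counts as 'ext n, ext n', sorted by extension name."""
    return ", ".join("{} {}".format(ext, n) for ext, n in sorted(counts.items()))
```

You would pass it the list returned by `namelist()`, for example `format_counts(count_extensions(archive.namelist()))`.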
Test:
$ zipinfo -1 A.zip
a.txt
b.txt
c.jpg
k.png
$ zipinfo -1 B.zip
g.md
h.txt
e.png
f.png
d.jpg
$ python Filegroup.py
file_name file_format
A.zip txt 2, png 1, jpg 1
B.zip md 1, txt 1, jpg 1, png 2
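If you need the layout from the question (archive name without `.zip`, and zip files found in subfolders via `os.walk`), something like the following might work. Treat it as a rough sketch rather than a tested solution for your data: it assumes Python 3.6+ (for `ZipInfo.is_dir()`), and `summarize_zips` and `print_report` are hypothetical names I made up for illustration.

```python
import os
from collections import Counter
from zipfile import ZipFile


def summarize_zips(directory):
    """Yield (archive name without .zip, Counter of extensions) per zip found."""
    for folder, _subfolders, files in os.walk(directory):
        for file in sorted(files):
            if not file.lower().endswith(".zip"):
                continue
            with ZipFile(os.path.join(folder, file)) as archive:
                counts = Counter(
                    # lowercase extension without the dot; "(none)" is an assumed label
                    os.path.splitext(info.filename)[1].lstrip(".").lower() or "(none)"
                    for info in archive.infolist()
                    if not info.is_dir()  # skip directory entries
                )
            yield os.path.splitext(file)[0], counts


def print_report(directory):
    """Print one line per archive, most frequent extension first."""
    print("file_name\tfile_format")
    for name, counts in summarize_zips(directory):
        formats = ", ".join("{} {}".format(ext, n) for ext, n in counts.most_common())
        print("{}\t{}".format(name, formats))
```

Calling `print_report(r"C:\Users\Lenovo\data_2")` would then print a header line followed by one line per archive. Using `Counter` replaces the manual `if ext in filecounts` bookkeeping, and `most_common()` lists the most frequent extension first.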