I am extracting a tar file with the following code:
import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()
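However, I would like to track the progress, i.e. see which file is being extracted at any given moment. How can I do that?
Bonus points: is it also possible to get a percentage for the extraction process? I would like to use it to update a tkinter progress bar. Thanks!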
Answer 0 (score: 10)
Both per-file progress and overall progress:
import io
import os
import tarfile

# Note: the per-file hook relies on tarfile internals (TarFile.fileobject,
# ExFileObject.position) as they exist in Python 2.

def get_file_progress_file_object_class(on_progress):
    # Build an ExFileObject subclass that reports every read of a member.
    class FileProgressFileObject(tarfile.ExFileObject):
        def read(self, size, *args):
            on_progress(self.name, self.position, self.size)
            return tarfile.ExFileObject.read(self, size, *args)
    return FileProgressFileObject

class ProgressFileObject(io.FileIO):
    # Wraps the archive file itself to report overall progress.
    def __init__(self, path, *args, **kwargs):
        self._total_size = os.path.getsize(path)
        io.FileIO.__init__(self, path, *args, **kwargs)

    def read(self, size):
        print("Overall process: %d of %d" % (self.tell(), self._total_size))
        return io.FileIO.read(self, size)

def on_progress(filename, position, total_size):
    print("%s: %d of %s" % (filename, position, total_size))

tarfile.TarFile.fileobject = get_file_progress_file_object_class(on_progress)
tar = tarfile.open(fileobj=ProgressFileObject("a.tgz"))
tar.extractall()
tar.close()
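The per-file hook above depends on tarfile internals and may not fire on Python 3, but the overall-progress idea carries over. The following is a minimal sketch, not the answer author's code: `PercentFileObject`, `extract_with_progress`, and the `on_percent` callback are hypothetical names introduced here. It reports progress as the percentage of the compressed archive that has been read so far.

```python
import io
import os
import tarfile


class PercentFileObject(io.FileIO):
    """Archive file object that reports how much of the file has been read."""

    def __init__(self, path, on_percent):
        super().__init__(path)  # default mode is read-only
        self._total_size = os.path.getsize(path)
        self._on_percent = on_percent  # hypothetical callback: on_percent(float)

    def read(self, size=-1):
        data = super().read(size)
        if self._total_size:
            # tell() is the offset inside the (possibly compressed) archive,
            # so this measures bytes consumed, not bytes written to disk.
            self._on_percent(100.0 * self.tell() / self._total_size)
        return data


def extract_with_progress(archive_path, dest, on_percent):
    """Extract archive_path into dest, calling on_percent as the archive is read."""
    with PercentFileObject(archive_path, on_percent) as raw:
        with tarfile.open(fileobj=raw) as tar:
            tar.extractall(path=dest)
```

Because `tarfile.open` may probe the stream before settling on a format, early values can repeat or step backwards; a progress bar would typically keep the running maximum.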
Answer 1 (score: 4)
You can pass a generator as the members parameter of extractall():
def track_progress(members):
    for member in members:
        # this will be the current file being extracted
        yield member

with tarfile.open(<path>, 'r') as tarball:
    tarball.extractall(path=<some path>, members=track_progress(tarball))
Each member is a TarInfo object; see the tarfile documentation for all available methods and attributes.
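To turn the generator into something a progress bar can consume, it can call back with a running count. This is a sketch under assumptions: the `on_member` callback and the `extract_all` helper are hypothetical names, not part of the answer or of tarfile.

```python
import tarfile


def track_progress(tar, on_member):
    """Yield each member of an open TarFile, reporting it before extraction."""
    members = tar.getmembers()
    total = len(members)
    for index, member in enumerate(members, start=1):
        # extractall() pulls members from this generator one at a time,
        # so the callback runs just before the member is extracted.
        on_member(member.name, index, total)
        yield member


def extract_all(archive_path, dest, on_member):
    """Extract every member of archive_path into dest with progress callbacks."""
    with tarfile.open(archive_path) as tar:
        tar.extractall(path=dest, members=track_progress(tar, on_member))
```

Inside the callback, `100.0 * index / total` gives a count-based percentage; it treats every member as equal regardless of size.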
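Answer 2 (score: 3)
Answer 3 (score: 2)
Here is a neat solution that wraps the tarfile module as a drop-in replacement and lets you specify a callback to receive progress updates.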
https://github.com/thomaspurchas/tarfile-Progress-Reporter/
Updated based on the comments.
Answer 4 (score: 1)
To see which file is currently being extracted, the following worked for me:
import tarfile

print("Extracting the contents of sample.tar.gz:")
tar = tarfile.open("sample.tar.gz")
for member_info in tar.getmembers():
    # report each member just before it is extracted
    print("- extracting: " + member_info.name)
    tar.extract(member_info)
tar.close()
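The same loop can also produce the percentage asked for in the question. The sketch below weights progress by each member's size; `extract_with_percent` and the `on_percent` callback are hypothetical names used only for illustration.

```python
import tarfile


def extract_with_percent(archive_path, dest, on_percent):
    """Extract members one by one, reporting the percentage of bytes done."""
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        # TarInfo.size is the size of the member's data in bytes;
        # fall back to 1 so an archive without data does not divide by zero.
        total_bytes = sum(m.size for m in members) or 1
        done = 0
        for member in members:
            tar.extract(member, path=dest)
            done += member.size
            on_percent(member.name, 100.0 * done / total_bytes)
```

In a tkinter program, `on_percent` could set the value of a `ttk.Progressbar` and then call `update_idletasks()` so the bar redraws while extraction continues.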
Answer 5 (score: 0)
You can simply use tqdm() and display progress in terms of the number of files extracted:
import tarfile
from tqdm import tqdm

path = "sample.tar.gz"  # archive name taken from the question

# open your tar.gz file
with tarfile.open(name=path) as tar:
    members = tar.getmembers()
    # Go over each member
    for member in tqdm(iterable=members, total=len(members)):
        # Extract member
        tar.extract(member=member)