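I am writing a program that calculates the checksum of a list of files and then compares it against a reference file.
I am trying to convert the bytes buffer in the hashfile method to a file size in the same unit that os.stat(path).st_size uses, so that I can update the tqdm progress bar accordingly. (I am trying to implement the last example here.)
I have tried a lot of things, but nothing has worked so far:
- len(buf): gives me a processed size far larger than the total
- int.from_bytes(): OverflowError, int too large to convert to float
- struct.unpack_from(buf): needs to read one byte at a time
- various other functions that convert bytes
It seems I do not understand bytes well enough to know what to search for, or how to implement the solutions I find.
Here is an excerpt of the code: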
import hashlib
import os
from tqdm import tqdm
# calculate total size to process
self.assets_size += os.stat(os.path.join(root, f)).st_size
def hashfile(self, progress, afile, hasher, blocksize=65536):
    """
    Checksum buffer
    :param progress: progress bar object
    :param afile: file to process
    :param hasher: checksum algorithm
    :param blocksize: size of the buffer
    :return: hash digest
    """
    buf = afile.read(blocksize)
    while len(buf) > 0:
        self.processed_size += buf  # need to convert from bytes to file size
        hasher.update(buf)
        progress.update(self.processed_size)  # tqdm update
        buf = afile.read(blocksize)
    afile.seek(0)
    return hasher.digest()
def process_file(self, progress, fichier):
    """
    Checks if the file is in the reference dictionary;
    If so, checks if the size of the file matches the one stored in the dictionary;
    If so, calculates the checksum of the file and compares it to the one in the dictionary
    :param progress: progress bar object
    :param fichier: asset file to process
    :return: string outcome of the process
    """
    checksum = self.hashfile(progress, open(fichier, 'rb'), hashlib.sha1())
    # check if checksum matches
    return outcome
def main_process(self):
    """
    Launches and monitors the process and writes a report of the results
    :return: application end
    """
    with tqdm(total=self.assets_size, unit='B', unit_scale=True) as pbar:
        all_results = []
        for f in self.assets.keys():
            results = self.process_file(pbar, f)
            all_results.append(results)
    for r in all_results:
        print(r)
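Answer 0 (score: 0)
Thanks to @RadosławCybulski for finding the solution. I now understand how the tqdm.update() function works: it does not set the progress to the value passed as its argument, it adds that value to the current progress. So the bar needs the size of each chunk, len(buf), rather than a running total. I updated the hashfile method like this: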
while len(buf) > 0:
    hasher.update(buf)
    progress.update(len(buf))
    buf = afile.read(blocksize)
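
For anyone who wants to check the behaviour without the rest of my program, here is a minimal self-contained sketch of the same loop. The function name hashfile_with_callback and the on_chunk parameter are made up for this illustration and are not part of my original code. The callback stands in for the progress bar so the example runs without tqdm. Note that len(buf) on a bytes object is a number of bytes, which is the same unit as os.stat(path).st_size.

```python
import hashlib
import io


def hashfile_with_callback(afile, hasher, on_chunk, blocksize=65536):
    """Hash a binary file object chunk by chunk.

    on_chunk (hypothetical name) is called with the size in bytes of
    each chunk read, i.e. an increment and not a running total.
    """
    buf = afile.read(blocksize)
    while len(buf) > 0:
        hasher.update(buf)
        on_chunk(len(buf))  # report only what was just read
        buf = afile.read(blocksize)
    return hasher.hexdigest()


# Demo on an in-memory "file"; the list stands in for the progress bar.
data = b"x" * 150000
increments = []
digest = hashfile_with_callback(io.BytesIO(data), hashlib.sha1(), increments.append)

print(increments)       # [65536, 65536, 18928]
print(sum(increments))  # 150000, equal to len(data)
```

With a real progress bar, passing pbar.update as on_chunk should give the same effect as the loop above, provided the bar was created with total set to the summed st_size of the files. Passing a running total instead would make the bar overshoot, because each call would add everything read so far once more.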