How can I report, every five seconds, how much of the file has been processed? I think I need a thread, but how would I control it?
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import os
    import logging
    import hashlib

    FORMAT = "%(asctime)s %(levelname)s: %(message)s"
    logging.basicConfig(format=FORMAT, level=logging.DEBUG, datefmt="%H:%M:%S")
    logger = logging.getLogger()


    class fileScanner:
        readBytes = 0
        lastReadBytes = 0
        fileSize = 0
        reportSeconds = 5

        def scanFile(self, filePath):
            """Return the SHA-512 hex digest of the file at filePath."""
            self.readBytes = 0
            self.lastReadBytes = 0
            self.fileSize = os.path.getsize(filePath)
            # open() itself raises IOError if the file cannot be read.
            with open(filePath, 'rb') as f:
                m = hashlib.sha512()
                while True:
                    data = f.read(1024)
                    if not data:
                        break
                    self.readBytes += len(data)
                    m.update(data)
                return m.hexdigest()

        def reportProcess(self):
            """Log progress since the previous call, assumed to be reportSeconds ago."""
            # float() on one operand avoids integer division under Python 2.
            percent = 100 * float(self.readBytes) / self.fileSize
            secAvg = float(self.readBytes - self.lastReadBytes) / self.reportSeconds
            estimatedTime = (self.fileSize - self.readBytes) / secAvg
            logger.info("%s%% (%s / %s bytes) read in average of %s bytes / sec. Estimated time left: %s seconds." % (percent, self.readBytes, self.fileSize, secAvg, estimatedTime))
            self.lastReadBytes = self.readBytes


    if __name__ == "__main__":
        fs = fileScanner()
        hash = fs.scanFile('largefile.dat')
How do I start and stop reportProcess()?
Yes, I know the calculations in there may be wrong.
Answer 0 (score: 1)
You can call reportProcess from inside the read loop once every 5 seconds, without a thread, for example:
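    # Requires `import time` at the top of the module.
    lastTime = time.time()
    while True:
        data = f.read(1024)
        if not data:
            break
        self.readBytes += len(data)
        m.update(data)
        # Report once more than 5 seconds have passed since the last report.
        if time.time() - lastTime > 5:
            self.reportProcess()
            lastTime = time.time()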
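For anyone who wants to try the idea outside the class, here is a minimal standalone sketch of the same time-check-in-the-loop approach. The function name `hash_with_progress` and its `report` and `clock` parameters are my own assumptions for illustration, not part of the question's code; `clock` defaults to `time.monotonic`, which I swapped in for `time.time` because it cannot jump backwards if the system clock is adjusted.

```python
import hashlib
import time


def hash_with_progress(path, report_seconds=5.0, chunk_size=1024,
                       report=print, clock=time.monotonic):
    """Return the SHA-512 hex digest of `path`.

    `report(bytes_read)` is called whenever at least `report_seconds`
    have elapsed since the previous report (hypothetical callback).
    """
    digest = hashlib.sha512()
    bytes_read = 0
    last_report = clock()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
            bytes_read += len(data)
            now = clock()
            if now - last_report >= report_seconds:
                report(bytes_read)
                last_report = now
    return digest.hexdigest()
```

Something like `hash_with_progress('largefile.dat')` would then print the running byte count roughly every five seconds while the file is being read. Note that the check only runs between reads, so a single slow read delays the report.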
Unrelated: why are you using class-level attributes? Usually they should be instance-level, e.g.:
    class FileScanner:
        def __init__(self):
            self.readBytes = 0
            self.lastReadBytes = 0
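To illustrate why instance-level attributes are the usual default, here is a small sketch with two hypothetical classes (`SharedState` and `OwnState` are made-up names, not from the question). The difference shows up with mutable values: a class-level list is one object shared by every instance. The integer counters in the question are rebound through `self` inside `scanFile`, so they probably do not hit this problem in practice, but the instance-level form makes the intent explicit.

```python
class SharedState:
    history = []  # class-level: a single list shared by all instances

    def record(self, value):
        self.history.append(value)  # mutates the shared list


class OwnState:
    def __init__(self):
        self.history = []  # instance-level: a fresh list per object

    def record(self, value):
        self.history.append(value)


shared_a, shared_b = SharedState(), SharedState()
shared_a.record(1)  # shared_b.history is now [1] as well

own_a, own_b = OwnState(), OwnState()
own_a.record(1)     # own_b.history is still []
```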
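Answer 1 (score: 0)
Could you call reportProcess() inside the while loop of the scanFile() function? For example, call reportProcess() every x bytes read (by adding a condition inside the while loop). Would that solve your problem?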
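A minimal sketch of that byte-count condition, assuming a hypothetical standalone function `hash_reporting_every` with a `report` callback (neither name comes from the question):

```python
import hashlib


def hash_reporting_every(path, report_every=1024 * 1024, chunk_size=1024,
                         report=print):
    """Return the SHA-512 hex digest of `path`, calling
    `report(bytes_read)` each time another `report_every` bytes
    have been read (hypothetical callback)."""
    digest = hashlib.sha512()
    bytes_read = 0
    next_report = report_every
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
            bytes_read += len(data)
            if bytes_read >= next_report:
                report(bytes_read)
                # Skip ahead in case one chunk crossed several thresholds.
                while next_report <= bytes_read:
                    next_report += report_every
    return digest.hexdigest()
```

Keep in mind that this reports per amount of data rather than per unit of time, so the interval between reports depends on read throughput and will not be a steady five seconds.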