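Using Python 2.6: in my multiprocessing implementation, some worker processes become zombies before all of the listed files have been processed successfully. Most of the time this is harmless and only slows processing down, because the remaining workers can finish the job. But sometimes all of the workers become zombies, and the script stalls and stops iterating over any further directories.
I am iterating over a list of files for each directory in a list of directories, visiting one directory at a time, and I am using the multiprocessing module to cut down the processing time. However, a particular file sometimes cannot be processed because of complications in another program that I am not responsible for. To work around this, I added a TimeoutException class so that a file which does not finish within a set time is put back on a queue and reprocessed by another worker.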
import glob
import os
import signal
import time
from multiprocessing import Pool, Queue


def f_init(q):
    # Pool initializer: give every worker access to the shared retry queue.
    processMethod.q = q


class TimeoutException(Exception):
    pass


def handler(signum, frame):
    # SIGALRM handler: turn the alarm into an exception.
    raise TimeoutException()


def processMethod(f):
    limit = 84
    try:
        signal.signal(signal.SIGALRM, handler)
        signal.alarm(240)
        # {data processing}
        newfiles = len(glob.glob("*" + "fdate.jpg"))
        if newfiles < limit:
            time.sleep(240)
        return 1
    except TimeoutException:
        # Timed out: put the file back on the queue for another worker.
        processMethod.q.put(f)
        return None


def main(directory):
    total_items = len(directory)
    successful = []
    failure_tracker = []
    q = Queue()
    p = Pool(15, f_init, [q])
    results = p.imap(processMethod, directory)
    retry_results = []
    while len(successful) < total_items:
        successful.extend([r for r in results if r is not None])
        successful.extend([r for r in retry_results if r is not None])
        # Drain the queue of files that timed out and resubmit them.
        failed_items = []
        while not q.empty():
            failed_items.append(q.get())
        if failed_items:
            failure_tracker.append(failed_items)
            retry_results = p.imap(processMethod, failed_items)
    p.close()
    p.join()
    return


if __name__ == "__main__":
    directory = os.listdir("/sourcedir")
    main(directory)
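I don't understand what is causing the error. I expect that if any process takes longer than 240 seconds, it gets kicked back to main() and the file is added to "failed_items". So far, every failed file has eventually been processed correctly, but worker processes still hang from time to time. Below is an example of the traceback printed to the terminal when a given worker process becomes a zombie: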
Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/usr/lib64/python2.6/multiprocessing/queues.py", line 350, in get
    racquire()
  File "/my/home/dir/myScript.py", line 47, in handler
    raise TimeoutException()
TimeoutException
Sometimes the traceback is slightly different:
Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/usr/lib64/python2.6/multiprocessing/queues.py", line 352, in get
    return recv()
  File "/my/home/dir/myScript.py", line 47, in handler
    raise TimeoutException()
TimeoutException
Is this a problem with raising TimeoutException, or a problem with the pool/queue itself? Because the hangs are so sporadic, I am quite confused.
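For what it's worth, both tracebacks show the handler firing inside the pool's get(), that is, while the worker is waiting for its next task rather than inside processMethod. Below is a minimal, single-process sketch of what I suspect might be going on, under the assumption that an alarm armed with signal.alarm() stays pending after the function that armed it has returned. The names work and alarm_fires_after_return are hypothetical and exist only for this illustration, the 1-second limit stands in for my 240 seconds, and time.sleep() stands in for the idle worker blocking in get(). It needs a Unix-like system and must run in the main thread.

```python
import signal
import time


class TimeoutException(Exception):
    pass


def handler(signum, frame):
    raise TimeoutException()


def work(cancel_alarm):
    """Hypothetical stand-in for processMethod: arm an alarm, finish early."""
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(1)              # 1 second here instead of 240
    try:
        return 1                 # "processing" finishes well within the limit
    finally:
        if cancel_alarm:
            signal.alarm(0)      # alarm(0) cancels any pending alarm


def alarm_fires_after_return(cancel_alarm):
    """Return True if the alarm goes off after work() has already returned."""
    work(cancel_alarm)
    try:
        time.sleep(2)            # stand-in for the worker waiting in get()
    except TimeoutException:
        return True              # raised outside work(), so nothing catches it there
    return False


if __name__ == "__main__":
    print(alarm_fires_after_return(False))
    print(alarm_fires_after_return(True))
```

If that assumption holds for my script, the exception would be raised outside the try/except in processMethod, which would match the tracebacks; I have not confirmed that this is the actual cause of the hangs.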