This is my first SO post! Here goes!
I am trying to implement multiprocessing in Python (Python 3.6) using the `multiprocessing.Manager` queues, but I have found that although my code works functionally, it does not shut down the child processes.
def execute_parallel(self):
    """ Controlling function: spawns worker processes and loads the task queue,
    monitors the complete_processes queue and terminates once processing has finished
    """
    manager = multiprocessing.Manager()
    # Define a queue for tasks and one for completed tasks
    tasks = manager.Queue()
    complete_processes = manager.Queue()
    # Create process pool with the required number of processes
    num_processes = self.degree_of_parallel
    pool = multiprocessing.Pool(processes=num_processes)
    processes = []
    # Fill the task queue with all of the required tasks from the calling module
    queue_length = 0
    for single_task in self.task:
        tasks.put(single_task)
        queue_length += 1
    self.logger.info('Number of tasks in Queue : {}'.format(queue_length))
    self.logger.info('INFO: Tasks in Queue : {}'.format(tasks.qsize()))
    # Quit the worker processes by sending them -1, appended to the end of the queue.
    # One quit signal is required per parallel process.
    for i in range(num_processes):
        tasks.put(-1)
    for i in range(num_processes):
        p = multiprocessing.Process(target=self.workers,
                                    args=('P{}'.format(i), tasks, complete_processes),
                                    daemon=True)
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
My worker function is very simple: it pulls from the queue until the queue is empty. For testing purposes, it calls a simple function that sleeps for 10 seconds.
def workers(self, process_name, tasks, complete_processes):
    """ Worker function: executes the function, getting the task from the task queue
    and populating the result queue on completion.
    Breaks and kills the process when -1 is present on the task queue.
    """
    self.logger.info('{} Worker : Process Initiated'.format(process_name))
    try:
        while not tasks.empty():
            new_file = tasks.get()
            if new_file != -1:
                print('Parallel Process {} : PID is {}'.format(process_name, os.getpid()))
                self.logger.info('{} Worker : Started Processing {}'.format(process_name, os.path.basename(new_file[0])))
                self.sleep()
                complete_processes.put(new_file)  # Load the completed-tasks queue
                self.logger.info('{} Worker : Finished Processing {}'.format(process_name, os.path.basename(new_file[0])))
                print('Parallel Process {} : PID {} Finished'.format(process_name, os.getpid()))
            else:
                self.logger.info('{} Worker : Process Quits'.format(process_name))
                break
    except:
        self.logger.info('All Tasks Completed, Queue empty')
    return
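For reference, here is a stripped-down, self-contained sketch of the same pattern (Manager queues, one -1 sentinel per worker, daemon worker processes) outside my class; the names `worker`, `run`, and `do_work` are placeholders for this post, not my real code:

```python
import multiprocessing


def worker(name, tasks, done):
    # Pull tasks until the -1 sentinel arrives, then exit the process
    while True:
        item = tasks.get()
        if item == -1:
            break
        # do_work(item) would go here (self.sleep() in my real code)
        done.put(item)


def run(num_processes=4, num_tasks=10):
    manager = multiprocessing.Manager()
    tasks = manager.Queue()
    done = manager.Queue()
    for t in range(num_tasks):
        tasks.put(t)
    for _ in range(num_processes):
        tasks.put(-1)  # one quit signal per worker
    procs = [multiprocessing.Process(target=worker,
                                     args=('P{}'.format(i), tasks, done),
                                     daemon=True)
             for i in range(num_processes)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return done.qsize()


if __name__ == '__main__':
    print(run())  # all 10 tasks end up on the done queue
```

This version exits cleanly for me, which is what makes the leftover processes in the real code so puzzling.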
However, when monitoring the PIDs at the OS level, the number of processes keeps growing:
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
11
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
15
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
19
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
23
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
26
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
31
[root@632876de6086 /]# ps -ef | grep python3 | wc -l
35
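To cross-check the `ps` counts from inside Python, I have been using `multiprocessing.active_children()` (a side sketch, not part of my codebase): joined processes drop off the list, so anything that keeps accumulating there is a child the parent never reaped:

```python
import multiprocessing


def _noop():
    pass


def lingering_children(num=4):
    procs = [multiprocessing.Process(target=_noop, daemon=True) for _ in range(num)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Joined processes are removed from active_children(); any process
    # still reported here was created but never cleaned up.
    return len(multiprocessing.active_children())


if __name__ == '__main__':
    print(lingering_children())  # 0 when everything is joined
```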
In my test environment this is not really a problem, but in production it spawned thousands of child processes and eventually caused resource contention and memory issues.
Any help, criticism or otherwise would be greatly appreciated!