我的模块也是一个脚本,它调用一些使用多处理的内部定义函数。
将模块作为脚本运行在Windows和Linux上运行正常。从另一个python脚本调用它的main函数在Linux上运行正常但在Windows上运行不正常。
当我的模块调用Process的multiprocessing.Process
函数时,核心的多处理函数(作为target
传递给start()
构造函数的函数)永远不会被执行。
该模块必须对此用法做太多要求(从脚本调用时在Windows上进行多处理),但是如何才能找到此问题的根源?
这是一些演示行为的示例代码。首先是模块:
# -*- coding: utf-8 -*-
'my_mp_module.py'
import argparse
import itertools
import Queue
import multiprocessing
def meaty_function(**kwargs):
'Do a meaty calculation using multiprocessing'
task_values = kwargs['task_values']
# Set up a queue of tasks to perform, one for each element in the task_values array
in_queue = multiprocessing.Queue()
out_queue = multiprocessing.Queue()
reduce(lambda a, b: a or b,
itertools.imap(in_queue.put, enumerate(task_values)))
core_procargs=(
in_queue ,
out_queue,
)
core_processes = [multiprocessing.Process(target=_core_function,
args=core_procargs) for ii in xrange(len(task_values))]
for p in core_processes:
p.daemon = True # I've tried both ways, setting this to True and False
p.start()
sum_of_results = 0
for result_count in xrange(len(task_values)):
a_result = out_queue.get(block=True)
sum_of_results += a_result
for p in core_processes:
p.join()
return sum_of_results
def _core_function(inp_queue, out_queue):
'Perform the core calculation for each task in the input queue, placing the results in the output queue'
while 1:
try:
task_idx, task_value = inp_queue.get(block=False)
# Perform a calculation with this task value.
task_result = task_idx + task_value # The real calculation is more complicated than this
out_queue.put(task_result)
except Queue.Empty:
break
def get_command_line_arguments(command_line=None):
'parse the given command_line (list of strings) or from sys.argv, return the corresponding argparse.Namespace object'
aparse = argparse.ArgumentParser(description=__doc__)
aparse.add_argument('--task_values', '-t',
action='append',
type=int,
help='''The value for each task to perform.''')
return aparse.parse_args(args=command_line)
def main(command_line=None):
'perform a meaty calculation with the input from the command line, and print the results'
# collect input from the command line
args=get_command_line_arguments(command_line)
keywords = vars(args)
# perform a meaty calculation with the input
meaty_results = meaty_function(**keywords)
# display the results
print(meaty_results)
if __name__ == '__main__':
multiprocessing.freeze_support()
main(command_line=None)
现在调用模块的脚本:
# -*- coding: utf-8 -*-
'my_mp_script.py:'
import my_mp_module
import multiprocessing
multiprocessing.freeze_support()
my_mp_module.main(command_line=None)
将模块作为脚本运行会产生预期的结果:
C:\Users\greg>python -m my_mp_module -t 0 -t 1 -t 2
6
但是运行另一个只调用模块main()
函数的脚本会在Windows下显示一条错误消息(这里我删除了从多个进程中复制的错误消息):
C:\Users\greg>python my_mp_script.py -t 0 -t 1 -t 2
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 380, in main
prepare(preparation_data)
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 510, in prepare
'__parents_main__', file, path_name, etc
File "C:\Users\greg\Documents\PythonCode\Scripts\my_mp_script.py", line 7, in <module>
my_mp_module.main(command_line=None)
File "C:\Users\greg\Documents\PythonCode\Lib\my_mp_module.py", line 72, in main
meaty_results = meaty_function(**keywords)
File "C:\Users\greg\Documents\PythonCode\Lib\my_mp_module.py", line 28, in meaty_function
p.start()
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 258, in __init__
cmd = get_command_line() + [rhandle]
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 358, in get_command_line
is not going to be frozen to produce a Windows executable.''')
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
答案 0 :(得分:0)
Linux和Windows在创建其他流程方面的工作方式略有不同。 Linux forks
代码但是Windows创建了一个新的Python解释器来运行生成的进程。这里的效果是所有代码都被重新加载,就像它是第一次一样。有一个类似的问题,可以提供信息,看看...... How to stop multiprocessing in python running for the full script。
此处的解决方案是修改my_mp_script.py
脚本,以便对my_mp_module.main()
的调用进行保护。
import my_mp_module
import multiprocessing
if __name__ == '__main__':
my_mp_module.main(command_line=None)
请注意,我现在还删除了freeze_support()
功能,但是如果需要,可以接受这些功能。