From a script

Time: 2018-01-18 02:38:35

Tags: python windows multiprocessing

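My module doubles as a script. It calls some internally defined functions that use multiprocessing.

Running the module as a script works fine on both Windows and Linux. Calling its main function from another Python script works on Linux but fails on Windows.

When my module calls the start() method of a multiprocessing.Process, the core multiprocessing function (the one passed as target to the Process constructor) never gets executed.

The module must be asking too much of this usage (multiprocessing on Windows when called from a script), but how can I get to the root of this problem?

Here is some sample code that demonstrates the behavior. First, the module: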

# -*- coding: utf-8 -*-
'my_mp_module.py'

import argparse
import itertools
import Queue
import multiprocessing


def meaty_function(**kwargs):
    'Do a meaty calculation using multiprocessing'

    task_values = kwargs['task_values']

    # Set up a queue of tasks to perform, one for each element in the task_values array
    in_queue  = multiprocessing.Queue()
    out_queue = multiprocessing.Queue()
    reduce(lambda a, b: a or b,
           itertools.imap(in_queue.put, enumerate(task_values)))

    core_procargs=(
                    in_queue ,
                    out_queue,
                    )
    core_processes = [multiprocessing.Process(target=_core_function,
                                              args=core_procargs) for ii in xrange(len(task_values))]
    for p in core_processes:
        p.daemon = True # I've tried both ways, setting this to True and False
        p.start()

    sum_of_results = 0
    for result_count in xrange(len(task_values)):
        a_result = out_queue.get(block=True)
        sum_of_results += a_result
    for p in core_processes:
        p.join()

    return sum_of_results


def _core_function(inp_queue, out_queue):
    'Perform the core calculation for each task in the input queue, placing the results in the output queue'
    while 1:
        try:
            task_idx, task_value = inp_queue.get(block=False)
            # Perform a calculation with this task value.
            task_result = task_idx + task_value # The real calculation is more complicated than this
            out_queue.put(task_result)

        except Queue.Empty:
            break


def get_command_line_arguments(command_line=None):
    'parse the given command_line (list of strings) or from sys.argv, return the corresponding argparse.Namespace object'
    aparse = argparse.ArgumentParser(description=__doc__)
    aparse.add_argument('--task_values', '-t',
                        action='append',
                        type=int,
                        help='''The value for each task to perform.''')
    return aparse.parse_args(args=command_line)


def main(command_line=None):
    'perform a meaty calculation with the input from the command line, and print the results'

    # collect input from the command line
    args=get_command_line_arguments(command_line)
    keywords = vars(args)

    # perform a meaty calculation with the input
    meaty_results = meaty_function(**keywords)

    # display the results
    print(meaty_results)


if __name__ == '__main__':
    multiprocessing.freeze_support()
    main(command_line=None)

Now the script that calls the module:

# -*- coding: utf-8 -*-
'my_mp_script.py:'

import my_mp_module
import multiprocessing

multiprocessing.freeze_support()
my_mp_module.main(command_line=None)

Running the module as a script produces the expected result:

C:\Users\greg>python -m my_mp_module  -t 0 -t 1 -t 2
6

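But running another script that only calls the module's main() function produces an error message on Windows (I have removed the copies of the message that were repeated by the other processes):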

C:\Users\greg>python my_mp_script.py  -t 0 -t 1 -t 2
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 510, in prepare
    '__parents_main__', file, path_name, etc
  File "C:\Users\greg\Documents\PythonCode\Scripts\my_mp_script.py", line 7, in <module>
    my_mp_module.main(command_line=None)
  File "C:\Users\greg\Documents\PythonCode\Lib\my_mp_module.py", line 72, in main
    meaty_results = meaty_function(**keywords)
  File "C:\Users\greg\Documents\PythonCode\Lib\my_mp_module.py", line 28, in meaty_function
    p.start()
  File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\process.py", line 130, in start
    self._popen = Popen(self)
  File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 258, in __init__
    cmd = get_command_line() + [rhandle]
  File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 358, in get_command_line
    is not going to be frozen to produce a Windows executable.''')
RuntimeError:
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.

            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:

                if __name__ == '__main__':
                    freeze_support()
                    ...

            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

1 answer:

Answer 0: (score: 0)

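Linux and Windows create additional processes in slightly different ways. Linux forks the running process, whereas Windows starts a new Python interpreter to run the spawned process. The effect is that all of your code is loaded again in the child, as if for the first time. Because my_mp_script.py calls my_mp_module.main() at module level, each child runs that call again while it is still starting up and tries to start processes of its own, which is what raises the RuntimeError. There is a similar question that may be informative to look at: How to stop multiprocessing in python running for the full script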
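
The reloading can be illustrated without starting any processes. The following is a minimal sketch, not the actual multiprocessing loader: it assumes a hypothetical stand-in for my_mp_script.py (SCRIPT_SOURCE) and uses runpy.run_path to execute it under two module names. The name `__parents_main__` is taken from the traceback above, where forking.py loads the parent's script in the child; `load_as` and `executed` are names invented for this illustration.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for my_mp_script.py: one unguarded statement
# and one statement protected by the __main__ guard.
SCRIPT_SOURCE = '''
executed = ["unguarded"]          # module-level code: runs on every load
if __name__ == "__main__":
    executed.append("guarded")    # runs only when started as the main program
'''


def load_as(run_name):
    """Execute the stand-in script under the given module name and report what ran."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(SCRIPT_SOURCE)
        # run_path returns the globals of the executed script as a dict.
        module_globals = runpy.run_path(path, run_name=run_name)
        return module_globals["executed"]
    finally:
        os.remove(path)


# How the parent sees the script when it is started from the command line.
as_parent = load_as("__main__")
# Roughly how a Windows child sees it: same source, different module name.
as_child = load_as("__parents_main__")

print(as_parent)  # ['unguarded', 'guarded']
print(as_child)   # ['unguarded']
```

Only the unguarded statement runs on the second load, so anything that starts processes belongs behind the guard, where the reloaded copy in the child will skip it.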

The solution here is to modify the my_mp_script.py script so that the call to my_mp_module.main() is guarded.

import my_mp_module
import multiprocessing

if __name__ == '__main__':
    my_mp_module.main(command_line=None)

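Note that I have also removed the freeze_support() call. It can be put back if needed, as the first statement inside the `if __name__ == '__main__':` block.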