Unexpected behavior when using multiprocessing.Pool in a for loop

Time: 2019-03-19 22:28:38

Tags: python python-2.7 python-multiprocessing pool

Here is my code:

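import multiprocessing as mp
import numpy as np

def foo(p):
    global i
    return p*i

global lower, upper
lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        dataset = np.linspace(1, 100, 100)
        agents = mp.cpu_count() - 1
        chunksize = 5
        pool = mp.Pool(processes=agents)
        result = pool.map(foo, dataset, chunksize)
        print(result)
        print(i)
        pool.close()
        pool.join()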

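The console prints the array [3, 6, 9, ..., 300] three times, with the integers 1, 2, 3 printed between the arrays. So the loop is iterating correctly from lower up to upper (upper excluded), but I expected it to print the array [1, 2, 3, ..., 100] first, then [2, 4, 6, ..., 200], and finally [3, 6, 9, ..., 300]. I don't understand why only the final value of i is passed to foo, which is then mapped three times.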

2 answers:

Answer 0: (score: 1)

When a new process is started, this is what it sees:

import multiprocessing as mp
import numpy as np

def foo(p):
    global i
    return p*i

global lower, upper
lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        # This part is not run, as
        # in a different process,
        # __name__ is set to '__mp_main__'
        pass
# i is now `upper - 1`, call `foo(p)` with the provided `p`

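Once that has run, the process is told to run foo. Because functions are pickled by reference (module and name) rather than by value, the new process has to run the whole script again to find out what foo is. This re-import happens when worker processes are spawned rather than forked, which is the case on Windows.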

So by the time the script has finished running in the worker, i is upper - 1, and foo always returns p * 3.
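The effect can be reproduced without starting any processes. The following is a minimal sketch, not the actual multiprocessing machinery: it executes a script shaped like the one above under the name '__mp_main__', roughly as a spawned worker would when it imports the main module. SCRIPT and simulate_worker_import are hypothetical names introduced only for illustration.

```python
# Source of a script shaped like the one in the question (pool code omitted).
SCRIPT = '''
def foo(p):
    return p * i  # reads the module-level i at call time

lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        raise RuntimeError("pool code would run here, but only in the parent")
'''


def simulate_worker_import(source):
    """Execute the script under a non-main name, as a spawned worker would."""
    namespace = {"__name__": "__mp_main__"}
    exec(source, namespace)
    return namespace


worker = simulate_worker_import(SCRIPT)
print(worker["i"])        # 3, i.e. upper - 1: the loop ran to completion
print(worker["foo"](10))  # 30: foo multiplies by the final i
```

The guarded block never runs in the simulated worker, so the loop simply finishes and leaves i at its last value before foo is ever called.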

You want to make i an argument of foo, or use one of the multiprocessing-specific shared-memory objects described here.
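As one possible reading of the shared-memory suggestion, the sketch below hands a multiprocessing.Value to every worker through the pool's initializer and updates it from the parent before each map call. This is an assumption about what the answer had in mind, not something it spells out; init_worker, _factor, the pool size of 2, and the small input list are all hypothetical choices made for illustration.

```python
import multiprocessing as mp

_factor = None  # set in each worker process by init_worker


def init_worker(shared_factor):
    """Pool initializer: remember the shared object in this worker."""
    global _factor
    _factor = shared_factor


def foo(p):
    # Read the current multiplier from the shared object at call time.
    return p * _factor.value


if __name__ == '__main__':
    factor = mp.Value('i', 0)  # shared C int, created in the parent
    pool = mp.Pool(processes=2, initializer=init_worker, initargs=(factor,))
    try:
        for i in range(1, 4):
            factor.value = i  # pool.map blocks, so no task sees a later value
            print(i, pool.map(foo, [1, 2, 3]))
    finally:
        pool.close()
        pool.join()
```

Passing i as an ordinary argument is usually simpler. A shared object mainly pays off when the value is large or has to change while the workers stay alive.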

Answer 1: (score: 1)

Making i local and using functools.partial may solve your problem:

import multiprocessing as mp
import numpy as np
import functools

def foo(p,i):
    return p*i

global lower, upper
lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        dataset = np.linspace(1, 100, 100)
        agents = mp.cpu_count() - 1
        chunksize = 5
        pool = mp.Pool(processes=agents)
        # Bind i by keyword so that each dataset item is passed as p
        foo2 = functools.partial(foo, i=i)
        result = pool.map(foo2, dataset, chunksize)
        print(result)
        print(i)
        pool.close()
        pool.join()
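A variation on the code above, offered as a sketch rather than as the answerer's method: the __main__ guard is moved outside the loop and a single pool is reused for every value of i, so worker processes are started only once. The names scale and main, the pool size of 2, and the use of a plain list in place of np.linspace are assumptions made to keep the example small.

```python
import functools
import multiprocessing as mp


def foo(p, i):
    return p * i


def scale(pool, dataset, i, chunksize=5):
    """Multiply every item of dataset by i using the given pool."""
    return pool.map(functools.partial(foo, i=i), dataset, chunksize)


def main():
    lower, upper = 1, 4
    dataset = list(range(1, 101))
    pool = mp.Pool(processes=2)
    try:
        for i in range(lower, upper):
            result = scale(pool, dataset, i)
            print(i, result[:3], result[-1])
    finally:
        # Release the workers even if one of the map calls raises.
        pool.close()
        pool.join()


if __name__ == '__main__':
    main()
```

Because i travels with each task inside the partial object, the result no longer depends on what value the module-level loop variable happens to hold in a worker.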