Question

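In the code below, I get an error along the lines of "Can't get attribute 'f' on module '__main__'". I know how to fix it: move both the pool line and the result line to just above result2.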

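My question is why the code does not work in its current form. I am working on more complicated code in which I have to use parallel processing inside two separate for loops. Right now I have pool = mp.Pool(3) inside every iteration of each for loop. I read online that this is bad, because every iteration creates more Pool "workers". How can I put pool = mp.Pool(3) outside the iterations and then use the same Pool workers in all the different places in my code where I need them?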

For the record, I am running my code on a Mac.

import numpy as np
import multiprocessing as mp

x = np.array([1,2,3,4,5,6])

pool = mp.Pool(3)

def f(x):
    return x**2

result = pool.map(f,x)

def g(x):
    return x + 1

result2 = pool.map(g,x)
print('result=',result,'and result2=',result2)

Answer 1

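When child processes are created with the "fork" start method (the default on macOS before Python 3.8; since 3.8 the default there is "spawn"), the workers are forked (essentially copied) from the parent at the moment the Pool is created. In your code, that means the forked children have not yet executed the definition of f; they are already sitting idle, waiting for tasks from the main process. A task sent through pool.map refers to the function by name, so when a worker tries to look up f in its own copy of the main module, the name does not exist there and the lookup fails.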
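
To check which case applies on a given machine, you can ask multiprocessing which start method it will use. This is a minimal sketch using only the standard library; describe_start_method is a hypothetical helper name, not something from the code above:

```python
import multiprocessing as mp


def describe_start_method():
    """Return the name of the start method new worker processes will use."""
    # With the default arguments this also fixes the start method to the
    # platform default if none has been chosen yet.
    return mp.get_start_method()


if __name__ == '__main__':
    print('start method:', describe_start_method())
    print('available on this platform:', mp.get_all_start_methods())
```

If this reports fork, the workers are copies of the parent as it was when the Pool was created. If it reports spawn, each worker starts a fresh interpreter and re-imports the main module, which is why the pool has to be created under an if __name__ == '__main__': guard in that case.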

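First of all, you should not execute "active" code directly at the top level of a script (anything other than defining functions, classes, and constants); move it into a function instead. Your code would then look like this: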

import numpy as np
import multiprocessing as mp


def f(x):
    return x**2

def g(x):
    return x + 1

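def main():
    x = np.array([1,2,3,4,5,6])

    # Create the pool once, after f and g exist, and reuse it for every map call.
    # Leaving the with block shuts the workers down.
    with mp.Pool(3) as pool:
        result = pool.map(f,x)
        result2 = pool.map(g,x)

    print('result=',result,'and result2=',result2)

# Should be nearly the only "active" statement. The guard keeps worker
# processes from re-running main() under the "spawn" start method.
if __name__ == '__main__':
    main()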

I think this may be an even better option:

import numpy as np
import multiprocessing as mp


def f(x):
    return x**2

def g(x):
    return x + 1

def proc_f():
    global x, pool
    return pool.map(f,x)

def proc_g():
    global x, pool
    return pool.map(g,x)

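def main():
    global x, pool
    x = np.array([1,2,3,4,5,6])

    # One pool, created once and shared by proc_f and proc_g.
    pool = mp.Pool(3)

    result = proc_f()
    result2 = proc_g()
    print('result=',result,'and result2=',result2)

    # Shut the workers down once all parallel work is finished.
    pool.close()
    pool.join()

# Should be nearly the only "active" statement
if __name__ == '__main__':
    main()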
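
For the case described in the question, with two separate for loops that both need parallel work, one possible arrangement is to create the pool once and pass it into each loop as an argument instead of using globals. This is only a sketch under assumptions: square, add_one, first_loop, second_loop, and batches are hypothetical names standing in for the real workload, and plain lists are used in place of the NumPy array.

```python
import multiprocessing as mp


def square(v):
    return v ** 2


def add_one(v):
    return v + 1


def first_loop(pool, batches):
    # Hypothetical first for loop: every iteration submits work to the
    # same pool, so no new workers are created per iteration.
    return [pool.map(square, batch) for batch in batches]


def second_loop(pool, batches):
    # Hypothetical second for loop, reusing the very same workers.
    return [pool.map(add_one, batch) for batch in batches]


def main():
    batches = [[1, 2, 3], [4, 5, 6]]  # illustrative data only

    # Created once, after the worker functions are defined.
    with mp.Pool(3) as pool:
        squares = first_loop(pool, batches)
        incremented = second_loop(pool, squares)

    print('squares=', squares, 'and incremented=', incremented)


if __name__ == '__main__':
    main()
```

Passing the pool explicitly keeps its lifetime visible in one place (the with block in main) and makes the loop functions easy to exercise with any object that offers a compatible map method.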

How do I reuse Pool workers in multiprocessing code?
