Question

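I need some way to use a function within pool.map() that accepts more than one parameter. As I understand it, the target function of pool.map() can only take a single iterable as a parameter, but is there a way to pass in other parameters as well? In this case, I need to pass a few configuration variables, such as my Lock() and logging information, to the target function.

I have done some research and I think I may be able to use partial functions to make this work, but I don't fully understand how they work. Any help would be greatly appreciated! Here is a simple example of what I want to do: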

def target(items, lock):
    for item in items:
        # Do cool stuff
        if (... some condition here ...):
            lock.acquire()
            # Write to stdout or logfile, etc.
            lock.release()

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    pool.map(target(PASS PARAMS HERE), iterable)
    pool.close()
    pool.join()

Answer 1

You can use functools.partial for this (as you suspected):

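import multiprocessing
from functools import partial

def target(lock, item):
    # pool.map calls this once per element of the iterable
    # Do cool stuff
    if (... some condition here ...):
        lock.acquire()
        # Write to stdout or logfile, etc.
        lock.release()

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    # A plain multiprocessing.Lock() cannot be pickled into pool tasks
    # (it raises RuntimeError); a Manager lock is a picklable proxy.
    m = multiprocessing.Manager()
    l = m.Lock()
    func = partial(target, l)
    pool.map(func, iterable)
    pool.close()
    pool.join()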

Example:

import multiprocessing
from functools import partial

def f(a, b, c):
    print("{} {} {}".format(a, b, c))

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    a = "hi"
    b = "there"
    func = partial(f, a, b)
    pool.map(func, iterable)
    pool.close()
    pool.join()

if __name__ == "__main__":
    main()

Output:

hi there 1
hi there 2
hi there 3
hi there 4
hi there 5
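
Note that partial binds positional arguments from the left, so the value supplied by pool.map fills the first parameter that is still free. If the item needs to come first, binding the other parameters by keyword is one option. A minimal sketch, where `scale`, `factor`, and `offset` are hypothetical names chosen only for illustration:

```python
import multiprocessing
from functools import partial

def scale(item, factor, offset):
    # item comes from the iterable; factor and offset are fixed up front
    return item * factor + offset

def main():
    # Bind the trailing parameters by keyword so the item stays first
    func = partial(scale, factor=10, offset=1)
    with multiprocessing.Pool(2) as pool:
        return pool.map(func, [1, 2, 3])

if __name__ == "__main__":
    print(main())  # [11, 21, 31]
```

The partial object stays picklable here because `scale` is defined at module level.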

Answer 2

You could use a map function that allows multiple arguments, as does the fork of multiprocessing found in pathos.

>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> 
>>> def add_and_subtract(x,y):
...   return x+y, x-y
... 
>>> res = Pool().map(add_and_subtract, range(0,20,2), range(-5,5,1))
>>> res
[(-5, 5), (-2, 6), (1, 7), (4, 8), (7, 9), (10, 10), (13, 11), (16, 12), (19, 13), (22, 14)]
>>> Pool().map(add_and_subtract, *zip(*res))
[(0, -10), (4, -8), (8, -6), (12, -4), (16, -2), (20, 0), (24, 2), (28, 4), (32, 6), (36, 8)]
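
The calling convention above mirrors Python's built-in map, which also accepts one iterable per function parameter, and `*zip(*res)` transposes the list of result pairs back into two argument sequences. A serial sketch using only built-ins, offered as an illustration of that convention rather than of pathos itself (the shorter ranges are chosen just to keep the values easy to follow):

```python
def add_and_subtract(x, y):
    return x + y, x - y

# Built-in map pairs up elements positionally: (0, -1), (2, 0), (4, 1)
res = list(map(add_and_subtract, range(0, 6, 2), range(-1, 2)))

# zip(*res) transposes [(sum, diff), ...] into (sums...), (diffs...)
sums, diffs = zip(*res)

# Feeding both sequences back in is what *zip(*res) does in one step
again = list(map(add_and_subtract, sums, diffs))
```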

pathos enables you to easily nest hierarchical parallel maps with multiple inputs, so we can extend our example to demonstrate that.

>>> from pathos.multiprocessing import ThreadingPool as TPool
>>> 
>>> res = TPool().amap(add_and_subtract, *zip(*Pool().map(add_and_subtract, range(0,20,2), range(-5,5,1))))
>>> res.get()
[(0, -10), (4, -8), (8, -6), (12, -4), (16, -2), (20, 0), (24, 2), (28, 4), (32, 6), (36, 8)]

Even more fun is to build a nested function that we can pass into the Pool. This is possible because pathos uses dill, which can serialize almost anything in Python.

>>> def build_fun_things(f, g):
...   def do_fun_things(x, y):
...     return f(x,y), g(x,y)
...   return do_fun_things
... 
>>> def add(x,y):
...   return x+y
... 
>>> def sub(x,y):
...   return x-y
... 
>>> neato = build_fun_things(add, sub)
>>> 
>>> res = TPool().imap(neato, *zip(*Pool().map(neato, range(0,20,2), range(-5,5,1))))
>>> list(res)
[(0, -10), (4, -8), (8, -6), (12, -4), (16, -2), (20, 0), (24, 2), (28, 4), (32, 6), (36, 8)]

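If you are not able to go outside of the standard library, however, you will have to do this another way. Your best bet in that case is to use multiprocessing.Pool.starmap (available since Python 3.3), as seen here: Python multiprocessing pool.map for multiple arguments (noted by @Roberto in the comments on the OP's post).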
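
As a rough sketch of the standard-library route, Pool.starmap unpacks each tuple in the iterable into positional arguments, and the constant arguments can be repeated for every item with itertools.repeat and zip. The names `describe` and `build_args` are hypothetical, loosely following the f(a, b, c) example from Answer 1:

```python
import multiprocessing
from itertools import repeat

def describe(a, b, c):
    return "{} {} {}".format(a, b, c)

def build_args(items, a, b):
    # One (a, b, item) tuple per item; zip stops when items runs out
    return list(zip(repeat(a), repeat(b), items))

def main():
    args = build_args([1, 2, 3], "hi", "there")
    with multiprocessing.Pool(2) as pool:
        # starmap calls describe(*args_tuple) for each tuple
        return pool.starmap(describe, args)

if __name__ == "__main__":
    print(main())
```

Compared with functools.partial, this form also allows the extra arguments to vary per item, since each tuple is built independently.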

Get pathos here: https://github.com/uqfoundation

Answer 3

If you don't have access to functools.partial, you could also use a wrapper function for this.

import multiprocessing

def target(lock):
    def wrapped_func(item):
        # Called once per element of the iterable passed to pool.map
        # Do cool stuff
        if (... some condition here ...):
            lock.acquire()
            # Write to stdout or logfile, etc.
            lock.release()
    return wrapped_func

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    lck = multiprocessing.Lock()
    pool.map(target(lck), iterable)
    pool.close()
    pool.join()

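This makes target() into a function that accepts a lock (or whatever parameters you want to give it) and returns a function that takes only one argument (the element supplied by pool.map) but can still use all of your other parameters. That returned function is what ultimately gets passed to pool.map(). Note that a process-based multiprocessing.Pool pickles the function it is given, and the standard pickle module cannot serialize a nested function, so this works as written only with pools that avoid standard pickling, such as thread-based pools or dill-based ones like pathos.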
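
A module-level callable class is one possible way to keep the same wrapper idea while remaining usable with a process-based pool from the standard library. The sketch below is an assumption-laden illustration, not the answerer's method: `Target`, `make_closure`, and `is_picklable` are hypothetical names, and a plain string `prefix` stands in for the extra configuration.

```python
import multiprocessing
import pickle

class Target:
    """Callable that carries the extra arguments as instance state."""
    def __init__(self, prefix):
        self.prefix = prefix

    def __call__(self, item):
        return "{}{}".format(self.prefix, item)

def make_closure(prefix):
    # Same idea as a nested wrapper function, kept only for comparison
    def wrapped(item):
        return "{}{}".format(prefix, item)
    return wrapped

def is_picklable(obj):
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

def main():
    with multiprocessing.Pool(2) as pool:
        return pool.map(Target("item-"), [1, 2, 3])

if __name__ == "__main__":
    print(is_picklable(make_closure("item-")))  # False: local function
    print(main())
```

The instance is pickled by reference to its class plus its attributes, which is why the class has to live at module level.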

Passing multiple parameters to the pool.map() function in Python
