Question

I am trying to use the multiprocessing library in Python, but I am running into some difficulties:

def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    return response.json()

def get_list_event_per_user_per_mpm(limit=100):
    nb_unique_user = get_unique_user()
    print "Unique user: ", nb_unique_user
    processor_pool = multiprocessing.Pool(4)
    offset = range(0, nb_unique_user, limit)
    list_event_per_user = processor_pool.map(request_solr(limit), offset)
    return list_event_per_user

I am not sure how to pass the second argument to the function. How can I make this work? I get the following error:

TypeError: 'dict' object is not callable

Answer 1

You see that error because you are calling the function before passing it to multiprocessing.
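To make the cause concrete, here is a minimal sketch of the difference between a function object and the result of calling it. The body of request_solr is a hypothetical stand-in that returns a dict, as response.json() presumably does in the question.

```python
def request_solr(limit=10, offset=10):
    # Hypothetical stand-in: returns a dict, like response.json() would.
    return {"limit": limit, "offset": offset}


# Calling the function produces a dict; this is what map() then tries to call.
called = request_solr(100)

print(callable(called))        # the dict returned by the call
print(callable(request_solr))  # the function object itself
```

Passing request_solr(limit) hands the pool the dict, which is why the traceback complains that a 'dict' object is not callable.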

I suggest you use starmap in combination with itertools.repeat:

import itertools as it

# rest of your code

processor_pool = multiprocessing.Pool(4)
offset = range(0, nb_unique_user, limit)
list_event_per_user = processor_pool.starmap(request_solr, zip(it.repeat(limit), offset))

starmap calls your function, expanding each pair of values into two arguments. repeat(limit) simply produces an iterable whose elements are all equal to limit.
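A self-contained sketch of the same idea, for anyone who wants to run it. The body of request_solr is a stand-in for the real Solr request, and build_args and the user count of 250 are assumptions made up for illustration.

```python
import itertools as it
import multiprocessing


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request.
    return {"limit": limit, "offset": offset}


def build_args(limit, nb_unique_user):
    # zip stops at the shorter iterable, so the endless repeat() is safe here.
    return list(zip(it.repeat(limit), range(0, nb_unique_user, limit)))


if __name__ == "__main__":
    args = build_args(100, 250)  # [(100, 0), (100, 100), (100, 200)]
    with multiprocessing.Pool(4) as pool:
        # Each tuple is unpacked into request_solr(limit, offset).
        results = pool.starmap(request_solr, args)
    print(results)
```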

This works for any number of arguments:

from multiprocessing import Pool

def my_function(a, b, c, d, e):
    return a + b + c + d + e

pool = Pool()
pool.starmap(my_function, [(1, 2, 3, 4, 5)])  # calls my_function(1, 2, 3, 4, 5)

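Since you are using an old version of Python (Pool.starmap was added in Python 3.3), you will have to work around this by modifying your function or by using a wrapper function: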

def wrapper(arguments):
    return request_solr(*arguments)

# later:
pool.map(wrapper, zip(repeat(limit), offset))
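The other workaround, modifying the function itself, could look roughly like the sketch below. request_solr_packed is a hypothetical name, its body is a stand-in for the real Solr request, and the user count of 250 is assumed.

```python
import multiprocessing
from itertools import repeat


def request_solr_packed(arguments):
    # Hypothetical variant of request_solr that takes one (limit, offset) tuple.
    limit, offset = arguments
    # Stand-in for the real Solr request and response.json().
    return {"limit": limit, "offset": offset}


if __name__ == "__main__":
    limit = 100
    offset = range(0, 250, limit)  # 250 is an assumed user count
    pool = multiprocessing.Pool(4)
    try:
        # Pool.map passes exactly one item per call: here, one tuple.
        results = pool.map(request_solr_packed, zip(repeat(limit), offset))
    finally:
        pool.close()
        pool.join()
    print(results)
```

The wrapper approach leaves the original signature untouched, which is probably preferable when request_solr is also called from elsewhere.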

Answer 2

You need to use a lambda. The way you are doing it right now, it tries to map the result of request_solr(limit) as a function with offset as the argument.

This should do the trick:

processor_pool.map(lambda x: request_solr(limit, x), offset)

Note that this only works in 3.x. In 2.x you need to create a function object.
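A function object for this case might be sketched as follows. RequestSolrWithLimit is a hypothetical name, and request_solr here is a stand-in for the real request. One caveat worth knowing: the standard pickle module cannot serialize lambdas, so Pool.map may reject a lambda with a pickling error; a module-level class instance or functools.partial avoids that.

```python
import multiprocessing
from functools import partial


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request.
    return {"limit": limit, "offset": offset}


class RequestSolrWithLimit(object):
    # Hypothetical function object: stores limit, takes offset when called.
    def __init__(self, limit):
        self.limit = limit

    def __call__(self, offset):
        return request_solr(self.limit, offset)


if __name__ == "__main__":
    offsets = range(0, 250, 100)  # 250 is an assumed user count
    pool = multiprocessing.Pool(4)
    try:
        by_object = pool.map(RequestSolrWithLimit(100), offsets)
        # functools.partial binds limit the same way with less code.
        by_partial = pool.map(partial(request_solr, 100), offsets)
    finally:
        pool.close()
        pool.join()
    print(by_object == by_partial)
```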

Answer 3

I would use a generator to produce the keyword arguments. This is the content of my simple_multiproc.py.

Note the importance of defining request_solr at module level, so that it can be pickled and sent to the worker processes.

import multiprocessing

MAX=5

def _get_pool_args(**kw):
    for _ in range(MAX):
        r = {"limit": 10, "offset": 10}
        r.update(kw)
        yield r


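def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    print(locals())
    # return response.json()  # once the real request above produces a response


def _request_solr_kw(kw):
    # Pool.map passes each dict as a single positional argument, so unpack it here.
    return request_solr(**kw)


if __name__ == "__main__":
    pool = multiprocessing.Pool(MAX)
    pool.map(_request_solr_kw, _get_pool_args())
    pool.close()
    pool.join()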
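For the paging use case in the question, the generator would presumably yield a different offset for each task. A sketch under that assumption; pool_args and call_with_kwargs are hypothetical helpers, and the body of request_solr is a stand-in.

```python
import multiprocessing


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request.
    return {"limit": limit, "offset": offset}


def call_with_kwargs(kw):
    # Hypothetical helper: unpack one dict into keyword arguments.
    return request_solr(**kw)


def pool_args(limit, nb_unique_user):
    # Yield one keyword dict per page of results.
    for offset in range(0, nb_unique_user, limit):
        yield {"limit": limit, "offset": offset}


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        pages = pool.map(call_with_kwargs, pool_args(100, 250))
    print(len(pages))
```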

Multiprocessing a function with several arguments in Python

3 answers: