Execute a method on each object in a list in parallel

Date: 2018-11-27 16:42:38

Tags: python parallel-processing python-multiprocessing

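I have a list of objects, and I want to execute a method on each object in parallel. The method modifies the object's attributes. For example: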

class Object:
    def __init__(self, a):
        self.a = a
    def aplus(self):
        self.a += 1

object_list = [Object(1), Object(2), Object(3)]

# I want to execute this in parallel
for i in range(len(object_list)):
    object_list[i].aplus() 

I tried the following:

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

executor = ProcessPoolExecutor(max_workers=3)
res = executor.map([obj.aplus for obj in object_list])

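This does not work: the objects are left unchanged. I assume that is because with multiprocessing the objects can only be copied, not accessed. Any ideas?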

Thanks a lot!

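Edit: Assume the objects are large, so it would be best to avoid copying them to each process. Assume also that the methods are CPU-intensive, so multiple processes should be used rather than threads. Under these conditions I believe there is no solution, because multiprocessing cannot share memory and threads cannot use multiple CPUs. I would love to be proven wrong.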

3 answers:

Answer 0 (score: 2)

Here is my answer, using threading:

from threading import Thread

class Object:
    def __init__(self, a):
        self.a = a
    def aplus(self):
        self.a += 1

object_list = [Object(1), Object(2), Object(3)]

# A list containing all threads we will create
threads = []

# Create a thread for every object
for obj in object_list:
    thread = Thread(target=obj.aplus)
    thread.daemon = True
    thread.start()
    threads.append(thread)

# Wait for all threads to finish before continuing
for thread in threads:
    thread.join()

# prints results
for obj in object_list:
    print(obj.a)

Answer 1 (score: 1)

Here is a working example using Pool.map:

import multiprocessing

class Object:
    def __init__(self, a):
        self.a = a

    def aplus(self):
        self.a += 1

    def __str__(self):
        return str(self.a)

def worker(obj):
    obj.aplus()
    return obj

if __name__ == "__main__":
    object_list = [Object(1), Object(2), Object(3)]

    try:
        processes = multiprocessing.cpu_count()
    except NotImplementedError:
        processes = 2

    # The context manager shuts the pool down once the results are collected
    with multiprocessing.Pool(processes=processes) as pool:
        modified_object_list = pool.map(worker, object_list)

    for obj in modified_object_list:
        print(obj)

This prints:

2
3
4

Answer 2 (score: 1)

  

"I assume that is because with multiprocessing the objects can only be copied, not accessed."

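That is correct, and it is half of the answer. Because processes are isolated, each process gets its own copy of object_list. One solution is to use ThreadPoolExecutor (all threads share the same object_list).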
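To make the copying concrete: arguments sent to a worker process are pickled, and the worker operates on the rebuilt copy. The sketch below imitates that round trip with pickle inside a single process (an assumption made purely for illustration; no worker process is started), reusing the Object class from the question.

```python
import pickle


class Object:
    def __init__(self, a):
        self.a = a

    def aplus(self):
        self.a += 1


original = Object(1)

# Imitate what happens to an argument on its way to a worker process:
# it is serialized and rebuilt as a separate object.
worker_copy = pickle.loads(pickle.dumps(original))
worker_copy.aplus()

print(original.a)     # 1: the original never sees the change
print(worker_copy.a)  # 2: only the copy was modified
```

This is why a method that only mutates self has no visible effect in the parent when it runs in another process.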

The syntax differs slightly from what you tried, but this works as expected:

executor = ThreadPoolExecutor(max_workers=3)
res = executor.map(Object.aplus, object_list)
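For comparison, here is a self-contained sketch that runs both call shapes on a ThreadPoolExecutor. The variable names (nothing, values_after_wrong_call, values_after_right_call) are made up for this illustration. It suggests that the call from the question submits no work at all, because map expects the function first and the iterable second.

```python
from concurrent.futures import ThreadPoolExecutor


class Object:
    def __init__(self, a):
        self.a = a

    def aplus(self):
        self.a += 1


object_list = [Object(1), Object(2), Object(3)]

with ThreadPoolExecutor(max_workers=3) as executor:
    # Call shape from the question: the list is taken as the "function" and
    # no iterable is given, so no task is ever submitted.
    nothing = list(executor.map([obj.aplus for obj in object_list]))
    values_after_wrong_call = [obj.a for obj in object_list]

    # Function first, iterable second: one call to aplus per object.
    # list() waits for completion and re-raises any exception from a worker.
    list(executor.map(Object.aplus, object_list))

values_after_right_call = [obj.a for obj in object_list]
print(nothing)                  # []
print(values_after_wrong_call)  # [1, 2, 3]
print(values_after_right_call)  # [2, 3, 4]
```

Consuming the iterator returned by map is worth doing even when the return values are not needed, since otherwise an exception raised inside aplus would go unnoticed.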

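If you really want to use ProcessPoolExecutor, you need to get the data back from the processes somehow. The simplest way is to use functions that return values: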

from concurrent.futures import ProcessPoolExecutor


class Object:
    def __init__(self, a):
        self.a = a

    def aplus(self):
        self.a += 1
        return self.a


if __name__ == '__main__':

    object_list = [Object(1), Object(2), Object(3)]

    # The context manager shuts the executor down once all results are in
    with ProcessPoolExecutor(max_workers=3) as executor:
        for result in executor.map(Object.aplus, object_list):
            print("I got: " + str(result))

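You can even have the function you map return self, put those returned objects back into object_list, and be done. So the complete multiprocessing solution looks like this: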

from concurrent.futures import ProcessPoolExecutor


class Object:
    def __init__(self, a):
        self.a = a

    def aplus(self):
        self.a += 1
        return self


if __name__ == '__main__':

    object_list = [Object(1), Object(2), Object(3)]

    # map returns the modified copies; rebuild object_list from them
    with ProcessPoolExecutor(max_workers=3) as executor:
        object_list = list(executor.map(Object.aplus, object_list))
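One caveat with rebuilding the list: the returned objects are new copies, so any other reference to an original object still sees the old state. A possible workaround, sketched below, is to copy the returned state back into the originals. The helper simulated_worker is hypothetical and stands in for the trip through a worker process by using a pickle round trip, so the sketch runs in a single process. It also assumes plain objects whose state lives entirely in __dict__.

```python
import pickle


class Object:
    def __init__(self, a):
        self.a = a

    def aplus(self):
        self.a += 1
        return self


def simulated_worker(obj):
    # Hypothetical stand-in for executor.map: the worker receives a pickled
    # copy, modifies it, and the parent gets that modified copy back.
    return pickle.loads(pickle.dumps(obj)).aplus()


object_list = [Object(1), Object(2), Object(3)]
alias = object_list[0]  # some other reference to the first object

returned = [simulated_worker(obj) for obj in object_list]

# Copy the updated state into the original objects so that existing
# references such as `alias` observe the change.
for original, updated in zip(object_list, returned):
    original.__dict__.update(updated.__dict__)

print(alias.a)                      # 2
print([obj.a for obj in object_list])  # [2, 3, 4]
```

With a real ProcessPoolExecutor the merge loop would be the same; only the line that builds returned would call executor.map instead.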