Question

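I have the following problem. I wrote a function that takes a list as input and creates a dictionary for each element in the list. I then want to append this dictionary to a new list, so that I end up with a list of dictionaries. I am trying to spawn multiple processes for this. My problem is that I want the different processes to access the list of dictionaries as it is updated by the other processes, for example to print something once it has reached a certain length. My example looks like this: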

import multiprocessing

list=['A', 'B', 'C', 'D', 'E', 'F']

def do_stuff(element):
    element_dict={}
    element_dict['name']=element
    new_list=[]
    new_list.append(element_dict)
    if len(new_list)>3:
        print('list > 3')

###Main###
pool=multiprocessing.Pool(processes=6)
pool.map(do_stuff, list)
pool.close()

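Now my problem is that each process creates its own new_list. Is there a way to share the list between processes, so that all dictionaries are appended to the same list? Or is defining new_list outside of the function the only way?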

Answer 1

One way is to use a manager object and create a shared list object from it:

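from functools import partial
from multiprocessing import Manager, Pool

input_list = ['A', 'B', 'C', 'D', 'E', 'F']

def do_stuff(shared_list, element):
    # shared_list is a manager proxy; each call on it is forwarded to the manager process
    element_dict = {'name': element}
    shared_list.append(element_dict)
    if len(shared_list) > 3:
        print('list > 3')

if __name__ == '__main__':
    # The guard is needed with the "spawn" start method (the default on Windows and macOS)
    with Manager() as manager:
        shared_list = manager.list()
        with Pool(processes=6) as pool:
            pool.map(partial(do_stuff, shared_list), input_list)
        print(list(shared_list))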
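
One caveat with this approach: appending and then checking the length are two separate calls on the shared list, so several workers can observe a length above the threshold and each print the message. If the message should fire exactly once, the two steps can be wrapped in a lock created by the same manager. The following is only a sketch; the names add_and_check, run and THRESHOLD are made up for illustration.

```python
from functools import partial
from multiprocessing import Manager, Pool

THRESHOLD = 3  # hypothetical limit, matching the "> 3" check in the question


def add_and_check(shared_list, lock, element):
    # Hold the lock so that append + len are seen as one step by other workers.
    with lock:
        shared_list.append({'name': element})
        # True only for the single append that takes the list past THRESHOLD.
        return len(shared_list) == THRESHOLD + 1


def run(elements):
    with Manager() as manager:
        shared_list = manager.list()
        lock = manager.Lock()
        with Pool(processes=4) as pool:
            crossed = pool.map(partial(add_and_check, shared_list, lock), elements)
        # Copy the data out of the proxy before the manager shuts down.
        return crossed, list(shared_list)


if __name__ == '__main__':
    crossed, result = run(['A', 'B', 'C', 'D', 'E', 'F'])
    print(crossed.count(True), len(result))
```

The order in which elements land in the list still depends on scheduling; only the threshold check is serialized.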

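Remember that, unlike threads, processes do not share memory space. (When forked, each process gets its own copy of the memory footprint of the spawning process and then runs with it; with the spawn start method it starts from a fresh interpreter instead.) Therefore they can only communicate through some form of IPC (inter-process communication). In Python, one such mechanism is multiprocessing.Manager and the data structures it exposes, such as list or dict. These are as easy to use in code as their built-in equivalents, but under the hood they use some form of IPC (probably sockets).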
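
To make the difference concrete, here is a small sketch (the helper names append_item and run_demo are invented for this example) in which one child process appends to a plain list and another appends to a manager list. Only the second append is visible in the parent.

```python
from multiprocessing import Manager, Process


def append_item(target, item):
    # Runs in a child process; target is either a plain list or a manager proxy.
    target.append({'name': item})


def run_demo():
    plain = []
    with Manager() as manager:
        shared = manager.list()
        workers = [
            Process(target=append_item, args=(plain, 'A')),
            Process(target=append_item, args=(shared, 'B')),
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Copy out of the proxy before the manager shuts down.
        return plain, list(shared)


if __name__ == '__main__':
    plain, shared = run_demo()
    print(plain)   # the child only changed its own copy of the plain list
    print(shared)  # the manager list received the child's append
```

Note that dictionaries stored in a manager list are copies: mutating one after it has been appended does not update the shared list unless the item is assigned back.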

Answer 2

The following is from the Python documentation:

>>> from multiprocessing import shared_memory
>>> a = shared_memory.ShareableList(['howdy', b'HoWdY', -273.154, 100, None, True, 42])
>>> [ type(entry) for entry in a ]
[<class 'str'>, <class 'bytes'>, <class 'float'>, <class 'int'>, <class 'NoneType'>, <class 'bool'>, <class 'int'>]
>>> a[2]
-273.154
>>> a[2] = -78.5
>>> a[2]
-78.5
>>> a[2] = 'dry ice'  # Changing data types is supported as well
>>> a[2]
'dry ice'
>>> a[2] = 'larger than previously allocated storage space'
Traceback (most recent call last):
  ...
ValueError: exceeds available storage for existing str
>>> a[2]
'dry ice'
>>> len(a)
7
>>> a.index(42)
6
>>> a.count(b'howdy')
0
>>> a.count(b'HoWdY')
1
>>> a.shm.close()
>>> a.shm.unlink()
>>> del a  # Use of a ShareableList after call to unlink() is unsupported
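
The documentation snippet runs in a single process. A rough sketch of how the same object might be used across processes (Python 3.8 or later) is shown below; the function names fill_slot and run_demo are made up for this example. A ShareableList has a fixed length and holds only int, float, bool, str, bytes and None values, so it cannot store the dictionaries from the question directly and does not support append.

```python
from multiprocessing import Process, shared_memory


def fill_slot(name, index, value):
    # Attach to the existing shared block by name and overwrite one slot.
    sl = shared_memory.ShareableList(name=name)
    sl[index] = value
    sl.shm.close()


def run_demo():
    # The length is fixed at creation time, so pre-size the list with placeholders.
    sl = shared_memory.ShareableList([0, 0, 0])
    try:
        workers = [
            Process(target=fill_slot, args=(sl.shm.name, i, (i + 1) * 10))
            for i in range(len(sl))
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return list(sl)
    finally:
        # Release the block once every process is done with it.
        sl.shm.close()
        sl.shm.unlink()


if __name__ == '__main__':
    print(run_demo())
```

Each worker writes to its own slot here, so no lock is needed; concurrent writes to the same slot would need separate synchronization.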

Sharing a list between different processes in Python
