Python Manager dict in multiprocessing

Asked: 2011-12-27 01:22:04

Tags: python multiprocessing

Here is a simple piece of multiprocessing code:

from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    d[1].append(4)
    print d

if __name__ == '__main__':
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()

The output I get is:

{1: []}

Why don't I get {1: [4]} as the output?

4 answers:

Answer 0 (score: 26)

Here is what you wrote:

# from here on, the code runs in the main process and, when child processes
# are spawned rather than forked (e.g. on Windows), in every child process too
# every such process performs all these imports
from multiprocessing import Process, Manager

# every process that re-imports this module creates its own 'manager' and 'd'
manager = Manager()
# BTW, the Manager also runs in a child process; when that child re-imports
# this module it creates a new Manager, and the new Manager
# creates another one, and so on
# Did you check how many python processes were running on your system? A lot!
d = manager.dict()

def f():
    # 'd' is the 'd' defined in the globals of this, the current process
    d[1].append(4)
    print d

if __name__ == '__main__':
    # from here on, the code runs ONLY in the main process
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()

Here is what you should write:

from multiprocessing import Process, Manager
def f(d):
    d[1] = d[1] + [4]
    print d

if __name__ == '__main__':
    manager = Manager()  # create only one manager
    d = manager.dict()  # create only one dict
    d[1] = []
    p = Process(target=f, args=(d,))  # tell 'f' which 'd' it should append to
    p.start()
    p.join()

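Answer 1 (score: 10)

I think this is a bug in the manager proxy calls. You can work around it by not calling methods on the shared list directly, for example: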

from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    # get the shared list
    shared_list = d[1]

    shared_list.append(4)

    # forces the shared list to 
    # be serialized back to manager
    d[1] = shared_list

    print d

if __name__ == '__main__':
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()

    print d

Answer 2 (score: 8)

The reason the new item is not appended to d[1] is stated in Python's official documentation:

  

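Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy.

Therefore, this is what actually happens: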

from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    # invoke d.__getitem__(), returning a local copy of the empty list assigned by the main process,
    # (consider that a KeyError exception wasn't raised, so a list was definitely returned),
    # and append 4 to it, however this change is not propagated through the manager,
    # as it's performed on an ordinary list with which the manager has no interaction
    d[1].append(4)
    # convert d to string via d.__str__() (see https://docs.python.org/2/reference/datamodel.html#object.__str__),
    # returning the "remote" string representation of the object (see https://docs.python.org/2/library/multiprocessing.html#multiprocessing.managers.SyncManager.list),
    # to which the change above was not propagated
    print d

if __name__ == '__main__':
    # invoke d.__setitem__(), propagating this assignment (mapping 1 to an empty list) through the manager
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()
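
The difference between mutating a fetched copy and assigning it back can be checked side by side. The following is a minimal Python 3 sketch, not code from the original answers; the names append_in_place, append_and_reassign and run, as well as the start_method parameter, are assumptions introduced for illustration.

```python
import multiprocessing


def append_in_place(d):
    # d[1] returns a local copy of the stored list; appending to that copy
    # is never sent back to the manager process.
    d[1].append(4)


def append_and_reassign(d):
    # Fetch a copy, change it, then assign it back so that
    # d.__setitem__() pushes the new value through the manager.
    items = d[1]
    items.append(4)
    d[1] = items


def run(worker, start_method=None):
    """Run worker(d) in a child process and return a plain-dict copy of d.

    start_method is a hypothetical convenience parameter: None keeps the
    platform default, otherwise e.g. "fork" or "spawn".
    """
    ctx = multiprocessing.get_context(start_method)
    with ctx.Manager() as manager:
        d = manager.dict()
        d[1] = []
        p = ctx.Process(target=worker, args=(d,))
        p.start()
        p.join()
        return d.copy()  # copy out before the manager shuts down


if __name__ == '__main__':
    print(run(append_in_place))      # {1: []}
    print(run(append_and_reassign))  # {1: [4]}
```

The manager is created inside run() and the proxy is passed to the worker explicitly, so the sketch does not depend on module-level globals.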

Reassigning d[1] with the new list, or even with the same list again after it has been updated, triggers the manager to propagate the change:

from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    # perform the exact same steps, as explained in the comments to the previous code snippet above,
    # but in addition, invoke d.__setitem__() with the changed item in order to propagate the change
    l = d[1]
    l.append(4)
    d[1] = l
    print d

if __name__ == '__main__':
    d[1] = []
    p = Process(target=f)
    p.start()
    p.join()

The line d[1] += [4] would have worked as well.

Alternatively, since Python 3.6 (following a changeset made in response to a reported issue), it is also possible to use nested proxy objects, which automatically propagate any changes performed on them to the containing proxy object. Thus, replacing the line d[1] = [] with d[1] = manager.list() would also fix the problem:

from multiprocessing import Process, Manager

manager = Manager()
d = manager.dict()

def f():
    d[1].append(4)
    # the __str__() method of a dict object invokes __repr__() on each of its items,
    # so explicitly invoking __str__() is required in order to print the actual list items
    print({k: str(v) for k, v in d.items()})

if __name__ == '__main__':
    d[1] = manager.list()
    p = Process(target=f)
    p.start()
    p.join()

Unfortunately, this bug fix was not ported to Python 2.7 (as of Python 2.7.13).
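
To see the nested-proxy behaviour without relying on module-level globals, here is a small sketch that assumes Python 3.6 or later. It is not taken from the original answer: the names append_to_nested and run_nested and the start_method parameter are hypothetical.

```python
import multiprocessing


def append_to_nested(d):
    # d[1] is a list proxy here, so append() is forwarded to the manager
    # instead of being applied to a throwaway local copy.
    d[1].append(4)


def run_nested(start_method=None):
    """Store a manager.list() inside a manager.dict() and mutate it in a child.

    start_method is a hypothetical convenience parameter: None keeps the
    platform default, otherwise e.g. "fork" or "spawn".
    """
    ctx = multiprocessing.get_context(start_method)
    with ctx.Manager() as manager:
        d = manager.dict()
        d[1] = manager.list()  # store a proxy, not a plain list
        p = ctx.Process(target=append_to_nested, args=(d,))
        p.start()
        p.join()
        # Turn each nested proxy into a plain list before the manager shuts down.
        return {key: list(value) for key, value in d.items()}


if __name__ == '__main__':
    print(run_nested())  # {1: [4]}
```

Converting the values with list() serves the same purpose as the str(v) call in the snippet above: printing the dict proxy directly would show the repr of the nested proxy rather than its items.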

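Note (when running under the Windows operating system):

While the described behaviour applies to Windows as well, the attached code snippets would fail when executed under Windows because of its different process creation mechanism, which relies on the CreateProcess() API rather than the fork() system call, which Windows does not support.

Whenever a new process is created via the multiprocessing module, Windows starts a fresh Python interpreter process that imports the main module, with potentially hazardous side effects. To circumvent this issue, the following programming guideline is recommended: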

  

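Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).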

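Therefore, executing the attached code snippets as they are under Windows would try to create an infinite number of processes because of the manager = Manager() line. This can be easily fixed by creating the Manager and Manager.dict objects inside the if __name__ == '__main__' clause and passing the Manager.dict object as an argument to f(), as another answer to this question does.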

More details on this issue can be found in {{3}}.
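
The guideline can be tried on any platform by choosing the start method explicitly. The sketch below is an illustration under assumptions rather than the answerer's code: it targets Python 3, and main() with its start_method parameter is a hypothetical helper.

```python
import multiprocessing


def f(d):
    # The proxy arrives as an argument, so nothing at module level has to
    # create a manager when a new interpreter re-imports this file.
    d[1] = d[1] + [4]


def main(start_method=None):
    # start_method=None keeps the platform default; "spawn" is the
    # mechanism used on Windows and is available on other platforms too.
    ctx = multiprocessing.get_context(start_method)
    with ctx.Manager() as manager:  # exactly one manager, created on demand
        d = manager.dict()
        d[1] = []
        p = ctx.Process(target=f, args=(d,))
        p.start()
        p.join()
        return d.copy()


if __name__ == '__main__':
    # Only the process that runs this file as a script reaches this point.
    print(main())
```

Calling main("spawn") from a script file run on Linux or macOS is one way to rehearse the Windows behaviour: each child re-imports the file, and the guard keeps that import free of side effects.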

Answer 3 (score: 2)

from multiprocessing import Process, Manager
manager = Manager()
d = manager.dict()
l=manager.list()

def f():
    l.append(4)
    d[1]=l
    print d

if __name__ == '__main__':
    d[1]=[]
    p = Process(target=f)
    p.start()
    p.join()