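I have a class A that, when instantiated, modifies a mutable class attribute nums.
When the class is instantiated through a process pool with maxtasksperchild=1, I noticed that nums contains values from several different processes. This is undesired behavior for me.
My question is: am I misunderstanding how maxtasksperchild and the process pool work?
Edit: my guess is that the pool pickles the previous process it started (rather than the original process) and thereby preserves the value of nums. Is that right? If so, how can I force it to use the original process?
Here is the sample code: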
from multiprocessing import Pool


class A:
    nums = []

    def __init__(self, num=None):
        self.__class__.nums.append(num)  # I use 'self.__class__' for the sake of explicitly
        print(self.__class__.nums)
        assert len(self.__class__.nums) < 2  # checking that they don't share memory


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        pool.map(A, range(99))  # the assert is being raised
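Edit, in response to k.wahome's answer: using instance attributes does not answer my question. I need class attributes because in my original code (not shown here) each process has multiple instances. My question is specifically about how the multiprocessing pool works.
By the way, doing the following does not raise the assertion: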
from multiprocessing import Process

if __name__ == '__main__':
    prs = []
    for i in range(99):
        pr = Process(target=A, args=[i])
        pr.start()
        prs.append(pr)
    for pr in prs:
        pr.join()
    # the assert was not raised
Answer 0 (score: 0)
The sharing is most likely coming in through the mapped class A having the class attribute nums.
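Class attributes are bound to the class: they belong to the class itself, are created when the class is loaded, and are shared by all instances. All objects hold the same memory reference to a class attribute.
Unlike class attributes, instance attributes are bound to the instance and are not shared between instances. Each instance has its own copy of an instance attribute.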
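The difference can be seen without any multiprocessing at all. The following is a minimal sketch; the class names Shared and PerInstance are made up for illustration and do not appear in the question.

```python
class Shared:
    nums = []  # one list object, stored on the class itself

    def __init__(self, num):
        # Appends to the single list that every instance sees.
        type(self).nums.append(num)


class PerInstance:
    def __init__(self, num):
        # A fresh list is created for each instance.
        self.nums = [num]


a, b = Shared(1), Shared(2)
c, d = PerInstance(1), PerInstance(2)

print(a.nums is b.nums)  # True: both names resolve to the class-level list
print(Shared.nums)       # [1, 2]
print(c.nums, d.nums)    # [1] [2]
```

Note that this sharing holds within a single interpreter process; separate processes each get their own copy of the class.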
See the effect of class vs. instance attributes:
1. Using nums as a class attribute (class_num.py):
from multiprocessing import Pool


class A:
    nums = []

    def __init__(self, num=None):
        # I use 'self.__class__' for the sake of explicitly
        self.__class__.nums.append(num)
        print("nums:", self.__class__.nums)
        # checking that they don't share memory
        assert len(self.__class__.nums) < 2


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        print(pool)
        pool.map(A, range(99))  # the assert is being raised
Running this script:
>>> python class_num.py
nums: [0]
nums: [0, 1]
nums: [4]
nums: [4, 5]
nums: [8]
nums: [8, 9]
nums: [12]
nums: [12, 13]
nums: [16]
nums: [16, 17]
nums: [20]
nums: [20, 21]
nums: [24]
nums: [24, 25]
nums: [28]
nums: [28, 29]
nums: [32]
nums: [32, 33]
nums: [36]
nums: [36, 37]
nums: [40]
nums: [40, 41]
nums: [44]
nums: [44, 45]
nums: [48]
nums: [48, 49]
nums: [52]
nums: [52, 53]
nums: [56]
nums: [56, 57]
nums: [60]
nums: [60, 61]
nums: [64]
nums: [64, 65]
nums: [68]
nums: [68, 69]
nums: [72]
nums: [72, 73]
nums: [76]
nums: [76, 77]
nums: [80]
nums: [80, 81]
nums: [84]
nums: [84, 85]
nums: [88]
nums: [88, 89]
nums: [92]
nums: [92, 93]
nums: [96]
nums: [96, 97]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "class_num.py", line 12, in __init__
    assert len(self.__class__.nums) < 2
AssertionError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "class_num.py", line 18, in <module>
    pool.map(A, range(99)) # the assert is being raised
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
AssertionError
2. Using nums as an instance attribute (instance_num.py):
from multiprocessing import Pool


class A:
    def __init__(self, num=None):
        self.nums = []
        if num is not None:
            self.nums.append(num)
        print("nums:", self.nums)
        # checking that they don't share memory
        assert len(self.nums) < 2


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        pool.map(A, range(99))  # the assert is not raised here
Answer 1 (score: 0)
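There is another reason for what you observe. The values in nums do not come from other processes; they come from the same process once it starts hosting multiple instances of A. This happens because you did not set chunksize to 1 in your pool.map call.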
In your case, setting maxtasksperchild=1 alone is not enough, because a single task still consumes a whole chunk of the iterable.
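To get a feel for how large such a chunk is, here is a rough sketch of the default chunk-size heuristic as I understand it from CPython's pool.py. It is an implementation detail rather than documented behavior, and the worker count of 8 is only an assumption inferred from the printed output in this thread, where every worker's list starts at a multiple of 4. The helper names default_chunksize and split_into_chunks are made up for illustration.

```python
def default_chunksize(n_items, n_workers):
    # Mirrors the heuristic Pool.map appears to use when chunksize is None:
    # aim for roughly four chunks per worker, rounding the size up.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def split_into_chunks(items, size):
    # Each chunk is what a single task (and, with maxtasksperchild=1,
    # a single worker process) would consume.
    return [items[i:i + size] for i in range(0, len(items), size)]


size = default_chunksize(99, 8)  # assumed: 99 items, 8 worker processes
chunks = split_into_chunks(list(range(99)), size)

print(size)        # 4
print(chunks[:3])  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
```

Under these assumptions one worker would build A(0), A(1), A(2), A(3) in the same process, which is why the assertion fails on the second item of each chunk.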
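From the docs about map: "This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer."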
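Putting this together, a minimal sketch of the fix might look like the following. The helper functions record and run are hypothetical and only there to report what each worker saw; the essential change is passing chunksize=1 to pool.map alongside maxtasksperchild=1.

```python
import os
from multiprocessing import Pool


class A:
    nums = []  # class attribute: one list per process

    def __init__(self, num=None):
        type(self).nums.append(num)


def record(num):
    """Hypothetical helper: build one A and report what this worker has seen."""
    A(num)
    return os.getpid(), list(A.nums)


def run(n_items=20, workers=4):
    # maxtasksperchild=1 retires each worker after one task;
    # chunksize=1 makes every task exactly one item.
    with Pool(processes=workers, maxtasksperchild=1) as pool:
        return pool.map(record, range(n_items), chunksize=1)


if __name__ == "__main__":
    results = run()
    for pid, nums in results:
        print(pid, nums)
    # Each worker handled a single item, so every list has length 1.
    assert all(len(nums) == 1 for _, nums in results)
```

The trade-off is that every item now pays the cost of starting a fresh worker process, so this is worth doing only when per-process isolation of the class attribute really matters.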