My goal is to lock a directory against multiple processes that are all trying to lock it. I need to create the lock objects after the processes have forked, based on a unique string identifier. Reading the documentation, I could not find any solution to this in the multiprocessing module.
Here is code demonstrating the problem:
from multiprocessing import Lock, Process
import os
import time


class NamedResource:
    def __init__(self, identifier: str):
        self.identifier = identifier
        self.lock = Lock()


def work_on(name: str):
    nr = NamedResource(name)
    with nr.lock:
        print(time.time() % 100 // 1, os.getpid(), f"working on {nr.identifier}")
        time.sleep(5)
        print(time.time() % 100 // 1, os.getpid(), f"done with {nr.identifier}")


res = ['a', 'b']
processes = []
for r in res:
    p1 = Process(target=work_on, args=(r,))
    p1.start()
    p2 = Process(target=work_on, args=(r,))
    p2.start()
    processes += [p1, p2]

for p in processes:
    p.join()
Obviously the lock objects are distinct and do not prevent concurrent access, as can be seen from the output:
67.0 48689 working on b
67.0 48690 working on b
67.0 48688 working on a
67.0 48687 working on a
72.0 48689 done with b
72.0 48690 done with b
72.0 48688 done with a
72.0 48687 done with a
The correct output would be only one process at a time working on each of ['a', 'b'], but two processes running at a time overall (since the two resources do not interfere with each other).
In the end I got this working by writing a hidden file into the directory whose access I want to restrict, and having the other processes check for the existence of this file by name. This is not very fast, and it creates a race condition in the short window between checking for the file and writing it. Is there a better way?
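(As an aside: the check-then-write race in a file-based approach can be closed by making the creation itself atomic. os.open with O_CREAT | O_EXCL either creates the lock file or fails in a single step, so no window exists between the check and the write. A minimal sketch with illustrative names and paths:)

```python
import os
import tempfile
import time

LOCK_PATH = os.path.join(tempfile.gettempdir(), f"dirlock.{os.getpid()}")  # illustrative


def acquire_file_lock(path: str, timeout: float = 5.0) -> None:
    """Spin until we atomically create the lock file."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL makes "create only if absent" one atomic step
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not lock {path}")
            time.sleep(0.01)


def release_file_lock(path: str) -> None:
    os.remove(path)


acquire_file_lock(LOCK_PATH)
try:
    print("lock held:", os.path.exists(LOCK_PATH))  # prints: lock held: True
finally:
    release_file_lock(LOCK_PATH)
```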
Answer 0 (score: 0)
A single Lock object can only be held by one process at a time. The problem with your code is that you are creating many independent locks, so the processes do not exclude each other: each one is locking and unlocking its own private lock.
The best way I know of to make this work is with a custom Manager that exposes a single NamedResource:
from collections import defaultdict
from multiprocessing import Lock, Process
from multiprocessing.managers import SyncManager


class NamedResource:
    def __init__(self):
        # each client process is served by a separate thread;
        # this lock serializes access to our shared state
        self.lock = Lock()
        self.locks = defaultdict(Lock)

    def acquire(self, name):
        with self.lock:
            lock = self.locks[name]
        # acquire outside the with-block: blocking here while holding
        # self.lock would deadlock every other client
        lock.acquire()

    def release(self, name):
        with self.lock:
            lock = self.locks[name]
        lock.release()


class MyManager(SyncManager):
    pass


MyManager.register('NamedResource', NamedResource)
We can change your work_on to use it like this:
def work_on(name: str, locker: NamedResource):
    locker.acquire(name)
    try:
        print(f"{time.time() % 60:.3f} {os.getpid()} working on {name}")
        time.sleep(0.3)
        print(f"{time.time() % 60:.3f} {os.getpid()} done with {name}")
    finally:
        locker.release(name)
If you like, you can create a nice context manager to wrap acquire and release. We then put everything together like this:
res = ['a', 'b']
processes = []
with MyManager() as manager:
    locker = manager.NamedResource()
    for r in res:
        p1 = Process(target=work_on, args=(r, locker))
        p1.start()
        p2 = Process(target=work_on, args=(r, locker))
        p2.start()
        processes += [p1, p2]
    # join inside the with-block, so the manager stays up
    # while the workers are still using the proxy
    for p in processes:
        p.join()
If you do want the nice context manager mentioned above, I'd suggest something like:
from contextlib import contextmanager
from multiprocessing.managers import BaseProxy


class NamedResourceProxy(BaseProxy):
    _exposed_ = ('acquire', 'release')

    def acquire(self, name: str):
        return self._callmethod('acquire', (name,))

    def release(self, name: str):
        return self._callmethod('release', (name,))

    @contextmanager
    def use(self, name: str):
        self.acquire(name)
        try:
            yield None
        finally:
            self.release(name)
and then change the call to register to:

MyManager.register('NamedResource', NamedResource, NamedResourceProxy)
Your work_on could then look like:
def work_on(name: str, locker: NamedResource):
    with locker.use(name):
        print(f"{time.time() % 60:.3f} {os.getpid()} working on {name}")
        time.sleep(0.3)
        print(f"{time.time() % 60:.3f} {os.getpid()} done with {name}")