Named multiprocessing locks

Time: 2019-06-28 20:47:14

Tags: python

My goal is to lock a directory against multiple processes that are trying to lock it. I need to create the lock objects after the processes have forked, based on a unique string identifier. Reading the documentation, I could not find any solution for this in the multiprocessing module.

Here is code demonstrating the problem:

from multiprocessing import Lock, Process
import os
import time


class NamedResource:

    def __init__(self, identifier: str):
        self.identifier = identifier
        self.lock = Lock()

def work_on(name: str):
    nr = NamedResource(name)
    with nr.lock:
        print(time.time() % 100 // 1, os.getpid(), f"working on {nr.identifier}")
        time.sleep(5)
        print(time.time() % 100 // 1, os.getpid(), f"done with {nr.identifier}")

res = ['a', 'b']
processes = []

for r in res:
    p1 = Process(target=work_on, args=(r,))
    p1.start()

    p2 = Process(target=work_on, args=(r,))
    p2.start()

    processes += [p1, p2]

for p in processes:
    p.join()

Clearly the lock objects are distinct and do not block access, as can be seen from the output:

67.0 48689 working on b
67.0 48690 working on b
67.0 48688 working on a
67.0 48687 working on a
72.0 48689 done with b
72.0 48690 done with b
72.0 48688 done with a
72.0 48687 done with a

The correct output would have only one process working on each of [a, b] at a time, but two processes running at once overall (since the two resources don't interfere with each other).

In the end I worked around this by writing a hidden file into the directory whose access I want to restrict, and having the other processes check for this file's existence based on the name. This is not very fast, and it creates a race condition in the short window between checking for the file and writing it. Is there a better way?
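As an aside, the check-then-write race in this file-based workaround can be avoided by creating the marker file atomically with os.open and the O_CREAT | O_EXCL flags, which fail with FileExistsError if the file already exists. A minimal sketch (the DirLock class name and the .lock filename are arbitrary choices for illustration):

```python
import os


class DirLock:
    """Advisory lock on a directory via an atomically created marker file.

    os.open with O_CREAT | O_EXCL either creates the file or fails in one
    atomic step, so there is no window between checking and writing.
    """

    def __init__(self, directory: str):
        self.path = os.path.join(directory, ".lock")

    def acquire(self) -> bool:
        try:
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            return False  # someone else holds the lock

    def release(self):
        os.remove(self.path)
```

Note this is still an advisory lock (it only works if every process cooperates), and a crashed holder leaves a stale lock file behind.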

1 Answer:

Answer 0 (score: 0):

A single Lock object can only be held by one process at a time. The problem with your code is that you are creating many independent locks, so the processes don't mutually exclude each other: each one is just locking/unlocking its own lock.

Probably the best way to get this working is to use a custom Manager to expose a single NamedResource:

from collections import defaultdict
from multiprocessing import Lock, Process
from multiprocessing.managers import SyncManager

class NamedResource:
    def __init__(self):
        # each client process is served by a separate thread,
        # this lock is used to serialize access to our state
        self.lock = Lock()
        self.locks = defaultdict(Lock)

    def acquire(self, name):
        with self.lock:
            lock = self.locks[name]
        lock.acquire()

    def release(self, name):
        with self.lock:
            lock = self.locks[name]
        lock.release()

class MyManager(SyncManager):
    pass

MyManager.register('NamedResource', NamedResource)

We can change your work_on to use it like this:

def work_on(name: str, locker: NamedResource):
    locker.acquire(name)
    try:
        print(f"{time.time() % 60:.3f} {os.getpid()} working on {name}")
        time.sleep(0.3)
        print(f"{time.time() % 60:.3f} {os.getpid()} done with {name}")
    finally:
        locker.release(name)

If you want, you can create a nice context manager to wrap acquire and release.

Then we put everything together with:

res = ['a', 'b']
processes = []

with MyManager() as manager:
    locker = manager.NamedResource()

    for r in res:
        p1 = Process(target=work_on, args=(r, locker))
        p1.start()

        p2 = Process(target=work_on, args=(r, locker))
        p2.start()

        processes += [p1, p2]

    for p in processes:
        p.join()

If you want the nice context manager mentioned above, I'd suggest something like:

from contextlib import contextmanager
from multiprocessing.managers import BaseProxy

class NamedResourceProxy(BaseProxy):
    _exposed_ = ('acquire', 'release')

    def acquire(self, name: str):
        return self._callmethod('acquire', (name,))

    def release(self, name: str):
        return self._callmethod('release', (name,))

    @contextmanager
    def use(self, name: str):
        self.acquire(name)
        try:
            yield None
        finally:
            self.release(name)

and then change the call to register to: MyManager.register('NamedResource', NamedResource, NamedResourceProxy)

work_on would then look something like:

def work_on(name: str, locker: NamedResource):
    with locker.use(name):
        print(f"{time.time() % 60:.3f} {os.getpid()} working on {name}")
        time.sleep(0.3)
        print(f"{time.time() % 60:.3f} {os.getpid()} done with {name}")