使用锁定或管理器列表进行Python多处理,以便池工作者访问全局列表变量

时间:2016-08-10 20:50:21

标签: python locking multiprocessing

我正在尝试在多个CUDA设备上分配作业,其中任何时候运行的作业总数应小于或等于可用的cpu核心数。为此,我确定了可用的插槽的数量'在每个设备上创建一个包含可用插槽的列表。如果我有6个cpu核心和两个cuda设备(0和1),那么AVAILABLE_SLOTS = [0,1,0,1,0,1]。在我的worker函数中,我弹出列表并将其保存到变量中,在子进程调用中设置CUDA_VISIBLE_DEVICES env var,然后将其追加到列表中。到目前为止,这一直有效,但我想避免竞争条件。

目前的代码如下:

def work(cmd):
    slot = AVAILABLE_GPU_SLOTS.pop()
    exit_code = subprocess.call(cmd, shell=False, env=dict(os.environ, CUDA_VISIBLE_DEVICES=str(slot)))
    AVAILABLE_GPU_SLOTS.append(slot)
    return exit_code

if __name__ == '__main__':
    pool_size = multiprocessing.cpu_count()
    mols_to_be_run = [name for name in os.listdir(YANK_FILES) if os.path.isdir(os.path.join(YANK_FILES, name))]
    cmds = build_cmd(mols_to_be_run)
    cuda = get_cuda_devices()
    AVAILABLE_GPU_SLOTS = build_available_gpu_slots(pool_size, cuda)
    pool = multiprocessing.Pool(processes=pool_size, maxtasksperchild=2, )
    pool.map(work, cmds)

我可以在与AVAILABLE_GPU_SLOTS相同的级别声明lock = multiprocessing.Lock(),将其放入cmds,然后在work()中执行

with lock:
    slot = AVAILABLE_GPU_SLOTS.pop()
# subprocess stuff
with lock:
    AVAILABLE_GPU_SLOTS.append(slot)

还是需要经理列表?或者也许可以更好地解决我正在做的事情。

1 个答案:

答案 0 :(得分:0)

基于我在以下SO回答Python sharing a lock between processes中找到的内容:

使用常规列表会导致每个进程都有自己的副本,如预期的那样。使用经理列表似乎足以解决这个问题。示例代码:

def doing_work(honk):
    proc = multiprocessing.current_process()
    # with lock:
    #     print proc, 'about to pop SLOTS_LIST', SLOTS_LIST
    #     slot = SLOTS_LIST.pop()
    #     print multiprocessing.current_process(), ' just popped', slot, 'from', SLOTS_LIST
    print proc, 'about to pop SLOTS_LIST', SLOTS_LIST
    slot = SLOTS_LIST.pop()
    print multiprocessing.current_process(), ' just popped', slot, 'from SLOTS_LIST'
    time.sleep(10)

def init(l):
    global lock
    lock = l

if __name__ == '__main__':
    man = multiprocessing.Manager()
    SLOTS_LIST = [1,34,3465,456,4675,6,4]
    SLOTS_LIST = man.list(SLOTS_LIST)
    l = multiprocessing.Lock()
    pool = multiprocessing.Pool(processes=2, initializer=init, initargs=(l,))
    inputs = range(len(SLOTS_LIST))
    pool.map(doing_work, inputs)

输出

<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456, 4675, 6, 4]
<Process(PoolWorker-3, started daemon)>  just popped 4 from SLOTS_LIST
<Process(PoolWorker-2, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456, 4675, 6]
<Process(PoolWorker-2, started daemon)>  just popped 6 from SLOTS_LIST
<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456, 4675]
<Process(PoolWorker-3, started daemon)>  just popped 4675 from SLOTS_LIST
<Process(PoolWorker-2, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456]
<Process(PoolWorker-2, started daemon)>  just popped 456 from SLOTS_LIST
<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1, 34, 3465]    
<Process(PoolWorker-3, started daemon)>  just popped 3465 from SLOTS_LIST
<Process(PoolWorker-2, started daemon)> about to pop SLOTS_LIST [1, 34]
<Process(PoolWorker-2, started daemon)>  just popped 34 from SLOTS_LIST
<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1]
<Process(PoolWorker-3, started daemon)>  just popped 1 from SLOTS_LIST

这是期望的行为。我不确定它是否完全消除了竞争条件,但它似乎已经足够好了。那并且在它上面使用锁是很简单的。