Question

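I need to run an expensive computation on each element of a large list in a production environment (AWS). In one use case the list has ~50k elements and each computation takes ~500 ms. The times vary, but none should exceed 1.5 s. Both the inputs and the computations change over time, so this has to be done regularly. I have done this successfully with multiprocessing, and for smaller lists I see the expected, nearly linear speedup in computation time. However, with every multiprocessing recipe I have tried, I see a large difference between the duration of the fastest process and that of the slowest. For example, I might do this: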

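import multiprocessing


def worker(compute_func, queue_in, queue_out):
    # Runs as the Pool initializer in each worker process and never returns:
    # take one item, compute it, emit the result, mark the item as done.
    while True:
        queue_out.put(compute_func(queue_in.get(True)))
        queue_in.task_done()


def multiprocess_task(items):
    # Divide by 2 because using more than the number of cores
    # doesn't actually speed anything up. Integer division, since
    # Pool needs an int process count.
    num_procs = multiprocessing.cpu_count() // 2
    queue_in = multiprocessing.JoinableQueue()
    queue_out = multiprocessing.Queue()
    # compute_func and process_output are defined elsewhere.
    pool = multiprocessing.Pool(num_procs, worker, (compute_func, queue_in, queue_out))

    for item in items:
        queue_in.put(item)

    # Blocks until task_done() has been called once per item.
    queue_in.join()
    # The workers loop forever in the initializer, so close() does not stop
    # them; they are daemonic and end with the parent process.
    pool.close()

    return process_output(queue_out)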
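
One variation on the queue recipe above that may be easier to reason about uses plain multiprocessing.Process workers that stop on a sentinel value, so each worker exits on its own once the input is exhausted. This is a sketch rather than the code from the question: run_workers, SENTINEL, and the optional ctx argument are hypothetical names, and it assumes None is never a legitimate input item.

```python
import multiprocessing

SENTINEL = None  # hypothetical end-of-work marker; assumes None is never a real item


def worker(compute_func, queue_in, queue_out):
    # Pull items until the sentinel arrives, then return so the process exits.
    while True:
        item = queue_in.get()
        if item is SENTINEL:
            break
        queue_out.put(compute_func(item))


def run_workers(compute_func, items, num_procs, ctx=None):
    # ctx lets the caller pick a start method; defaults to the platform default.
    ctx = ctx or multiprocessing.get_context()
    queue_in = ctx.Queue()
    queue_out = ctx.Queue()
    procs = [
        ctx.Process(target=worker, args=(compute_func, queue_in, queue_out))
        for _ in range(num_procs)
    ]
    for proc in procs:
        proc.start()

    count = 0
    for item in items:
        queue_in.put(item)
        count += 1
    for _ in procs:
        queue_in.put(SENTINEL)  # one sentinel per worker

    # Drain the results before joining, so no worker is left blocked
    # flushing its output. If compute_func raises, this loop would hang.
    results = [queue_out.get() for _ in range(count)]
    for proc in procs:
        proc.join()
    return results
```

Results come back in completion order, not input order, so any code that needs the original order would have to carry an index alongside each item.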

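I have also done this with pool.map and pool.imap_unordered, both by mapping the function over the input list directly and by manually splitting the input into chunks and mapping over those, so that each process runs on exactly one chunk (with a modified version of my computation). With imap_unordered I tried a chunksize of 1 and other values (10, 100). I get the same result every time: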
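
For reference, the imap_unordered variant described above might look roughly like the following. It is a minimal sketch, not the original code: square is a hypothetical stand-in for the real computation, and run_unordered and its ctx argument are illustrative names.

```python
import multiprocessing


def square(x):
    # Hypothetical stand-in for the real, expensive computation.
    return x * x


def run_unordered(func, items, num_procs, chunksize=1, ctx=None):
    # With chunksize=1 an idle worker picks up one item at a time. Larger
    # chunks reduce inter-process traffic, but a worker that receives a
    # chunk of slow items keeps all of them.
    ctx = ctx or multiprocessing.get_context()
    with ctx.Pool(num_procs) as pool:
        # list() consumes every result before the pool is torn down.
        return list(pool.imap_unordered(func, items, chunksize=chunksize))


if __name__ == "__main__":
    print(sorted(run_unordered(square, range(10), num_procs=2)))
```

The function passed to the pool has to be picklable, which in practice means defined at module level (or a functools.partial of such a function).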

For a while, top looks like this:

  4171 george    20   0 7191344 6.439g   2380 R 100.1 10.9   6:29.44 python
  4172 george    20   0 7190064 6.438g   2380 R 100.1 10.9   6:21.86 python
  4173 george    20   0 7202608 6.450g   2388 R 100.1 10.9   6:30.23 python
  4178 george    20   0 7200300 6.447g   2388 R 100.1 10.9   6:30.30 python
  4180 george    20   0 7190060 6.438g   2380 R 100.1 10.9   6:29.72 python
  4181 george    20   0 7191596 6.439g   2436 R 100.1 10.9   6:30.60 python
  4187 george    20   0 7191084 6.439g   2388 R 100.1 10.9   6:30.55 python
  4188 george    20   0 7190316 6.438g   2380 R 100.1 10.9   6:31.06 python
  4174 george    20   0 7190320 6.438g   2380 R  99.8 10.9   6:29.87 python
  4175 george    20   0 7190576 6.438g   2388 R  99.8 10.9   6:29.71 python
  4176 george    20   0 7190576 6.438g   2380 R  99.8 10.9   6:22.57 python
  4177 george    20   0 7190832 6.439g   2388 R  99.8 10.9   6:30.19 python
  4179 george    20   0 7192364 6.440g   2380 R  99.8 10.9   6:30.22 python
  4182 george    20   0 7191340 6.439g   2380 R  99.8 10.9   6:30.23 python
  4183 george    20   0 7190316 6.438g   2380 R  99.8 10.9   6:30.06 python
  4184 george    20   0 7203372 6.451g   2436 R  99.8 10.9   6:30.90 python
  4185 george    20   0 7204652 6.452g   2336 R  99.8 10.9   6:30.99 python
  4186 george    20   0 7190060 6.438g   2388 R  99.8 10.9   6:30.90 python

After about 20 minutes, only half of the processes are still running. After 40 minutes, a single python process is still running.
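
As a rough sanity check on those timings, the ideal wall-clock time for a perfectly balanced run can be estimated from the figures in the question. The sketch below assumes the ~50k-element, ~500 ms use case and the 18 worker processes visible in the top output; that the top snapshot comes from that same run is an assumption, and ideal_wall_minutes is a hypothetical helper.

```python
def ideal_wall_minutes(n_items, secs_per_item, n_workers):
    """Wall-clock minutes if the work were spread perfectly evenly."""
    return n_items * secs_per_item / n_workers / 60.0


if __name__ == "__main__":
    # Average case: ~500 ms per item across 18 workers.
    print(round(ideal_wall_minutes(50_000, 0.5, 18), 1))  # 23.1
    # Upper bound: every item takes the stated maximum of 1.5 s.
    print(round(ideal_wall_minutes(50_000, 1.5, 18), 1))  # 69.4
```

Under these assumptions a balanced run would take roughly 23 minutes at the average cost per item, so a lone process still running at 40 minutes would be consistent with either an uneven split of the work or a cluster of unusually slow items.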

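My impression was that both imap_unordered and the Queue recipe pasted above hand the next input element to a process as soon as one becomes available, so I don't understand how this can happen.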
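
One way to check whether items really are handed out as workers become free is to record which process handled each item and how long it took. The following is a minimal sketch under that idea; timed_call and summarize are hypothetical helpers, not part of multiprocessing.

```python
import os
import time
from collections import defaultdict


def timed_call(func, item):
    # Run func(item) and report which process ran it and how long it took.
    start = time.perf_counter()
    result = func(item)
    return os.getpid(), time.perf_counter() - start, result


def summarize(records):
    # records: iterable of (pid, elapsed_seconds, result) tuples.
    # Returns {pid: (items_handled, total_seconds)}.
    per_pid = defaultdict(lambda: [0, 0.0])
    for pid, elapsed, _result in records:
        per_pid[pid][0] += 1
        per_pid[pid][1] += elapsed
    return {pid: (count, total) for pid, (count, total) in per_pid.items()}
```

It could be used as summarize(pool.imap_unordered(functools.partial(timed_call, compute_func), items)). Similar item counts with very different totals would point at per-item cost; very different item counts would point at how the work is distributed.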

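The worker does not read from or write to disk, but compute_func is a partial of a method of a Cython-defined class, bound to an object that the parent process loads into memory.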
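
One thing that might be worth ruling out here: with pool.map and pool.imap_unordered the callable is pickled and sent along with each chunk of work, so a partial bound to a large object may be serialized repeatedly. A common alternative is to build or load the object once per worker in a Pool initializer and pass only the items. The sketch below assumes that approach; Model is a hypothetical pure-Python stand-in for the Cython class, and init_worker, compute_one, and run are illustrative names.

```python
import multiprocessing

_model = None  # set once in each worker process by init_worker


class Model:
    # Hypothetical stand-in for the Cython-defined class in the question.
    def __init__(self, scale):
        self.scale = scale

    def compute(self, x):
        return x * self.scale


def init_worker(scale):
    # Runs once when each worker starts; builds the heavy object there.
    global _model
    _model = Model(scale)


def compute_one(item):
    # Only the item crosses the process boundary, not the model.
    return _model.compute(item)


def run(items, scale, num_procs, ctx=None):
    ctx = ctx or multiprocessing.get_context()
    with ctx.Pool(num_procs, initializer=init_worker, initargs=(scale,)) as pool:
        return list(pool.imap_unordered(compute_one, items))
```

Whether this changes anything depends on how large the bound object is and how it pickles, which the question does not state.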

As a final note, I am new to parallelization, so please forgive (and correct!) me if I have misused any vocabulary.

Thanks for your help!

Some processes take much longer than others when multiprocessing

0 answers: