Question

I wrote a program that sums a list by splitting it into sublists and using multiprocessing in Python. My code is as follows:

from concurrent.futures import ProcessPoolExecutor, as_completed
import random
import time

def dummyFun(l):
    s=0
    for i in range(0,len(l)):
        s=s+l[i]
    return s


def sumaSec(v):
    start=time.time()
    sT=0
    for k in range(0,len(v),10):
        vc=v[k:k+10]
        print ("vector ",vc)
        for item in vc:
            sT=sT+item
        print ("sequential sum result ",sT)
        sT=0
    start1=time.time()
    print ("sequential version time ",start1-start)


def main():
    workers=5
    vector=random.sample(range(1,101),100)
    print (vector)
    sumaSec(vector)
    dim=10
    sT=0
    for k in range(0,len(vector),dim):
        vc=vector[k:k+dim]
        print (vc)
        for item in vc:
            sT=sT+item
        print ("sub list result ",sT)
        sT=0

    chunks=(vector[k:k+dim] for k in range(0,len(vector),10))
    start=time.time()
    with ProcessPoolExecutor(max_workers=workers) as executor:
        futures=[executor.submit(dummyFun,chunk) for chunk in chunks]
    for future in as_completed(futures):
        print (future.result())
    start1=time.time()
    print (start1-start)

if __name__=="__main__":
    main()

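The problem is that for the sequential version I get a time of:

0.0009753704071044922

while the concurrent version takes:

0.10629010200500488

When I reduce the number of workers to 2, the time is:

0.08622884750366211

Why does this happen?

Thanks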

Answer 1

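Your vector has a length of only 100. That is a very small amount of work, so the fixed cost of starting the process pool dominates the runtime. For this reason, parallelism is most beneficial when there is a lot of work to do. Try a much larger vector, for example one with a million elements.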
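As a rough illustration of that fixed cost, the sketch below times starting a pool, running one trivial task per worker, and shutting the pool down, next to summing 100 numbers in the current process. The helper names (noop, pool_overhead, sequential_sum_time) are made up for this example, and the actual timings will differ by machine and operating system.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def noop(_):
    """Trivial task: any time spent here is essentially pool overhead."""
    return 0


def pool_overhead(workers=2):
    """Time pool startup, one trivial task per worker, and shutdown."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as executor:
        list(executor.map(noop, range(workers)))
    return time.perf_counter() - start


def sequential_sum_time(n=100):
    """Sum n integers in the current process and time it."""
    data = list(range(n))
    start = time.perf_counter()
    total = sum(data)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    total, t_seq = sequential_sum_time(100)
    t_pool = pool_overhead(2)
    print("sequential sum:", total, "in", t_seq, "seconds")
    print("pool startup and shutdown alone:", t_pool, "seconds")
```

If the second number is much larger than the first, the pool cannot pay for itself on a vector this small.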

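The second problem is that you give each worker only a tiny amount of work at a time: a chunk of size 10. Again, this means the cost of starting a task cannot be amortized over so little work. Use larger chunks. For example, instead of 10, use int(len(vector)/(workers*10)).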
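A minimal sketch of that chunk-size suggestion, assuming a large vector. The names make_chunks, chunk_sum and parallel_sum are illustrative, and the max(1, ...) guard is an addition so that very short vectors do not produce a chunk size of zero.

```python
from concurrent.futures import ProcessPoolExecutor


def chunk_sum(chunk):
    """Work done by one task: sum a single chunk."""
    return sum(chunk)


def make_chunks(vector, workers, chunks_per_worker=10):
    """Split vector into roughly workers * chunks_per_worker chunks."""
    # Chunk size from the answer: len(vector) / (workers * 10), but at least 1.
    size = max(1, len(vector) // (workers * chunks_per_worker))
    return [vector[k:k + size] for k in range(0, len(vector), size)]


def parallel_sum(vector, workers=5):
    """Sum the vector by summing its chunks in a process pool."""
    chunks = make_chunks(vector, workers)
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return sum(executor.map(chunk_sum, chunks))


if __name__ == "__main__":
    vector = list(range(1_000_000))
    print(parallel_sum(vector) == sum(vector))
```

Note that for the original 100-element vector with 5 workers this formula gives a chunk size of 2, which is smaller than 10; the formula only helps once the vector itself is large.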

Also note that you are creating 5 processes. For a CPU-bound task like this one, you ideally want as many processes as you have physical CPU cores. Either use the number of cores your system has, or use max_workers=None (the default), in which case ProcessPoolExecutor will default to the number of processors on your system. If you use too few processes, you leave performance on the table; if you use too many, the CPU has to switch between them and performance may suffer.
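A short sketch of letting the executor pick the worker count. One caveat: os.cpu_count() reports logical CPUs, which can be higher than the number of physical cores on machines with hyper-threading. The helper names default_worker_count and parallel_sum are made up for illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def chunk_sum(chunk):
    """Work done by one task: sum a single chunk."""
    return sum(chunk)


def default_worker_count():
    """Logical CPU count, falling back to 1 if it cannot be determined."""
    return os.cpu_count() or 1


def parallel_sum(vector, chunk_size):
    """Sum the vector in chunks, letting the executor choose the pool size."""
    chunks = [vector[k:k + chunk_size] for k in range(0, len(vector), chunk_size)]
    # No max_workers argument: the pool size defaults to the machine's processor count.
    with ProcessPoolExecutor() as executor:
        return sum(executor.map(chunk_sum, chunks))


if __name__ == "__main__":
    vector = list(range(1_000_000))
    n = default_worker_count()
    print("workers the default would pick:", n)
    print(parallel_sum(vector, max(1, len(vector) // (n * 10))))
```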

Answer 2

Your chunking creates far too many small tasks. Even once the workers have been created, creating too many tasks still costs time.
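One way to cut the task count, sketched here as an assumption rather than as this answer's exact recommendation, is to submit a single task per worker. The helper split_evenly is hypothetical and uses ceiling division so that no element is dropped.

```python
from concurrent.futures import ProcessPoolExecutor


def chunk_sum(chunk):
    """Work done by one task: sum a single chunk."""
    return sum(chunk)


def split_evenly(vector, parts):
    """Split vector into at most `parts` contiguous slices of near-equal size."""
    size = max(1, -(-len(vector) // parts))  # ceiling division, at least 1
    return [vector[k:k + size] for k in range(0, len(vector), size)]


def parallel_sum(vector, workers=2):
    """Submit one task per worker instead of one task per 10 elements."""
    with ProcessPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(chunk_sum, part)
                   for part in split_evenly(vector, workers)]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    vector = list(range(1_000_000))
    print(parallel_sum(vector, workers=2) == sum(vector))
```

With 2 workers this submits 2 tasks for the whole vector, whereas slicing a million elements into chunks of 10 would submit 100,000.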

Maybe this post can help with your search: How to parallel sum a loop using multiprocessing in Python

Why is this multiprocessing program running slower than the non-concurrent version?
