To determine which step was consuming most of the compute time, I ran cProfile and got the following results:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.014 0.014 216.025 216.025 func_poolasync.py:2(<module>)
11241 196.589 0.017 196.589 0.017 {method 'acquire' of 'thread.lock' objects}
982 0.010 0.000 196.532 0.200 threading.py:309(wait)
1000 0.002 0.000 196.498 0.196 pool.py:565(get)
1000 0.005 0.000 196.496 0.196 pool.py:557(wait)
515856/3987 0.350 0.000 13.434 0.003 artist.py:230(stale)
Clearly most of the time is spent in method 'acquire' of 'thread.lock' objects. I am not using threads; instead I am using pool.apply_async with several processes, so I am confused as to why thread.lock is the problem. I would like to understand why this is the bottleneck, and how this time can be reduced.
The code is as follows:
import pickle
import time
from multiprocessing import Pool

# func (the worker function) is defined elsewhere and not shown here

path = '/usr/home/work'
filename = 'filename'
with open(path + filename + '/' + 'result.pickle', 'rb') as f:
    pdata = pickle.load(f)

if __name__ == '__main__':
    pool = Pool()
    data = list(range(1000))
    print('START')
    start_time = int(round(time.time()))
    result_objects = [pool.apply_async(func, args=(nomb, pdata[0], pdata[1], pdata[2]))
                      for nomb in data]
    results = [r.get() for r in result_objects]
    pool.close()
    pool.join()
    print('END', int(round(time.time())) - start_time)
Update:
By switching from pool.apply_async to pool.map, I was able to cut the execution time by roughly a factor of 3.
Output:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.113 0.113 70.824 70.824 func.py:2(<module>)
4329 28.048 0.006 28.048 0.006 {method 'acquire' of 'thread.lock' objects}
4 0.000 0.000 28.045 7.011 threading.py:309(wait)
1 0.000 0.000 28.044 28.044 pool.py:248(map)
1 0.000 0.000 28.044 28.044 pool.py:565(get)
1 0.000 0.000 28.044 28.044 pool.py:557(wait)
Revised code:
import time
from functools import partial
from multiprocessing import Pool

# func and pdata are defined as in the original version above

if __name__ == '__main__':
    pool = Pool()
    data = list(range(1000))
    print('START')
    start_time = int(round(time.time()))
    funct = partial(func, pdata[0], pdata[1], pdata[2])
    results = pool.map(funct, data)
    print('END', int(round(time.time())) - start_time)
However, I have found that some iterations now produce nonsensical results. I am not sure why this happens, but I can see that the rate-determining step is still method 'acquire' of 'thread.lock' objects.
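One thing worth checking, assuming func's signature matches the original call func(nomb, pdata[0], pdata[1], pdata[2]): partial binds its arguments to the leftmost parameters, so partial(func, pdata[0], pdata[1], pdata[2]) makes map pass nomb as the fourth argument rather than the first. A minimal demonstration with a stand-in func that just reports its arguments:

```python
from functools import partial

def func(nomb, a, b, c):
    # stand-in for the real worker; simply returns its arguments
    return (nomb, a, b, c)

# Original call order: func(nomb, pdata[0], pdata[1], pdata[2])
print(func(7, 'a', 'b', 'c'))       # (7, 'a', 'b', 'c')

# partial binds positionally from the left, so nomb lands in the last slot:
bound = partial(func, 'a', 'b', 'c')
print(bound(7))                     # ('a', 'b', 'c', 7)

# Binding with keywords instead keeps nomb in the first slot:
bound_kw = partial(func, a='a', b='b', c='c')
print(bound_kw(7))                  # (7, 'a', 'b', 'c')
```

If func's parameters are order-sensitive, this reversal would quietly produce wrong answers for exactly the kind of "nonsensical results" described above.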