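While learning about Python's multiprocessing module, I noticed that running a program with more processes than there are physical CPUs makes it execute faster. Why is that?
Here is my test code: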
from multiprocessing import Pool, cpu_count
import time


def my_func(to):
    # CPU-bound work: sum n ** n for n in 1 .. to - 1.
    out = 0
    for n in range(1, to):
        out += n ** n
    return out


def main():
    cpus = cpu_count()
    print('CPU count: %i' % cpus)
    run_args = [4000 * 12]
    for processes in [cpus, cpus * 2, cpus * 3, cpus * 4]:
        # The timer starts before the pool is created, so start-up cost is included.
        start = time.time()
        workers = Pool(processes=processes)
        results = workers.imap_unordered(my_func, run_args)
        for _ in results:
            pass
        elapsed = time.time() - start
        print('procs: %i, time: %s secs' % (processes, elapsed))


if __name__ == '__main__':
    main()
The output on my machine is:
CPU count: 8
procs: 8, time: 6.22111010551 secs
procs: 16, time: 5.89230799675 secs
procs: 24, time: 5.81976008415 secs
procs: 32, time: 5.86776208878 secs
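I had always assumed that using more processes than there are physical CPUs was pointless, but this suggests I was wrong. Would anyone care to explain? Thanks.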
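To put the reported numbers in perspective, here is a small calculation of how much faster each run was than the 8-process run. The timings are copied from the output above; the names `timings`, `baseline`, and `percent_faster` are only illustrative. All three larger pools come out roughly 5 to 6.5 percent faster than the baseline, and these are single measurements, so some of that may be run-to-run variation.

```python
# Timings reported in the question, keyed by pool size (seconds).
timings = {
    8: 6.22111010551,
    16: 5.89230799675,
    24: 5.81976008415,
    32: 5.86776208878,
}

baseline = timings[8]


def percent_faster(elapsed, reference):
    # Percentage by which `elapsed` is shorter than `reference`.
    return (reference - elapsed) / reference * 100.0


for procs in sorted(timings):
    gain = percent_faster(timings[procs], baseline)
    print('procs: %i, %.1f%% faster than 8 procs' % (procs, gain))
```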
Answer 0 (score: 1)
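The CPU may be able to parallelize some of the work by itself. (For example, a CPU can process 16 or more bits in a single step.)
I have actually learned not to worry about having too many threads. Some programs run thousands of threads, and modern CPUs are quite good at parallelizing.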
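One thing worth checking before reading too much into the timings: `run_args` in the question holds a single value (`4000 * 12`), so each pool, whatever its size, is handed exactly one task and a single worker runs it. The sketch below is one way to confirm that by recording which worker PIDs actually executed a task. The names `work` and `pids_used` are hypothetical helpers, not part of the original code, and the workload is made much smaller so the check runs quickly.

```python
from multiprocessing import Pool
import os


def work(to):
    # Same kind of CPU-bound loop as my_func, but it also reports
    # the PID of the process that ran it.
    out = 0
    for n in range(1, to):
        out += n ** n
    return os.getpid(), out


def pids_used(processes, run_args):
    # Return the set of worker PIDs that actually executed a task.
    pool = Pool(processes=processes)
    try:
        return set(pid for pid, _ in pool.imap_unordered(work, run_args))
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    for processes in [2, 4, 8]:
        # One work item, mirroring the single-element run_args in the question.
        used = pids_used(processes, [200])
        print('pool size: %i, workers that ran a task: %i' % (processes, len(used)))
```

If the intent was to keep every worker busy, passing several work items (at least as many as there are processes) would be needed. That is a suggestion, not something the question states.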