Why do imports have a cost in multiprocessing?

Time: 2014-12-03 11:01:28

Tags: python import multiprocessing

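Basically, the more modules I import, the longer these multiprocessing tasks take, even when none of the imported modules' functionality is used. Does each process have to re-import everything? What is going on?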

import time

time1 = time.time()

import multiprocessing as mp
import numpy as np  # Random imports (not used)
import PIL
import PySide
import pandas
# print time.time() - time1  # here this prints 0.0

class Multi(object):
    def __init__(self, queue):
        self.q = queue

    def run(self, a):
        # Pass the instance's own queue rather than relying on the global q.
        p = mp.Process(target=f, args=(a, self.q))
        p.start()
        print self.q.get()
        p.join()


class MultiPool(object):
    def __init__(self, N):
        self.N = N
        self.pool = mp.Pool(processes=self.N)

    def run(self):
        result = self.pool.map_async(f1, ((i,) for i in range(self.N)))
        print result.get()


def f(a, q):
    for i in range(10000000):
        b = i
    q.put(b)

def f1(a):
    for i in range(10000000):
        b = i
    return b

if __name__ == '__main__':

    q = mp.Queue()
    e = Multi(q)

    # time1 = time.time()
    print f1(0)
    print time.time() - time1

    time1 = time.time()
    e.run('123')
    print time.time() - time1

    time1 = time.time()
    mpool = MultiPool(2)
    mpool.run()
    print time.time() - time1

# Output with random imports:
>9999999
>0.246000051498
>9999999
>0.693000078201
>[9999999, 9999999]
>0.720999956131

# Output without imports:
>9999999
>0.246000051498
>9999999
>0.315999984741
>[9999999, 9999999]
>0.313999891281

1 answer:

Answer 0 (score: 1)

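multiprocessing has to import everything again in each process, because these are processes (new instances of the application), not threads. With the spawn start method, which is the only one available on Windows, each child starts a fresh interpreter and re-imports the main module, so its top-level imports run again.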
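
To make the point about separate processes concrete, here is a minimal sketch showing that a newly started interpreter does not inherit the parent's already-imported modules. It uses `subprocess` as a rough approximation of what a spawned worker starts from, not multiprocessing itself. The helper name `loaded_in_fresh_interpreter` is hypothetical, and `colorsys` is only a lightweight stand-in for a heavy import such as numpy or pandas.

```python
import subprocess
import sys

import colorsys  # stand-in for a heavy import; now loaded in this (parent) process


def loaded_in_fresh_interpreter(module_name):
    """Return True if a newly started interpreter already has module_name loaded."""
    code = "import sys; print({!r} in sys.modules)".format(module_name)
    out = subprocess.check_output([sys.executable, "-c", code])
    return out.decode().strip() == "True"


if __name__ == "__main__":
    # The parent has the module cached in sys.modules ...
    print("parent:", "colorsys" in sys.modules)
    # ... but a fresh interpreter starts without it and would have to import it again.
    print("fresh interpreter:", loaded_in_fresh_interpreter("colorsys"))
```

Anything the main module imports at top level therefore has to be loaded again in each such process before the target function can run.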

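What your script measures is the cost of executing the method plus the cost of creating the process. You can measure the import cost separately: imports are executed exactly where the import statement appears.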
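
One way to measure the import cost on its own is to time the import call directly. The sketch below is an illustration, not the asker's code: `timed_import` is a hypothetical helper, and the stdlib modules `decimal` and `xml.dom.minidom` are stand-ins for numpy, PIL, PySide and pandas, which may not be installed.

```python
import importlib
import time


def timed_import(name):
    """Import the module called `name` and return the elapsed time in seconds."""
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in modules; substitute the real heavy imports if they are available.
    for name in ("decimal", "xml.dom.minidom"):
        first = timed_import(name)   # real import cost, unless already loaded
        second = timed_import(name)  # served from the sys.modules cache
        print("%-16s first: %.6fs  repeat: %.6fs" % (name, first, second))
```

Within one process the repeated import is normally much cheaper because the module is already cached. A newly started process has no such cache, so it pays the first-import cost again.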