Multithreading with pool.map takes longer than a normal single process

Time: 2016-11-04 20:10:19

Tags: python multithreading threadpool

I want to parallelize a task in Python, so I read about pool.map, where the data is split into chunks and each chunk is handled by a separate process (thread). I have a huge dictionary (2 million words) and a text file of sentences; the idea is to split each sentence into words, match every word against the existing dictionary, and do further processing based on the result. Before doing that, I wrote a dummy program to check how pool.map behaves, but it does not work as expected: a single process takes less time than multiple processes. (I use the terms process and thread interchangeably here, since I assumed each thread is nothing but a process.)
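For context, a rough sketch of what the real task would look like with pool.map; the file path, the tiny word_dict, and the process_match / lookup helpers below are placeholders for illustration, not code I have benchmarked:

from multiprocessing.pool import ThreadPool

# placeholder dictionary; the real one holds ~2 million entries
word_dict = {'hello': 1, 'world': 2}

def process_match(entry):
    # placeholder for whatever further processing is done per match
    return entry

def lookup(word):
    # match one word against the existing dictionary
    if word in word_dict:
        return process_match(word_dict[word])
    return None

def run(path):
    # 'path' points at the text file of sentences
    with open(path) as f:
        words = [w for line in f for w in line.split()]
    threads = 4
    pool = ThreadPool(threads)
    results = pool.map(lookup, words, max(1, len(words) / threads))
    pool.close()
    pool.join()
    return results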

import time
from multiprocessing.pool import ThreadPool   # thread-based pool

def add_1(x):
    return (x * x + x)

def main():
    iter = 10000000
    num = [i for i in xrange(iter)]
    threads = 4
    pool = ThreadPool(threads)
    start = time.time()
    # chunksize = iter/threads, so each worker gets one large chunk
    results = pool.map(add_1, num, iter / threads)
    pool.close()
    pool.join()
    end = time.time()
    print('Total Time Taken = %f' % (end - start))

if __name__ == '__main__':
    main()

Total times for 1-4 threads:

Thread 1    Total Time Taken = 2.252361
Thread 2    Total Time Taken = 2.560798
Thread 3    Total Time Taken = 2.938640
Thread 4    Total Time Taken = 3.048179

Just using pool = ThreadPool():
def main():
    iter = 10000000
    num = [i for i in xrange(iter)]
    # pool = ThreadPool(threads)
    pool = ThreadPool()                # default: one worker per CPU
    start = time.time()
    # results = pool.map(add_1, num, iter/threads)
    results = pool.map(add_1, num)     # default chunksize
    pool.close()
    pool.join()
    end = time.time()
    print('Total Time Taken = %f' % (end - start))

Total Time Taken = 3.031125

Running the loop normally works fine:

def main():
    iter = 10000000
    start = time.time()
    for k in xrange(iter):      # plain sequential loop, no pool
        add_1(k)
    end = time.time()
    print('Total Time normally = %f' % (end - start))

Total Time normally = 1.087591

Configuration: I am using Python 2.7.6
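For what it's worth, ThreadPool (from multiprocessing.pool) runs its workers as threads inside a single process; a process-based version of the same benchmark would use multiprocessing.Pool instead, roughly as in the sketch below (untimed, shown only to make the thread/process distinction concrete):

import time
from multiprocessing import Pool   # process-based pool, unlike ThreadPool

def add_1(x):
    return (x * x + x)

def main():
    iter = 10000000
    num = [i for i in xrange(iter)]
    pool = Pool(4)                              # 4 worker processes
    start = time.time()
    results = pool.map(add_1, num, iter / 4)    # same chunking as above
    pool.close()
    pool.join()
    end = time.time()
    print('Total Time Taken = %f' % (end - start))

if __name__ == '__main__':   # guard so worker processes can import this module safely
    main()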

0 Answers:

No answers yet.