Why is this threaded version slower than a list comprehension?

Time: 2016-09-23 15:33:22

Tags: python multithreading list set multiprocessing

I have a function that returns the "support" of a given itemset, that is, the fraction of transactions in a list that contain it, as shown in the lines below:

def support_tuple(items):
    count = float(sum([1 for row in rows_tuple if (items in row)]))
    supp = count/n_rows
    return (items, supp)

if __name__ == "__main__":
    from multiprocessing.dummy import Pool as ThreadPool
    import multiprocessing as mp

    pairs = [('apple', 'banana'), ('cookie', 'popsicle'), ('candy', 'cookie'), ...]

    # grocery transaction data
    rows_tuple = [{('margarine', 'margarine'), ('citrus', 'semi-finished'), ('bread', 'bread'), ('citrus', 'citrus')}, {('bread', 'fruit'), ('citrus', 'margarine'), ('ready', 'bread'), ('semi-finished', 'fruit'), ('soups', 'margarine'), ('margarine', 'soups')}, {('fruit', 'margarine'), ... }]
    # number of transactions; support_tuple divides by this (assumed definition, missing from the post)
    n_rows = len(rows_tuple)

    res_list_comprehension = [support_tuple(pair) for pair in pairs]

    threadpool = ThreadPool(mp.cpu_count())
    res_threading = threadpool.map(support_tuple, pairs, chunksize = 100)

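In practice, rows_tuple has 18,000 entries and pairs has 9,000, but my main question is: why does the list comprehension outperform the thread pool in this case? Am I completely missing a way to use threads that would give a significant speedup?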
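For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch. It assumes Python 3 and uses small synthetic data in place of the grocery transactions; the `timed` and `compare` helpers and the data generator are hypothetical names introduced only for illustration. One likely factor, assuming standard CPython: the global interpreter lock lets only one thread execute Python bytecode at a time, so a CPU-bound function such as support_tuple would not be expected to run faster in a thread pool, and the pool adds scheduling overhead on top.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool

# Synthetic stand-in data (an assumption, not the original grocery transactions).
rows_tuple = [{(i % 7, j % 5) for j in range(6)} for i in range(2000)]
pairs = [(a, b) for a in range(7) for b in range(5)] * 10
n_rows = len(rows_tuple)


def support_tuple(items):
    # Fraction of transactions (rows) that contain the given item pair.
    count = sum(1 for row in rows_tuple if items in row)
    return (items, count / n_rows)


def timed(fn):
    # Run fn once and return its result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def compare(n_threads=4):
    # Same work done serially and through a thread pool, timed separately.
    serial, t_serial = timed(lambda: [support_tuple(p) for p in pairs])
    with ThreadPool(n_threads) as pool:
        threaded, t_threads = timed(
            lambda: pool.map(support_tuple, pairs, chunksize=100)
        )
    return serial, threaded, t_serial, t_threads


if __name__ == "__main__":
    serial, threaded, t_serial, t_threads = compare()
    print("results identical:", serial == threaded)
    print("list comprehension: %.4f s" % t_serial)
    print("thread pool:        %.4f s" % t_threads)
```

Both paths should return identical results; only the timings differ, and they will vary by machine. Swapping `multiprocessing.dummy.Pool` for `multiprocessing.Pool` would use separate processes instead of threads, at the cost of pickling the arguments and results.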

0 answers:

No answers