How can I use multiprocessing to speed up this loop (pool.map)?

Asked: 2015-08-19 05:20:28

Tags: python python-3.x multiprocessing

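I wrote this loop to find all combinations of some data sets I have. With iteration_depth = 3 it takes about 13 minutes to run. My laptop has two cores, so I'd like to use multiprocessing to speed it up, but since I don't really know what I'm doing, the syntax and arguments are frustrating me.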

import csv
import itertools as it
import multiprocessing
import os

def FindAllAffordableLineups():
    all_rosters = []
    roster = [None]*8

    for n in range(int(iteration_depth)):
        for catcher in catcher_pool[0:n]:
            roster = [None]*8
            roster[0] = catcher[0]
            for first_baseman in first_pool[0:n]:
                roster[1] = first_baseman[0]
                for second_baseman in second_pool[0:n]:
                    roster[2] = second_baseman[0]
                    for third_baseman in third_pool[0:n]:
                        roster[3] = third_baseman[0]
                        for shortstop in short_pool[0:n]:
                            roster[4] = shortstop[0]
                            for outfielders in it.combinations(of_pool, 3):
                                roster[5:8] = outfielders[0][0], outfielders[1][0], outfielders[2][0]
                                salaryList = []
                                for player1 in roster:
                                    for player2 in player_pool:
                                        if player1 == player2[0]:
                                            salaryList.append(int(player2[3]))
                                if sum(salaryList) <= remaining_salary:
                                    if len(roster) == len(set(roster)):
                                        all_rosters.append(roster[:])
                                        if len(all_rosters) < 50:
                                            print('Number of possible rosters found: ',len(all_rosters))
                                        if len(all_rosters) == 50:
                                            print("Fifty affordable rosters were found. We're not displaying every time we find another one. That would slow us down a lot.")          
                                        salaryList = []
                                if len(all_rosters) > 10**6:
                                    writeRosters = open(os.path.join('Affordable Rosters.csv'), 'w', newline = '')
                                    csvWriter = csv.writer(writeRosters)
                                    for row in all_rosters:
                                        csvWriter.writerow(row)
                                    writeRosters.close()
                                    all_rosters = []
        writeRosters = open(os.path.join('Affordable Rosters.csv'), 'w', newline = '')
        csvWriter = csv.writer(writeRosters)
        for row in all_rosters:
            csvWriter.writerow(row)
        writeRosters.close()

pool = multiprocessing.Pool(processes=2)
r = pool.map(FindAllAffordableLineups())

This gives me:

Traceback (most recent call last):
  File "C:\Users\Owner\Desktop\Multiprocessing\11 - Find Optimal Lineup.py", line 133, in <module>
    r = pool.map(FindAllAffordableLineups())
TypeError: map() missing 1 required positional argument: 'iterable'

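In most of the examples I've seen, the function being defined takes an argument, and that argument comes from the iterable passed to pool.map, but my function doesn't need one. How do I fix this?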

1 Answer:

Answer 0: (score: 0)

Your code is essentially computing the Cartesian product of these iterables:

[catcher_pool, first_pool, second_pool, third_pool, short_pool, it.combinations(of_pool, 3)]

See this post for a clean way to do that: Get the cartesian product of a series of lists?
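As a rough sketch of that idea, itertools.product can stand in for the nested loops. The miniature pools below are made-up placeholders that only mimic the row shape from the question (player name at index 0), and all_lineups is a hypothetical name, not something from your code.

```python
import itertools as it

# Hypothetical miniature pools; each row starts with the player name,
# mirroring the player[0] lookups in the question.
catcher_pool = [("C1",), ("C2",)]
first_pool = [("1B1",), ("1B2",)]
second_pool = [("2B1",)]
third_pool = [("3B1",)]
short_pool = [("SS1",)]
of_pool = [("OF1",), ("OF2",), ("OF3",), ("OF4",)]


def all_lineups():
    """Yield every 8-player roster as a list of player names."""
    infield_pools = [catcher_pool, first_pool, second_pool, third_pool, short_pool]
    outfield_trios = it.combinations(of_pool, 3)
    # product() treats the stream of outfield trios as one more "pool".
    for *infield, outfield in it.product(*infield_pools, outfield_trios):
        yield [p[0] for p in infield] + [p[0] for p in outfield]


lineups = list(all_lineups())
print(len(lineups))
```

Because all_lineups is a generator, rosters can be filtered and written out one at a time instead of being collected in a large list first.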

Then you can write a helper function that filters out the invalid lineups.
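One possible shape for such a helper, assuming player names are unique in player_pool and the salary sits at index 3 as in the question. The name is_valid_lineup, the salary_by_name dictionary, and the sample rows are all illustrative. The dictionary lookup is a substitution for the inner loop over player_pool, not what the original code does.

```python
def is_valid_lineup(roster, salary_by_name, remaining_salary):
    """Return True if the roster has no repeated player and fits the budget."""
    if len(roster) != len(set(roster)):
        return False
    return sum(salary_by_name[name] for name in roster) <= remaining_salary


# Hypothetical rows in the question's shape: name at index 0, salary at index 3.
player_pool = [
    ("A", "C", "NYY", "3000"),
    ("B", "1B", "BOS", "4000"),
    ("C", "OF", "LAD", "5000"),
]
# Build the lookup once instead of rescanning player_pool for every roster.
salary_by_name = {row[0]: int(row[3]) for row in player_pool}

print(is_valid_lineup(["A", "B"], salary_by_name, 8000))  # within budget
print(is_valid_lineup(["A", "A"], salary_by_name, 8000))  # repeated player
print(is_valid_lineup(["B", "C"], salary_by_name, 8000))  # over budget
```

Checking for duplicates before summing salaries skips the more expensive step for rosters that would be rejected anyway.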

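That should give you a reasonable speedup. If you still want to parallelize, I would exploit the tree-like structure of your code: choose the first few players sequentially (which gives you a list of tuples of those first few players), then map those tuples through the rest of your function in parallel.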
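A minimal sketch of that split, with made-up pools, salaries, and budget (CATCHERS, FIRST, SECOND, OUTFIELD, SALARY, BUDGET, complete_rosters, and find_all are all hypothetical names). Note that pool.map takes the function object itself, without parentheses, plus an iterable of arguments. That is what the TypeError in the question is complaining about. The `if __name__ == "__main__":` guard matters on Windows, where worker processes re-import the script.

```python
import itertools as it
import multiprocessing

# Hypothetical data: a shortened roster (catcher, first, second, 3 outfielders).
CATCHERS = ["C1", "C2"]
FIRST = ["1B1", "1B2"]
SECOND = ["2B1", "2B2"]
OUTFIELD = ["OF1", "OF2", "OF3", "OF4"]
SALARY = {"C1": 10, "C2": 20, "1B1": 10, "1B2": 30, "2B1": 10, "2B2": 20,
          "OF1": 10, "OF2": 10, "OF3": 10, "OF4": 10}
BUDGET = 70


def complete_rosters(prefix):
    """Extend one (catcher, first baseman) prefix with every remaining choice."""
    found = []
    for second in SECOND:
        for trio in it.combinations(OUTFIELD, 3):
            roster = list(prefix) + [second] + list(trio)
            if sum(SALARY[name] for name in roster) <= BUDGET:
                found.append(roster)
    return found


def find_all(processes=2):
    """Choose the prefixes sequentially, then finish each one in a worker."""
    prefixes = list(it.product(CATCHERS, FIRST))
    with multiprocessing.Pool(processes=processes) as pool:
        # Function object first, iterable second; each prefix is one task.
        chunks = pool.map(complete_rosters, prefixes)
    return [roster for chunk in chunks for roster in chunk]


if __name__ == "__main__":
    print(len(find_all()))
```

Each worker returns its own list and the parent merges them, so no state is shared between processes. Writing the CSV would then happen once in the parent rather than inside the workers.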