Passing multiple iterables to multiprocessing pool.map

Time: 2018-08-26 03:18:42

Tags: python multiprocessing

I have the following function (some_function) that I want to pass to pool.map. The code works, but it is much slower than serial processing. Notably, I have 4 million grouped objects, hence the desire to parallelize.

I think the slowdown is because dict_dfs (the dictionary of dataframes) is being shared. So if 5 cores are available, the five groups being processed in parallel may need to access the same dataframe in the dictionary. Is my reasoning correct? If so, how do I overcome this problem?
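
For comparison, a minimal sketch of the serial version of this workflow, assuming the per-group results are simply collected in a list and concatenated (the exact serial code is an assumption here):

import pandas as pd

# Hypothetical serial baseline: call some_function on each group in turn.
# grp_list, cm and dict_dfs are prepared as described in the sections below.
results = [some_function(grp, cm, dict_dfs) for grp in grp_list]
final_df = pd.concat(results, ignore_index=True)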

The map function (some_function, shown at the end of the post)

Prepare the grp_list and the dictionary of dataframes
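
A minimal sketch of what this preparation might look like, assuming the groups come from a pandas groupby on an 'id' column and that dict_dfs maps names to lookup dataframes (the file names, the grouping column and the lookup keys are all hypothetical):

import pandas as pd

# Hypothetical main dataframe grouped on an 'id' column; each element of
# grp_list is a (key, sub-dataframe) pair, i.e. one grouped object.
df_main = pd.read_csv("main_data.csv")
grp_list = list(df_main.groupby("id"))      # roughly 4 million groups

# Hypothetical dictionary of lookup dataframes, keyed by name.
dict_dfs = {
    "lookup_a": pd.read_csv("lookup_a.csv"),
    "lookup_b": pd.read_csv("lookup_b.csv"),
}

cm = 1.5                                    # the float64 constant passed to some_function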

Create the partial function and run pool.map
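
A minimal sketch of this step, assuming functools.partial is used to bind cm and dict_dfs so that pool.map only iterates over grp_list; the number of processes and the final concatenation are assumptions (some_function itself is shown below):

import multiprocessing as mp
from functools import partial

import pandas as pd

if __name__ == "__main__":
    # Bind the arguments that are identical for every group, so the mapped
    # callable takes a single grouped object.
    func = partial(some_function, cm=cm, dict_dfs=dict_dfs)

    with mp.Pool(processes=5) as pool:      # e.g. 5 available cores
        results = pool.map(func, grp_list)

    # Assumed final step: combine the per-group dataframes.
    final_df = pd.concat(results, ignore_index=True)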

def some_function(grp_list, cm, dict_dfs):
    """
    This function takes a single group object from the grp_list, performs
    some operations on it and returns the grouped object as a dataframe.

    Args:
    cm = a float64 constant
    dict_dfs = a dictionary of dataframes, one of which is accessed and
    operated on
    grp_list = a single grouped object taken from the list of groups

    Return:
    the group as a dataframe after the operations have been performed

    """

    # Operates on the group and does something cool, producing a dataframe df

    return df

0 Answers:

There are no answers