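I have the following function (some_function) that I pass to pool.map. The code works, but it is much slower than serial processing. Note that I have 4 million group objects, which is why I want to parallelize.
I think the slowdown happens because dict_dfs (a dictionary of dataframes) is being shared. So if 5 cores are available, the five groups being processed in parallel may all need to access the same dataframe in the dictionary. Is my reasoning correct? If so, how can I overcome this?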
def some_function(grp_list, cm, dict_dfs):
    """
    Take a single group object from grp_list, perform some operations
    on it, and return the grouped object as a dataframe.

    Args:
        cm: float64
        dict_dfs: a dictionary of dataframes, one of which is accessed
            and operated on
        grp_list: a list of grouped objects saved to a list

    Returns:
        The grouped object as a dataframe after performing some operations.
    """
    # Operates on the grp_list and does something cool
    return df