This is my code. It works, but it is a bit slow.

    def fsl(Df,p):
        NewDf = Df.copy()
        if p=='group1':
            # Drop columns left over from an earlier run, if they exist.
            try:
                del NewDf['sl']
                del NewDf['Y2030']
            except KeyError:
                pass
        selected_clusters = NewDf.loc[(NewDf['Group']==p) & (NewDf['Selected']=='Y'),'Clusters'].tolist()
        for i in selected_clusters:
            # Raise x in steps of 100 until the surplus is no longer positive (capped at 100000).
            x = 0
            surplus = calc_surplus(x,i,p)
            while (surplus > 0) and (x < 100000):
                x += 100
                surplus = calc_surplus(x,i,p)
            NewDf.loc[(NewDf['Clusters']==i) & (NewDf['Group']==p),'sl']=x
        if p=='group1':
            NewDf['sl'] = NewDf['sl'].fillna(0)
        return NewDf

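
For illustration, the search that runs for each cluster can be isolated as below. This is only a minimal sketch: find_sl and toy_surplus are hypothetical names, and toy_surplus is a made-up stand-in for calc_surplus, whose real definition is not shown in this post.

```python
def find_sl(surplus_fn, step=100, limit=100000):
    """Return the first multiple of `step` at which surplus_fn(x) is no
    longer positive, stopping at `limit` if that never happens."""
    x = 0
    surplus = surplus_fn(x)
    while surplus > 0 and x < limit:
        x += step
        surplus = surplus_fn(x)
    return x


def toy_surplus(x):
    # Hypothetical stand-in for calc_surplus(x, i, p): the surplus shrinks
    # linearly and crosses zero at x = 250.
    return 250 - x


print(find_sl(toy_surplus))
```

Each cluster's search depends only on its own inputs, which is why the iterations look like good candidates for running in parallel.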
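I would like the surplus calculation for each i in selected_clusters to run in parallel to speed up processing.
I moved the code that should run in parallel into a new function and tried to run it with multiprocessing.Pool like this: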

    from multiprocessing import Pool

    def parallel(i):
        x = 0
        surplus = calc_surplus(x,i,p)
        while (surplus > 0) and (x < 100000):
            x += 100
            surplus = calc_surplus(x,i,p)
        NewDf.loc[(NewDf['Clusters']==i) & (NewDf['Group']==p),'sl']=x
        if p=='group1':
            NewDf['sl'] = NewDf['sl'].fillna(0)

    def fsl(Df,p):
        NewDf = Df.copy()
        if p=='group1':
            try:
                del NewDf['sl']
                del NewDf['Y2030']
            except KeyError:
                pass
        selected_clusters = NewDf.loc[(NewDf['Group']==p) & (NewDf['Selected']=='Y'),'Clusters'].tolist()
        if __name__ == '__main__':
            with Pool(4) as pool:
                pool.map(parallel,[i for i in selected_clusters])
        return NewDf

The problem is that when I call the function fsl, the function parallel never runs, and the column sl is never created. I think the error is in pool.map or in the if __name__ == '__main__' check, but I really can't seem to figure it out.
I have seen other threads on multiprocessing, but most of them don't quite apply here. What is wrong with my code?
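
For comparison, here is a minimal sketch of one way this kind of loop is often restructured for a process pool. It rests on three facts: __name__ equals '__main__' only in the module that is run as the script, so a guard placed inside fsl is skipped whenever fsl lives in an imported module; the worker above refers to p and NewDf, which are not defined in its own scope; and a child process works on its own copy of the parent's objects, so assignments made there do not reach the parent's DataFrame. The names find_sl and compute_sl are hypothetical, and calc_surplus below is a made-up stand-in because the real one is not shown. The DataFrame is left out so that the sketch runs with the standard library alone.

```python
from functools import partial
from multiprocessing import Pool


def calc_surplus(x, i, p):
    # Hypothetical stand-in: the real calc_surplus is not shown in the post.
    # Here the surplus for cluster i crosses zero at x = 150 * i.
    return 150 * i - x


def find_sl(i, p):
    """Worker: run the search for one cluster and return (cluster, sl).

    It returns its result instead of writing to a DataFrame, because a
    child process only sees its own copy of the parent's data.
    """
    x = 0
    surplus = calc_surplus(x, i, p)
    while surplus > 0 and x < 100000:
        x += 100
        surplus = calc_surplus(x, i, p)
    return i, x


def compute_sl(selected_clusters, p, processes=4):
    """Map find_sl over the clusters in a pool and collect {cluster: sl}."""
    with Pool(processes) as pool:
        # partial binds p so that pool.map only has to supply i.
        return dict(pool.map(partial(find_sl, p=p), selected_clusters))


if __name__ == "__main__":
    # The guard sits at the script's entry point, not inside a function.
    sl_by_cluster = compute_sl([1, 2, 3], "group1")
    print(sl_by_cluster)
    # Back in the parent process, each value could then be written with e.g.
    # NewDf.loc[(NewDf['Clusters'] == i) & (NewDf['Group'] == p), 'sl'] = x
```

Whether this is actually faster depends on how expensive calc_surplus is: for cheap calls, the cost of starting processes and pickling arguments can outweigh the gain.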