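I am creating multiple processes to speed up writing zip codes to a DataFrame after hitting an API. For about 1,100 requests, the job takes more than 8 minutes without multiprocessing. With multiprocessing, it prints the zip codes in under 2 minutes, but when I try to add those values to 5 separate DataFrames, the DataFrames always end up empty. Most likely I am missing something in how I declare the DataFrames, but I am not sure. Here is the code.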
import multiprocessing

import numpy as np
import pandas as pd

vehicles_free['zone'] = vehicles_free['Lat_Lon'].apply(lambda x: reverse_geocode(x))
# this works fine but takes 8 minutes
len_vehicles_free = vehicles_free.shape[0]
free_zones1 = pd.DataFrame()
free_zones2 = pd.DataFrame()
free_zones3 = pd.DataFrame()
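def series1():
    global free_zones1
    # first fifth of the rows
    free_zones1['zone'] = vehicles_free['Lat_Long'][0:int(np.round(len_vehicles_free/5))].apply(lambda x: reverse_geocode(x))
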
def series2():
    global free_zones2
    free_zones2['zone'] = vehicles_free['Lat_Long'][int(np.round(len_vehicles_free/5))+1:int(2*np.round(len_vehicles_free/5))].apply(lambda x: reverse_geocode(x))

def series3():
    global free_zones3
    free_zones3['zone'] = vehicles_free['Lat_Long'][int(np.round(2*len_vehicles_free/5))+1:int(np.round(3*len_vehicles_free/5))].apply(lambda x: reverse_geocode(x))

p1 = multiprocessing.Process(target=series1)
p2 = multiprocessing.Process(target=series2)
p3 = multiprocessing.Process(target=series3)
p1.start()
p2.start()
p3.start()
p1.join()
p2.join()
p3.join()
# All of the DataFrames free_zones1, ..., free_zones5 are empty after this. What is wrong here?
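To narrow this down, here is a minimal sketch that should show the same symptom without pandas or the API. Everything in it is a stand-in: `results` is a plain list playing the role of `free_zones1`, and `fake_reverse_geocode` is a hypothetical placeholder for `reverse_geocode` that makes no request.

```python
import multiprocessing

# Hypothetical stand-in for free_zones1: a module-level container
# that the worker function tries to fill.
results = []


def fake_reverse_geocode(lat_lon):
    """Hypothetical placeholder for reverse_geocode; makes no API call."""
    return "zip-for-{}".format(lat_lon)


def series1():
    # Runs in the child process, so it appends to the child's own
    # copy of `results`, not to the parent's.
    global results
    results.append(fake_reverse_geocode("12.97,77.59"))
    print("inside child:", results)


def run_demo():
    p1 = multiprocessing.Process(target=series1)
    p1.start()
    p1.join()
    # Back in the parent: return what the parent's `results` holds now.
    return list(results)


if __name__ == "__main__":
    print("in parent after join:", run_demo())
```

The child prints a filled list, but the list seen by the parent after `join()` is still empty. That matches what I see with the DataFrames, so the issue does not look specific to pandas or to how the DataFrames are declared.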
I then merge all of the DataFrames into a single one at the end.
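The merge step is roughly the following sketch. The three frames and their zone values are made-up placeholders rather than my real data, and I am assuming `pd.concat` is the right tool for stacking the chunks back together.

```python
import pandas as pd

# Hypothetical stand-ins for the per-chunk frames; the zone values are
# invented, and the index mimics the row labels of each slice.
free_zones1 = pd.DataFrame({"zone": ["560001", "560002"]}, index=[0, 1])
free_zones2 = pd.DataFrame({"zone": ["560003"]}, index=[2])
free_zones3 = pd.DataFrame({"zone": ["560004", "560005"]}, index=[3, 4])

# Stack the chunks row-wise. The original row labels are kept, so the
# result can be aligned back onto vehicles_free by index.
free_zones = pd.concat([free_zones1, free_zones2, free_zones3])
print(free_zones)
```

Because the slices keep their original row labels, the combined frame lines up with the rows of `vehicles_free` without resetting the index.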