I need to merge two data frames under a complex condition. Here are the two data frames:
dock_id dock_name avail_bikes avail_docks \
0 3082 Hope St & Union Ave 8 16
1 468 Broadway & W 55 St 0 59
2 407 Henry St & Poplar St 22 15
3 3016 Kent Ave & N 7 St 29 16
status_key datehour ... visi vism wdird wdire \
0 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
1 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
2 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
3 1 2016-06-01 19:25:00 ... NaN NaN NaN NaN
tot_docks _lat _long in_service
0 25 40.711674 -73.951413 1
1 59 40.765265 -73.981923 1
2 37 40.700469 -73.991454 1
3 47 40.720368 -73.961651 1
...
Start Date/Time End Date/Time Event Agency \
0 01/01/2016 12:00:00 AM 01/01/2016 02:00:00 AM Parks Department
1 01/02/2016 12:00:00 AM 01/02/2016 02:00:00 AM Parks Department
2 01/03/2016 12:00:00 AM 01/03/2016 02:00:00 AM Parks Department
3 01/04/2016 12:00:00 AM 01/04/2016 02:00:00 AM Parks Department
latitude longitude
0 40.782865 -73.965355
1 40.782865 -73.965355
2 40.782865 -73.965355
3 40.782865 -73.965355
4 40.782865 -73.965355
I want to join them on this condition:
Start Date/Time <= datehour <= End Date/Time and distance(_lat, _long, latitude, longitude) < d
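I know I could do this by merging the data and then applying a filter to the result, but the data sets are too large (10263241 rows and 401080 rows), so I don't think that approach will finish in a reasonable time.
Do you know how we could solve this?
Thanks for your answers!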
Answer 0: (score: 0)
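import pandas as pd ... new_frame = pd.merge(dataframe1, dataframe2, on=key_columns)
Here key_columns is the column name (or list of column names) to match on. Note that `on` performs an equality join on those columns; it does not accept an arbitrary condition such as a date range or a distance threshold.
For a more advanced merge, we can also restrict each frame to specific columns first: dataframe[['column1', 'column2', ...]]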
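Since pd.merge only does equality joins, the range-and-distance condition in the question has to be built from smaller steps. The following is a minimal sketch of one possible approach, not a tested solution for the full data: it assumes pandas 1.2 or later (for how="cross"), assumes the date columns have already been converted to datetimes, and assumes d is measured in kilometres using the haversine formula. The function names haversine_km and conditional_join and the parameter max_km are made up for illustration. The idea is to apply the distance test only to the distinct dock and event coordinates, which are far fewer than the rows, and then apply the time window after two ordinary equality merges.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumes distances are wanted in km


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km. Inputs are in degrees and may be scalars or Series."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def conditional_join(status, events, max_km):
    """Join status rows to events that are close in space and cover datehour.

    Hypothetical helper: column names follow the frames shown in the question.
    """
    # Step 1: distance test on distinct coordinates only (small cross join).
    docks = status[["_lat", "_long"]].drop_duplicates()
    sites = events[["latitude", "longitude"]].drop_duplicates()
    pairs = docks.merge(sites, how="cross")
    dist = haversine_km(pairs["_lat"], pairs["_long"],
                        pairs["latitude"], pairs["longitude"])
    pairs = pairs[dist < max_km]

    # Step 2: plain equality merges through the surviving coordinate pairs.
    joined = (status.merge(pairs, on=["_lat", "_long"])
                    .merge(events, on=["latitude", "longitude"]))

    # Step 3: keep rows where Start Date/Time <= datehour <= End Date/Time.
    in_window = ((joined["datehour"] >= joined["Start Date/Time"])
                 & (joined["datehour"] <= joined["End Date/Time"]))
    return joined[in_window].reset_index(drop=True)
```

Before calling it, the event timestamps shown in the question would need parsing, for example with pd.to_datetime(events["Start Date/Time"], format="%m/%d/%Y %I:%M:%S %p"). The frame built in step 2 can still be large when many events share one location, so with about 10 million status rows it is probably safer to pass status in slices and concatenate the results.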