I found this nice function, pandas.merge_asof.
From the documentation:
pandas.merge_asof(left, right, on=None, left_on=None, right_on=None)
Parameters:
left : DataFrame
right : DataFrame
on : label
Field name to join on. Must be found in both DataFrames.
The data MUST be ordered.
Furthermore this must be a numeric column, such as datetimelike, integer, or float.
On or left_on/right_on must be given.
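It works as expected.
However, my merged DataFrame keeps only the on column from left, not the one originally in right:
mydf=pandas.merge_asof(left, right, on='Time')
I need to keep both of them, so that mydf contains Time from both left and right.
Sample data: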
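import pandas as pd

# left frame: 100 timestamps spaced 6 h 3 min apart
a = pd.DataFrame(data=pd.date_range('20100201', periods=100, freq='6h3min'), columns=['Time'])
# right frame: 24 hourly timestamps with a running counter
b = pd.DataFrame(data=pd.date_range('20100201', periods=24, freq='1h'), columns=['Time'])
b['val'] = range(b.shape[0])
# for each row of a, take the first row of b at or after it, at most 30 min away
out = pd.merge_asof(a, b, on='Time', direction='forward', tolerance=pd.Timedelta('30min'))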
Answer 0 (score: 6)
I think one possible solution is to rename the columns:
out = pd.merge_asof(a.rename(columns={'Time':'Time1'}),
                    b.rename(columns={'Time':'Time2'}),
                    left_on='Time1',
                    right_on='Time2',
                    direction='forward',
                    tolerance=pd.Timedelta('30min'))
print(out.head())
                Time1      Time2  val
0 2010-02-01 00:00:00 2010-02-01  0.0
1 2010-02-01 06:03:00        NaT  NaN
2 2010-02-01 12:06:00        NaT  NaN
3 2010-02-01 18:09:00        NaT  NaN
4 2010-02-02 00:12:00        NaT  NaN
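A variation on the same idea, in case you would rather keep on='Time': copy the key of the right frame into an extra column before merging, so the matched timestamp survives the merge. This is only a sketch built on the sample data from the question; the column name Time_right is made up for illustration and is not something pandas produces on its own.

```python
import pandas as pd

# Sample frames from the question
a = pd.DataFrame({'Time': pd.date_range('20100201', periods=100, freq='6h3min')})
b = pd.DataFrame({'Time': pd.date_range('20100201', periods=24, freq='1h')})
b['val'] = range(b.shape[0])

# Duplicate the right-hand key under a hypothetical name ('Time_right')
# so it is carried through the merge as an ordinary column.
b_keep = b.assign(Time_right=b['Time'])

out = pd.merge_asof(a, b_keep,
                    on='Time',
                    direction='forward',
                    tolerance=pd.Timedelta('30min'))
print(out.head())
```

Here Time holds the timestamps from a, and Time_right holds the matched timestamp from b, or NaT where nothing falls within the tolerance.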