I have two pandas DataFrames:
import pandas as pd

timestamp1 = ['2018-10-01 00:01:49.800000000', '2018-10-01 00:01:52.900000000', '2018-10-01 00:04:18.857741600']
df1 = pd.DataFrame(timestamp1, columns=['timestamp'])
timestamp2 = [['2018-10-01 00:01:50.230', 'John'], ['2018-10-01 00:01:52.560', 'Jill'], ['2018-10-01 00:04:19.100', 'Jack']]
df2 = pd.DataFrame(timestamp2, columns=['timestamp', 'name'])
I want to merge the two frames on the timestamp (t), where t from df1(t) >= df2(t). The output I am looking for is:
timestamp_df1, timestamp_df2, name
2018-10-01 00:01:49.800000000 2018-10-01 00:01:50.230 John
2018-10-01 00:01:52.900000000 2018-10-01 00:01:52.56 Jill
2018-10-01 00:04:18.857741600 2018-10-01 00:04:19.100 Jack
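The timestamp is the only thing the two DataFrames have in common that I can merge on. I have looked into conditional merges, but that does not seem to be the right route. Any help or advice would be appreciated!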
Answer 0 (score: 2)
It looks like you need pd.merge_asof here. Note, though, that the condition does not hold in the second case. You can also add a tolerance, as here:
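# merge_asof needs datetime (not string) keys, sorted in ascending order
df1['timestamp'] = pd.to_datetime(df1['timestamp'].str.strip())
df2['timestamp'] = pd.to_datetime(df2['timestamp'].str.strip())
pd.merge_asof(df1, df2,
              on='timestamp',
              direction='nearest',
              tolerance=pd.Timedelta('1min'))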
timestamp name
0 2018-10-01 00:01:49.800000000 John
1 2018-10-01 00:01:52.900000000 Jill
2 2018-10-01 00:04:18.857741600 Jack
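The result above keeps only the timestamp from df1, while the question asks for both timestamps side by side. One possible way to get that, as a rough sketch rather than a definitive solution, is to copy the df2 timestamp into an extra column before merging so that it survives as ordinary data. The helper names `parse` and `asof_keep_both` and the output column names `timestamp_df1` / `timestamp_df2` are assumptions made for this illustration. The same helper also shows `direction='backward'`, which only accepts df2 rows at or before the df1 timestamp, i.e. it enforces df1(t) >= df2(t).

```python
import pandas as pd

df1 = pd.DataFrame({'timestamp': ['2018-10-01 00:01:49.800000000',
                                  '2018-10-01 00:01:52.900000000',
                                  '2018-10-01 00:04:18.857741600']})
df2 = pd.DataFrame({'timestamp': ['2018-10-01 00:01:50.230',
                                  '2018-10-01 00:01:52.560',
                                  '2018-10-01 00:04:19.100'],
                    'name': ['John', 'Jill', 'Jack']})


def parse(col):
    # Hypothetical helper: strip stray whitespace, parse the strings,
    # and normalise to nanosecond resolution so both keys share a dtype.
    return pd.to_datetime(col.str.strip()).astype('datetime64[ns]')


def asof_keep_both(left, right, direction):
    # Hypothetical helper: as-of merge that keeps both timestamps.
    left = left.assign(timestamp=parse(left['timestamp'])).sort_values('timestamp')
    right = right.assign(timestamp=parse(right['timestamp'])).sort_values('timestamp')
    # Copy the right-hand key into a plain data column so it is carried over.
    right['timestamp_df2'] = right['timestamp']
    merged = pd.merge_asof(left, right,
                           on='timestamp',
                           direction=direction,
                           tolerance=pd.Timedelta('1min'))
    merged = merged.rename(columns={'timestamp': 'timestamp_df1'})
    return merged[['timestamp_df1', 'timestamp_df2', 'name']]


nearest = asof_keep_both(df1, df2, 'nearest')    # closest df2 row on either side
backward = asof_keep_both(df1, df2, 'backward')  # only df2 rows with t <= df1's t

print(nearest)
print(backward)
```

With this sample data, 'nearest' pairs every df1 row with John, Jill and Jack respectively, whereas 'backward' only finds a match for the second row (Jill): the first df1 timestamp has no earlier df2 row, and the closest earlier row for the third lies outside the one-minute tolerance.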