Merging in pandas based on the order in which specific values appear

Time: 2020-04-22 12:53:29

Tags: pandas

Suppose I have two DataFrames with the following values:

DF1  Name        Time-In
      Person1     2020-04-21 20:32:44
      Person2     2020-04-21 20:37:19
      Person3     2020-04-21 20:44:04
      Person1     2020-04-21 21:17:22
      Person1     2020-04-21 23:00:00


DF2   Name        Time-Out
      Person1     2020-04-21 20:50:11
      Person2     2020-04-21 21:15:15
      Person1     2020-04-21 22:00:59

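I want to merge the tables based on the order in which each name appears (the first Time-In for Person1 in DF1 is matched with the first Time-Out for Person1 in DF2). Time-Out should be NaN for cases like Person3 (no record in DF2) and for cases where Person1 has extra rows in DF1. The final table would look like this: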

DF3   Name        Time-In                   Time-Out
      Person1     2020-04-21 20:32:44       2020-04-21 20:50:11
      Person2     2020-04-21 20:37:19       2020-04-21 21:15:15
      Person3     2020-04-21 20:44:04       NaN
      Person1     2020-04-21 21:17:22       2020-04-21 22:00:59
      Person1     2020-04-21 23:00:00       NaN

Any ideas on how to do this? Thanks in advance.

1 Answer:

Answer 0: (score: 0)

Use merge_asof with the direction='forward' parameter:

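import pandas as pd

# Convert both time columns to datetimes so they can be compared
df1['Time-In'] = pd.to_datetime(df1['Time-In'])
df2['Time-Out'] = pd.to_datetime(df2['Time-Out'])

# For each Time-In, take the first Time-Out at or after it for the same Name.
# Both frames must already be sorted by their time columns.
df = pd.merge_asof(df1,
                   df2,
                   left_on='Time-In',
                   right_on='Time-Out',
                   by='Name',
                   direction='forward')
print(df)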
      Name             Time-In            Time-Out
0  Person1 2020-04-21 20:32:44 2020-04-21 20:50:11
1  Person2 2020-04-21 20:37:19 2020-04-21 21:15:15
2  Person3 2020-04-21 20:44:04                 NaT
3  Person1 2020-04-21 21:17:22 2020-04-21 22:00:59
4  Person1 2020-04-21 23:00:00                 NaT
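
Note that merge_asof pairs rows by time, not by position: if two Time-In rows for the same person both precede a single Time-Out, both rows receive that same Time-Out. If the pairing should strictly follow order of appearance, as the question describes, one possible alternative (not part of the original answer) is to number each person's rows with groupby().cumcount() and do a left merge on that counter. The sketch below rebuilds the sample data from the question; the helper column name occurrence is hypothetical and chosen only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Name": ["Person1", "Person2", "Person3", "Person1", "Person1"],
    "Time-In": pd.to_datetime([
        "2020-04-21 20:32:44",
        "2020-04-21 20:37:19",
        "2020-04-21 20:44:04",
        "2020-04-21 21:17:22",
        "2020-04-21 23:00:00",
    ]),
})
df2 = pd.DataFrame({
    "Name": ["Person1", "Person2", "Person1"],
    "Time-Out": pd.to_datetime([
        "2020-04-21 20:50:11",
        "2020-04-21 21:15:15",
        "2020-04-21 22:00:59",
    ]),
})

# Number each person's rows in order of appearance (0, 1, 2, ...).
# "occurrence" is a hypothetical helper column used only for the merge.
df1_keyed = df1.assign(occurrence=df1.groupby("Name").cumcount())
df2_keyed = df2.assign(occurrence=df2.groupby("Name").cumcount())

# A left merge keeps every Time-In row in df1's order;
# rows with no matching Time-Out are filled with NaT.
df3 = (
    df1_keyed.merge(df2_keyed, on=["Name", "occurrence"], how="left")
    .drop(columns="occurrence")
)
print(df3)
```

On this sample data both approaches pair the same rows. They differ only when the time-based match and the positional match disagree, so which one fits depends on whether the records are guaranteed to alternate in and out for each person.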