I have two data frames, df1 and df2:
> df1
ID Gender age cd evnt scr test_dt
1 C0004 MALE 22 1 1 82 7/3/2014
2 C0004 MALE 22 1 2 76 7/3/2014
3 C0005 MALE 22 1 3 1514 7/3/2014
4 C0005 MALE 23 2 1 81 11/3/2014
5 C0006 MALE 23 2 2 75 11/3/2014
6 C0006 MALE 23 2 3 878 11/3/2014
and
> df2
ID hgt wt phys_dt
1 C0004 70 147 6/29/2015
2 C0004 70 157 6/27/2016
3 C0005 67 175 6/27/2016
4 C0005 65 171 7/2/2014
5 C0006 69 160 6/29/2015
6 C0006 64 143 7/2/2014
I would like to join df1 and df2 in a way that produces the following data frame, which I will call df3:
> df3
ID Gender age cd evnt scr hgt wt
1 C0004 MALE 22 1 1 82 70 147
2 C0004 MALE 22 1 2 76 70 157
3 C0005 MALE 22 1 3 1514 67 175
4 C0005 MALE 23 2 1 81 65 171
5 C0006 MALE 23 2 2 75 69 160
6 C0006 MALE 23 2 3 878 64 143
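I am trying to add df2$hgt and df2$wt to the rows with the correct ID. The tricky part is that, for each ID, I want to attach the hgt and wt from the df2 row whose phys_dt is closest to df1$test_dt. I thought I could first sort both data frames by ID, then by their respective dates, and then try to join? I am not quite sure how to approach this. Thanks.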
Answer 0 (score: 0)
If you only want to match df1$ID and df2$ID, you should do the following:
df3 <- left_join(df1, df2, by = c("ID" = "ID"))
If both the date and the ID should match, you can try:
df3 <- left_join(df1, df2, by = c("ID" = "ID", "test_dt" = "phys_dt"))
left_join comes from the dplyr package, so load it first with library(dplyr).
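Note that an exact match on test_dt = phys_dt finds no partner rows in the sample data, because none of the test dates equals a physical date, so it does not pick the closest date as the question asks. Below is a minimal base R sketch of a nearest-date match. It assumes the dates are month/day/year strings as printed in the question, and nearest_join is a hypothetical helper name, not a function from dplyr or any other package. Ties are broken by taking the first candidate row in df2.

```r
# Sample data copied from the question (dates are month/day/year strings).
df1 <- data.frame(
  ID      = c("C0004", "C0004", "C0005", "C0005", "C0006", "C0006"),
  Gender  = "MALE",
  age     = c(22, 22, 22, 23, 23, 23),
  cd      = c(1, 1, 1, 2, 2, 2),
  evnt    = c(1, 2, 3, 1, 2, 3),
  scr     = c(82, 76, 1514, 81, 75, 878),
  test_dt = c("7/3/2014", "7/3/2014", "7/3/2014",
              "11/3/2014", "11/3/2014", "11/3/2014"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID      = c("C0004", "C0004", "C0005", "C0005", "C0006", "C0006"),
  hgt     = c(70, 70, 67, 65, 69, 64),
  wt      = c(147, 157, 175, 171, 160, 143),
  phys_dt = c("6/29/2015", "6/27/2016", "6/27/2016",
              "7/2/2014", "6/29/2015", "7/2/2014"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of any package): for each row of x, attach
# hgt and wt from the row of y with the same ID whose phys_dt is closest
# to that row's test_dt. The date format is an assumption.
nearest_join <- function(x, y, fmt = "%m/%d/%Y") {
  x_date <- as.Date(x$test_dt, format = fmt)
  y_date <- as.Date(y$phys_dt, format = fmt)

  # For every row of x, find the index of the best-matching row of y.
  pick <- vapply(seq_len(nrow(x)), function(i) {
    cand <- which(y$ID == x$ID[i])            # rows of y with the same ID
    if (length(cand) == 0L) return(NA_integer_)
    gap <- abs(as.numeric(y_date[cand] - x_date[i]))  # distance in days
    cand[which.min(gap)]                      # first row with the smallest gap
  }, integer(1))

  # Drop test_dt to mirror the column layout of the desired df3.
  out <- x[, setdiff(names(x), "test_dt"), drop = FALSE]
  out$hgt <- y$hgt[pick]
  out$wt  <- y$wt[pick]
  out
}

df3 <- nearest_join(df1, df2)
print(df3)
```

On the sample data this does not reproduce the df3 shown in the question row for row, because that table appears to pair the rows of df1 and df2 by position rather than by nearest date. For example, both C0004 rows have the test date 7/3/2014, so both receive the 6/29/2015 measurements (hgt 70, wt 147). The row-by-row loop is fine for small data; for large tables a rolling join would likely be faster.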