Match two columns to get a third, longer one

Time: 2016-08-14 12:56:30

Tags: r match

I have two data.frames:

df1 <- data.frame(ID = c(1,2,3,4), Birth.date = c("2015-09-16","2015-09-17","2015-09-18","2015-09-19"))

df2 <- data.frame(ID = c(1,1,2,2,3,3,4,4), value = c("a","b","c","d","e","a","b","c"))

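The goal is to add a Birth.date column to df2, so that each row shows the birth date that df1 lists for that row's ID. The result should look like this: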

Goal <- data.frame(ID = c(1,1,2,2,3,3,4,4), value = c("a","b","c","d","e","a","b","c"), Birth.date = c("2015-09-16","2015-09-16","2015-09-17","2015-09-17","2015-09-18","2015-09-18","2015-09-19","2015-09-19"))

I tried using match(), but it gives this:

df2$Birth.Date <- df1[match(df1$ID, df2$ID),2]

df2

  ID value Birth.Date
1  1     a 2015-09-16
2  1     b 2015-09-18
3  2     c       <NA>
4  2     d       <NA>
5  3     e 2015-09-16
6  3     a 2015-09-18
7  4     b       <NA>
8  4     c       <NA>

I have been trying to solve this for a while, to no avail. Any help?

2 answers:

Answer 0: (score: 1)

We can use left_join from dplyr:

library(dplyr)
left_join(df2, df1, by = "ID")
#  ID value Birth.date
#1  1     a 2015-09-16
#2  1     b 2015-09-16
#3  2     c 2015-09-17
#4  2     d 2015-09-17
#5  3     e 2015-09-18
#6  3     a 2015-09-18
#7  4     b 2015-09-19
#8  4     c 2015-09-19

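If we use match, the correct call passes the 'ID' from 'df2' as x and the 'ID' from 'df1' as table, so that it returns one index into 'df1' for every row of 'df2':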

df2$Birth.date <- df1$Birth.date[match(df2$ID, df1$ID)]
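
To see why the argument order matters, here is a minimal sketch in base R that recreates the two data frames from the question and compares the index vectors from both call orders. match(x, table) returns, for each element of x, the position of its first match in table, so the result always has the length of x. The names idx_right and idx_wrong are only for illustration.

```r
df1 <- data.frame(ID = c(1,2,3,4),
                  Birth.date = c("2015-09-16","2015-09-17","2015-09-18","2015-09-19"))
df2 <- data.frame(ID = c(1,1,2,2,3,3,4,4),
                  value = c("a","b","c","d","e","a","b","c"))

# One index per row of df2, each pointing at the matching row of df1
idx_right <- match(df2$ID, df1$ID)   # 1 1 2 2 3 3 4 4

# Reversed order: one index per row of df1, each pointing at the
# first matching row of df2
idx_wrong <- match(df1$ID, df2$ID)   # 1 3 5 7

# Use the df2-length index vector to look up the dates in df1
df2$Birth.date <- df1$Birth.date[idx_right]
```

The reversed call yields only four indices, and two of them (5 and 7) point at rows that do not exist in df1. That is where the NA values in the question come from, and the four results are then recycled across the eight rows of df2.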

Answer 1: (score: 1)

Using merge from base R:

> merge(df2, df1, by = 'ID')

  ID value Birth.date
1  1     a 2015-09-16
2  1     b 2015-09-16
3  2     c 2015-09-17
4  2     d 2015-09-17
5  3     e 2015-09-18
6  3     a 2015-09-18
7  4     b 2015-09-19
8  4     c 2015-09-19
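
One difference from left_join is worth keeping in mind: merge() performs an inner join by default, so rows whose ID has no match in df1 are dropped, and it sorts the result by the key. With the data in the question every ID matches, so this does not show. The sketch below uses a hypothetical df2_extra (not part of the question) containing an ID of 5 that is absent from df1.

```r
df1 <- data.frame(ID = c(1,2,3,4),
                  Birth.date = c("2015-09-16","2015-09-17","2015-09-18","2015-09-19"))

# Hypothetical data for illustration: ID 5 has no entry in df1
df2_extra <- data.frame(ID = c(1, 2, 5), value = c("a", "c", "x"))

# Default (inner join): the row with ID 5 is dropped
inner <- merge(df2_extra, df1, by = "ID")

# all.x = TRUE keeps every row of the first data frame,
# filling Birth.date with NA where df1 has no match
left <- merge(df2_extra, df1, by = "ID", all.x = TRUE)
```

If unmatched rows should be kept, all.x = TRUE gives the same rows as left_join(df2_extra, df1, by = "ID").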