R: Match two columns across different data frames and output multiple matches

Time: 2016-02-25 01:42:14

Tags: r dataframe

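I am trying to match the corresponding values of two columns in two different data frames. For each subc-year pair in df1 (e.g. 14X-1991), I want to search df2 and build a list/vector/etc. of all df2$pat.id values that match that combination (for the example above, US18 and US20).

As a sample: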

DF1:

pat.id subc year
US1    14X  1991
US3    15R  1992
US5    10R  1990

DF2:

pat.id subc year
US18   14X  1991
US20   14X  1991
US33   15R  1992
US34   15R  1992
US37   15R  1992
US50   10R  1990

Data:

df1 <- data.frame(cbind(c("US1", "US3", "US5"),
                        c("14X", "15R", "10R"),
                        c("1991", "1992", "1990")))
colnames(df1) <- c("pat.id", "subc", "year")

df2 <- data.frame(cbind(c("US18", "US20", "US33", "US34", "US37", "US50"),
                        c("14X", "14X", "15R", "15R", "15R", "10R"),
                        c("1991", "1991", "1992", "1992", "1992", "1990")))
colnames(df2) <- c("pat.id", "subc", "year")

Plugging in specific values works for me:

df2$pat.id[which(df2$year==1991 & df2$subc=="14X")]

Now I want to loop over all rows of df1.

Thanks!

1 Answer:

Answer 0 (score: 2)

As far as I can tell, this is just a merge operation:

vars <- c("subc","year")
merge(df1[vars], df2[c(vars,"pat.id")], by=vars)

#  subc year pat.id
#1  10R 1990   US50
#2  14X 1991   US18
#3  14X 1991   US20
#4  15R 1992   US33
#5  15R 1992   US34
#6  15R 1992   US37
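
The merged table above drops df1's own pat.id, while the question asks for a list of matches per df1 row. One way to get that, as a minimal sketch rather than part of the original answer, is to keep both ID columns in the merge and split the result. The names `m` and `matches` and the suffixes `.df1`/`.df2` are chosen here for illustration, and the data frames are rebuilt with `stringsAsFactors = FALSE` so the IDs stay plain character vectors:

```r
# Same sample data as in the question, built directly as character columns
df1 <- data.frame(pat.id = c("US1", "US3", "US5"),
                  subc   = c("14X", "15R", "10R"),
                  year   = c("1991", "1992", "1990"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(pat.id = c("US18", "US20", "US33", "US34", "US37", "US50"),
                  subc   = c("14X", "14X", "15R", "15R", "15R", "10R"),
                  year   = c("1991", "1991", "1992", "1992", "1992", "1990"),
                  stringsAsFactors = FALSE)

# Join on subc and year; the two pat.id columns get distinguishing suffixes
m <- merge(df1, df2, by = c("subc", "year"), suffixes = c(".df1", ".df2"))

# One character vector of matching df2 IDs per df1 ID
matches <- split(m$pat.id.df2, m$pat.id.df1)
matches[["US1"]]
```

Because merge performs an inner join by default, a df1 row with no matching subc-year pair in df2 would simply be absent from `matches`.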

If you only want one randomly chosen df2 row per subc-year group, pick it with sample before merging:

merge(
 df1[vars],
 aggregate(pat.id ~ ., data=df2[c("pat.id",vars)], FUN=sample, 1), by=vars
)
#  subc year pat.id
#1  14X 1991   US20
#2  15R 1992   US33
#3  10R 1990   US50
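
Since the pick is random, the pat.id shown for groups with several candidates can differ between runs. A small sketch of a reproducible variant follows; the seed value and the name `picked` are arbitrary choices for illustration. It also indexes with `sample.int` instead of calling `sample` on the values directly, which is a common precaution because `sample(x, 1)` treats a single positive number `x` as the range `1:x`; that does not affect the character IDs here, but would matter for numeric IDs.

```r
# Same df2 as in the question, with character columns
df2 <- data.frame(pat.id = c("US18", "US20", "US33", "US34", "US37", "US50"),
                  subc   = c("14X", "14X", "15R", "15R", "15R", "10R"),
                  year   = c("1991", "1991", "1992", "1992", "1992", "1990"),
                  stringsAsFactors = FALSE)
vars <- c("subc", "year")

set.seed(42)  # arbitrary seed, only so repeated runs select the same rows

# For each subc-year group, draw one position and return the ID at that position
picked <- aggregate(pat.id ~ ., data = df2[c("pat.id", vars)],
                    FUN = function(ids) ids[sample.int(length(ids), 1)])
picked
```

`picked` can then be merged with `df1[vars]` exactly as in the answer's second snippet.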