R: Join/merge a table with two lookup tables based on a condition

Asked: 2017-01-25 20:19:23

Tags: r join merge conditional

I have a base dataset of people:

everyoneexample <- data.frame(
gender=c("Female", "Male", "Male", "Female"), age=c(18, 18, 20, 21))

> everyoneexample
  gender age
1 Female  18
2   Male  18
3   Male  20
4 Female  21

and two lookup tables:

scorefemale <- data.frame(age=c(18, 19, 20, 21, 22, 23), 
  score=c(1.1, 3.3, 5.5, 7.7, 9.9, 11.1))

> scorefemale
  age score
1  18   1.1
2  19   3.3
3  20   5.5
4  21   7.7
5  22   9.9
6  23  11.1

scoremale <- data.frame(age=c(18, 19, 20, 21, 22, 23), 
   score=c(2.2, 4.4, 6.6, 8.8, 10.1, 12.1))

> scoremale
  age score
1  18   2.2
2  19   4.4
3  20   6.6
4  21   8.8
5  22  10.1
6  23  12.1

I basically want to end up with this:

    gender  age score
1   Female  18  1.1
2   Male    18  2.2
3   Male    20  6.6
4   Female  21  7.7

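Everything I have found on conditional joins/merges assumes one main table and one reference table, but my problem needs two reference tables.

Hopefully the example is clear, but if you would like me to clarify anything, please feel free to ask.

UPDATE: Thanks to Gregor, the most elegant answer seems to be simply building a temporary table by rbind-ing the two reference tables (after adding a gender column to each), then merging on two "by" variables: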

everyoneexample <- merge(scores_FandM, everyoneexample, by=c("age", "gender"))

2 answers:

Answer 0: (score: 1)

# Rows for each gender, matched against that gender's lookup table
female_rows <- which(everyoneexample$gender == 'Female')
female_matches <- merge(everyoneexample[female_rows, ], scorefemale, by = 'age')

male_rows <- which(everyoneexample$gender == 'Male')
male_matches <- merge(everyoneexample[male_rows, ], scoremale, by = 'age')

# Write the matched scores back into the main table
everyoneexample$score <- NA
everyoneexample[female_rows, 'score'] <- female_matches$score
everyoneexample[male_rows, 'score'] <- male_matches$score
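
One caveat with the approach above: merge() sorts its result by the `by` column by default, so assigning the matched scores back by position only lines up when the rows within each gender are already in ascending age order and every age finds exactly one match. A minimal sketch of an order-preserving variant, assuming the same example tables from the question, looks each age up directly with match(). The helper name lookup_score is hypothetical, not something from the original answer:

```r
# Example data from the question
everyoneexample <- data.frame(
  gender = c("Female", "Male", "Male", "Female"),
  age    = c(18, 18, 20, 21),
  stringsAsFactors = FALSE
)
scorefemale <- data.frame(age   = c(18, 19, 20, 21, 22, 23),
                          score = c(1.1, 3.3, 5.5, 7.7, 9.9, 11.1))
scoremale   <- data.frame(age   = c(18, 19, 20, 21, 22, 23),
                          score = c(2.2, 4.4, 6.6, 8.8, 10.1, 12.1))

# Hypothetical helper: score for each age, NA where the age is not in the lookup
lookup_score <- function(ages, lookup) {
  lookup$score[match(ages, lookup$age)]
}

is_female <- everyoneexample$gender == "Female"
is_male   <- everyoneexample$gender == "Male"

# Fill each gender's rows from its own lookup table, keeping row order intact
everyoneexample$score <- NA_real_
everyoneexample$score[is_female] <-
  lookup_score(everyoneexample$age[is_female], scorefemale)
everyoneexample$score[is_male] <-
  lookup_score(everyoneexample$age[is_male], scoremale)

everyoneexample
```

Because match() returns one position per input age, the scores come back in the same order as the rows they belong to, and an age missing from the lookup table simply yields NA instead of shifting the remaining values.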

Answer 1: (score: 0)

Thanks to @Gregor, who suggested adding a gender column to each lookup table:

> scorefemale$gender <- "Female"
> scoremale$gender <- "Male"

then combining the tables into one big lookup table:

> scores_FandM <- rbind(scorefemale, scoremale)

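and finally left-joining the main table to the lookup table on two "by" variables, age and gender, which effectively form a composite key in the new combined lookup table: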

> everyoneexample <- 
      merge(everyoneexample, scores_FandM, by=c('age', 'gender'), all.x = TRUE)

Simple and elegant... thanks!
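
For reference, the steps in this answer can be put together as one self-contained sketch. It assumes the example tables from the question. The row_id column is a hypothetical addition, used only because merge() sorts its output by the key columns and moves them to the front, so the original row order and column layout have to be restored afterwards:

```r
# Example data from the question
everyoneexample <- data.frame(
  gender = c("Female", "Male", "Male", "Female"),
  age    = c(18, 18, 20, 21),
  stringsAsFactors = FALSE
)
scorefemale <- data.frame(age   = c(18, 19, 20, 21, 22, 23),
                          score = c(1.1, 3.3, 5.5, 7.7, 9.9, 11.1))
scoremale   <- data.frame(age   = c(18, 19, 20, 21, 22, 23),
                          score = c(2.2, 4.4, 6.6, 8.8, 10.1, 12.1))

# Tag each lookup table with the gender it applies to, then stack them
scorefemale$gender <- "Female"
scoremale$gender   <- "Male"
scores_FandM <- rbind(scorefemale, scoremale)

# Hypothetical helper column: remember the original row order before merging
everyoneexample$row_id <- seq_len(nrow(everyoneexample))

# Left join on the composite key (age, gender)
result <- merge(everyoneexample, scores_FandM,
                by = c("age", "gender"), all.x = TRUE)

# Restore the original row order and the gender/age/score column layout
result <- result[order(result$row_id), c("gender", "age", "score")]
rownames(result) <- NULL
result
```

With all.x = TRUE, a person whose age and gender combination is missing from the combined lookup table is kept with a score of NA rather than being dropped.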