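I have two different data frames, A and B.
Table A has the columns:
ID, Total Sum
Table B has multiple entries per ID, with various attributes:
ID, Attribute 1, Attribute 2, Attribute 3
I want to join A to B and return only one of the matching B rows next to the total column from table A, but when I do this I end up with duplicated ID values from table A.
I have tried every join on the dplyr cheat sheet, but I can't get it to work.
Reproducible example and desired output
Tables A and B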
TableA<-data.frame(ID=c("KM001","KM002","KM003"))
TableB<-data.frame(ID=c("KM001","KM002","KM003","KM002","KM003","KM002","KM003"),score=c("100", "20", "10", "20", "10", "20", "10"), tieColor=c("blue", "red", "blue", "orange", "purple", "black", "pink"),rainyDay=c("yes", "yes", "yes", "no", "no", "no", "no"))
Desired output
Desired<-data.frame(ID=c("KM001","KM002","KM003"),TotalScoreSum=c("100","60","30"),tieColor=c("blue", "red", "blue"),rainyDay=c("yes", "yes", "yes"))
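Think of the Excel equivalent: TotalScoreSum is a sum of score by ID across tables A and B, while for the other two attributes, "tieColor" and "rainyDay", a VLOOKUP retrieves only the first match in each column.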
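For reference, here is a minimal sketch of the duplication described above, assuming dplyr is installed. The tables are redefined so the snippet runs on its own, and the name `joined` is only illustrative. A plain `left_join` returns one row for every matching row in TableB, so any ID with several entries is repeated:

```r
library(dplyr)

# Same tables as in the example above, redefined so this runs on its own
TableA <- data.frame(ID = c("KM001", "KM002", "KM003"))
TableB <- data.frame(
  ID       = c("KM001", "KM002", "KM003", "KM002", "KM003", "KM002", "KM003"),
  score    = c("100", "20", "10", "20", "10", "20", "10"),
  tieColor = c("blue", "red", "blue", "orange", "purple", "black", "pink"),
  rainyDay = c("yes", "yes", "yes", "no", "no", "no", "no")
)

# One output row per matching row in TableB, not one per ID in TableA
joined <- left_join(TableA, TableB, by = "ID")

nrow(joined)      # 7 rows instead of the 3 wanted
table(joined$ID)  # KM001 appears once, KM002 and KM003 three times each
```

The other joins on the cheat sheet behave the same way here, because none of them collapse multiple matches into a single row.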
Answer 0: (score: 0)
Thanks. The problem turned out to be an aggregation problem. The solution from docendo discimus works fine.
TableB %>% group_by(ID) %>% summarise(score = sum(as.numeric(as.character(score))), tieColor = first(tieColor), rainyDay = first(rainyDay))
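If the result also needs to be attached to TableA, as the question asks, one possible sketch is to aggregate TableB first and join afterwards. This assumes dplyr is installed; the names `summarised` and `result` are illustrative and not from the original answer, and the sum column is named TotalScoreSum here only to match the desired output:

```r
library(dplyr)

# Same tables as in the question, redefined so this runs on its own
TableA <- data.frame(ID = c("KM001", "KM002", "KM003"))
TableB <- data.frame(
  ID       = c("KM001", "KM002", "KM003", "KM002", "KM003", "KM002", "KM003"),
  score    = c("100", "20", "10", "20", "10", "20", "10"),
  tieColor = c("blue", "red", "blue", "orange", "purple", "black", "pink"),
  rainyDay = c("yes", "yes", "yes", "no", "no", "no", "no")
)

# Collapse TableB to one row per ID: sum the scores, keep the first
# tieColor and rainyDay seen for each ID (the VLOOKUP-like behaviour)
summarised <- TableB %>%
  group_by(ID) %>%
  summarise(
    TotalScoreSum = sum(as.numeric(as.character(score))),
    tieColor      = first(tieColor),
    rainyDay      = first(rainyDay)
  )

# With one row per ID on the right-hand side, the join no longer duplicates IDs
result <- left_join(TableA, summarised, by = "ID")
result
```

Note that TotalScoreSum comes out numeric here, whereas the Desired data frame in the question stores it as character strings; wrap it in `as.character()` if an exact match matters.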