Insert rows into groups when joining

Asked: 2017-06-19 15:06:18

Tags: r dplyr tidyr

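I'm trying to join two data tables in R. I'm joining them by name, and I want to "insert" rows from one data table into the name groups of the other.

For example: data table A has "name" and "amount", and data table B has "name" and "address" (with more than one address per name). I want a data table that contains each name, its corresponding addresses, and a single "amount" for each group of names.

I tried using left_join in dplyr, but then the amount column is repeated on every "address" row.

Does anyone have any ideas? Thanks.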

Example image (joining tables 1 and 2 to create table 3):

Or even like this:

Edit: added a reproducible example showing what the two datasets look like and what the desired output is.

table_one <- data.frame(name=c("x","y","z"), amount=c("$100","200","300"))
table_two <- data.frame(name=c("x","x","y","z","z","z"), address=c("A","B","C","D","E","F"))

output <- data.frame(name=c("x","x","y","z","z","z"), 
                     address=c("A","B","C","D","E","F"), amount=c("$100","","$200","$300","",""))

3 Answers:

Answer 0 (score: 1)

Using dplyr:

library(dplyr)

left_join(table_two, table_one, by = 'name') %>% 
   mutate(amount = replace(amount, duplicated(name), NA))
#  name address amount
#1    x       A   $100
#2    x       B   <NA>
#3    y       C    200
#4    z       D    300
#5    z       E   <NA>
#6    z       F   <NA>
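
The desired output in the question uses empty strings rather than NA. A minimal sketch of one way to get that, assuming the amount column is stored as character (hence `stringsAsFactors = FALSE`, which is an assumption added here and not part of the original example): group by name and keep the amount only on the first row of each group.

```r
library(dplyr)

# Same data as in the question, but stored as character rather than factor
table_one <- data.frame(name = c("x", "y", "z"),
                        amount = c("$100", "200", "300"),
                        stringsAsFactors = FALSE)
table_two <- data.frame(name = c("x", "x", "y", "z", "z", "z"),
                        address = c("A", "B", "C", "D", "E", "F"),
                        stringsAsFactors = FALSE)

result <- table_two %>%
  left_join(table_one, by = "name") %>%
  group_by(name) %>%
  # Keep amount on the first row of each name group, blank out the rest
  mutate(amount = if_else(row_number() == 1L, amount, "")) %>%
  ungroup()

result
```

Because the blanking happens per group, this does not depend on rows with the same name being adjacent.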

Answer 1 (score: 0)

Here you go.

table_one <- data.frame(name=c("x","y","z"), amount=c("$100","$200","$300"))
table_two <- data.frame(name=c("x","x","y","z","z","z"), address=c("A","B","C","D","E","F"))

output <- data.frame(name=c("x","x","y","z","z","z"), 
                     address=c("A","B","C","D","E","F"), amount=c("$100","","$200","$300","",""))


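# Join the two tables on name (merge sorts by the key by default)
test <- merge(table_one, table_two, by = 'name')
# Convert to character in case amount was created as a factor
test$amount <- as.character(test$amount)
# Blank out amount on every repeated (name, amount) row, keeping the first
test$amount[duplicated(test[, c("name", "amount")])] <- ""
test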

Answer 2 (score: 0)

We can do this with match:
i1 <- with(table_one, match(name, table_two$name))
table_two$amount <- ""
table_two$amount[i1] <- as.character(table_one$amount)
table_two
#   name address amount
#1    x       A   $100
#2    x       B       
#3    y       C    200
#4    z       D    300
#5    z       E       
#6    z       F
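
This works because `match` returns the position of the first occurrence of each name, so each amount lands on the first row of its group. One caveat: if `table_one` contains a name that never appears in `table_two`, `match` returns NA for it, and assigning through an NA index fails. A hedged sketch of a guarded version follows; the extra name "w" and its amount are hypothetical and added only to illustrate the case.

```r
# "w" is a hypothetical name with no address, added for illustration
table_one <- data.frame(name = c("x", "y", "z", "w"),
                        amount = c("$100", "200", "300", "400"),
                        stringsAsFactors = FALSE)
table_two <- data.frame(name = c("x", "x", "y", "z", "z", "z"),
                        address = c("A", "B", "C", "D", "E", "F"),
                        stringsAsFactors = FALSE)

# Position of the first row in table_two for each name in table_one
i1 <- match(table_one$name, table_two$name)
# Only names that were actually found can be assigned
found <- !is.na(i1)

table_two$amount <- ""
table_two$amount[i1[found]] <- table_one$amount[found]
table_two
```

Names without a match are simply skipped, so "w" does not appear in the result.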