I am trying to merge two data frames and append multiple matches horizontally:
dataset1:
id email
1 email1
1 email1b
2 email2
3 email3
dataset2:
id name
1 bob
2 rob
3 kat
I would like to combine these data frames on id using merge. Where there are duplicate matches, as with id 1, I would like both results returned horizontally:
id name email
1 bob email1 email1b
2 rob email2
3 kat email3
It seems merge cannot do this: it creates multiple rows for the duplicated id. Any other ideas?
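For reference, a minimal sketch of the behaviour described above, assuming the example data is read in with read.table (the name `merged` is only illustrative):

```r
# Rebuild the two example data frames from the question.
dataset1 <- read.table(header = TRUE, text = "
id email
1 email1
1 email1b
2 email2
3 email3")
dataset2 <- read.table(header = TRUE, text = "
id name
1 bob
2 rob
3 kat")

# A plain merge keeps one row per matching pair, so id 1 appears twice.
merged <- merge(dataset2, dataset1, by = "id")
nrow(merged)          # 4 rows rather than the desired 3
sum(merged$id == 1)   # 2 rows for id 1
```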
Answer 0: (score: 0)
You can pre-aggregate dataset1 before merging, like this:
dataset1 <- read.table(header = TRUE, text = "
id email
1 email1
1 email1b
2 email2
3 email3")
dataset2 <- read.table(header = TRUE, text = "
id name
1 bob
2 rob
3 kat")
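# Collapse all emails per id into one space-separated string (the result column is named x).
dataset1 <- with(dataset1, aggregate(x = email, by = list(id = id), FUN = paste, collapse = " "))
# Merge on id, then reorder the columns to id, name, x.
merge(x = dataset1, y = dataset2, by = "id")[, c(1, 3, 2)]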
# id name x
# 1 1 bob email1 email1b
# 2 2 rob email2
# 3 3 kat email3
Answer 1: (score: 0)
# The formula interface of aggregate keeps the column name email.
dataset1 <- aggregate(email ~ id, dataset1, paste, collapse = " ")
merge(dataset2, dataset1, by = "id")
# id name email
# 1 1 bob email1 email1b
# 2 2 rob email2
# 3 3 kat email3
If you want fast aggregation and merging on large data sets, here is a data.table approach:
library(data.table)
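# Collapse emails per id, then key the aggregated table on id.
setkey(dataset1 <- setDT(dataset1)[, list(email = paste(email, collapse = " ")), by = id], id)
# Key dataset2 on id as well, then join the two keyed tables.
setkey(setDT(dataset2), id)
dataset2[dataset1]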
## id name email
## 1: 1 bob email1 email1b
## 2: 2 rob email2
## 3: 3 kat email3
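All of the approaches above return the emails as one space-separated string. If separate columns are wanted instead, a rough base R sketch might look like the following. The names `emails`, `padded`, `wide`, `result` and the column names `email_1`, `email_2` are hypothetical choices for illustration, not something from the question.

```r
# Rebuild the two example data frames from the question.
dataset1 <- read.table(header = TRUE, text = "
id email
1 email1
1 email1b
2 email2
3 email3")
dataset2 <- read.table(header = TRUE, text = "
id name
1 bob
2 rob
3 kat")

# Group the emails by id; as.character guards against factor columns.
emails <- split(as.character(dataset1$email), dataset1$id)

# Pad every group with NA up to the length of the longest group.
n <- max(lengths(emails))
padded <- lapply(emails, function(e) c(e, rep(NA_character_, n - length(e))))

# Stack the padded groups into a matrix with one row per id.
wide <- do.call(rbind, padded)
colnames(wide) <- paste0("email_", seq_len(n))
wide <- data.frame(id = as.integer(rownames(wide)), wide, row.names = NULL)

# Merge as before; ids with fewer emails get NA in the extra columns.
result <- merge(dataset2, wide, by = "id")
result
```

This keeps one row per id, at the cost of NA padding for ids that have fewer emails than the maximum.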