R function to merge rows and introduce a new merged variable

Date: 2015-01-28 05:48:57

Tags: r

I have a dataset like this:

ID        Brand
---      --------
1         Cokacola
2         Pepsi
3         merge with 1
4         merge with 2
5         merge with 1
6         Fanta

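I want to write an R function that merges the "merge with N" rows into the row they refer to and adds a new variable listing the IDs that were combined, like this: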

ID          Brand       merge
----       --------   --------
1          Cokacola      1,3,5  
2          Pepsi         2,4   
6          Fanta         6

2 Answers:

Answer 0: (score: 2)

Your data:

dat <- data.frame(
    id = 1:6,
    brand = c('Cokacola', 'Pepsi', 'merge with 1', 'merge with 2', 'merge with 1', 'Fanta'))

Not very elegant, but functional code:

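# Flag the rows that only point at another row
repeats <- grepl('^merge with', dat$brand)
# Group key: the referenced id for flagged rows, the row's own id otherwise
groups <- ifelse(repeats, sub('^merge with ', '', dat$brand), dat$id)
# Collapse the ids of each group into one comma-separated string
merged <- sapply(unique(groups), function(x) paste(dat$id[groups == x], collapse = ','))
dat <- dat[!repeats, ]
# Look up by id so the result does not depend on row order
dat$merge <- unname(merged[as.character(dat$id)])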
dat
##   id    brand merge
## 1  1 Cokacola 1,3,5
## 2  2    Pepsi   2,4
## 6  6    Fanta     6

Depending on how consistent your data is and how it is structured, there are many ways to make this more elegant.
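
Since the question asks for a function, the same idea can be wrapped up as one. This is only a minimal base R sketch. The names merge_rows, is_ref and target are illustrative and not part of the original answer, and it assumes every "merge with N" row refers to an id that exists as a regular row:

```r
dat <- data.frame(
  id = 1:6,
  brand = c("Cokacola", "Pepsi", "merge with 1", "merge with 2",
            "merge with 1", "Fanta"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assumes columns named 'id' and 'brand'
merge_rows <- function(d) {
  # Rows whose brand is a reference of the form "merge with N"
  is_ref <- grepl("^merge with ", d$brand)
  # Each row's target id: the referenced id for references, its own id otherwise
  target <- d$id
  target[is_ref] <- as.integer(sub("^merge with ", "", d$brand[is_ref]))
  # Collapse all ids that share a target into one comma-separated string
  merged <- tapply(d$id, target, paste, collapse = ",")
  # Keep only the regular rows and attach the merged ids by name lookup
  out <- d[!is_ref, , drop = FALSE]
  out$merge <- as.vector(merged[as.character(out$id)])
  rownames(out) <- NULL
  out
}

result <- merge_rows(dat)
print(result)
```

Looking the merged strings up by id, rather than relying on position, keeps the function correct even when a "merge with N" row appears before the row it refers to.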

Answer 1: (score: 1)

You could try:

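library(reshape2)
# df is assumed to hold the question's data, with columns ID and Brand
indx <- !grepl('merge', df$Brand)  # TRUE for the rows that are kept
df1 <- df[indx,]
# Target ID that each "merge with N" row points to
val <- as.numeric(sub('[^0-9]+', '', df[!indx, 'Brand']))
# Collect the IDs of the merged rows per target (IDs rather than row positions)
ml <- melt(tapply(df$ID[!indx], val, FUN=toString))
df2 <- merge(df1, ml, by.x='ID', by.y='Var1', all=TRUE)
df2$merge <- with(df2, ifelse(!is.na(value),
               paste(ID, value, sep=', '), ID))
df2[-3]  # drop the helper column 'value'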
#   ID    Brand   merge
#1  1 Cokacola 1, 3, 5
#2  2    Pepsi    2, 4
#3  6    Fanta       6