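Find rows duplicated in certain columns of a data frame, then merge the elements in the other columns

Time: 2019-03-19 13:02:49

Tags: r dataframe

I have a data frame, and I want to find the rows where columns A and B are both duplicated, then merge those rows by combining their elements in column C.

My example: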

 DF = cbind.data.frame(A = c(1, 1, 2, 3, 3), 
                       B = c("a", "b", "a", "c", "c"), 
                       C = c("M", "N", "X", "M", "N"))

My expected result:

 DFE = cbind.data.frame(A = c(1, 1, 2, 3), 
                        B = c("a", "b", "a", "c"), 
                        C = c("M", "N", "X", "M; N"))

Thanks a lot

2 answers:

Answer 0: (score: 3)

Without any packages (base R):

DF <- aggregate(C ~ A + B, FUN = function(x) paste(x, collapse = "; "), data = DF)

Output:

  A B    C
1 1 a    M
2 2 a    X
3 1 b    N
4 3 c M; N
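
Note that the aggregate output above is ordered by B first, so it does not match the row order of the expected result. The following is a minimal base R sketch, assuming you also want to see which rows share an (A, B) pair and want the result sorted by A and then B. The names `key`, `is_dup`, and `res` are only illustrative, and `stringsAsFactors = FALSE` is my addition to keep B and C as character.

```r
# Example data from the question; stringsAsFactors = FALSE keeps B and C as character
DF <- data.frame(A = c(1, 1, 2, 3, 3),
                 B = c("a", "b", "a", "c", "c"),
                 C = c("M", "N", "X", "M", "N"),
                 stringsAsFactors = FALSE)

# Flag every row whose (A, B) pair occurs more than once
key <- DF[c("A", "B")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Collapse C within each (A, B) group, as in the answer above
res <- aggregate(C ~ A + B, data = DF,
                 FUN = function(x) paste(x, collapse = "; "))

# Sort by A, then B, and reset the row names
res <- res[order(res$A, res$B), ]
rownames(res) <- NULL
```

Here `is_dup` marks only the last two rows of DF, and `res` holds the four merged rows in the same order as DFE.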

Or using data.table:

library(data.table)
setDT(DF)[, .(C = paste(C, collapse = "; ")), by = .(A, B)]
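
One thing to be aware of: setDT converts DF to a data.table in place, by reference. If you would rather leave the original data frame untouched, a possible sketch is to work on a copy made with as.data.table instead. The names `DT` and `res` are only illustrative.

```r
library(data.table)

# Example data from the question
DF <- data.frame(A = c(1, 1, 2, 3, 3),
                 B = c("a", "b", "a", "c", "c"),
                 C = c("M", "N", "X", "M", "N"),
                 stringsAsFactors = FALSE)

# as.data.table() returns a copy, so DF itself stays a plain data.frame
DT <- as.data.table(DF)

# Collapse C within each (A, B) group; groups keep their order of first appearance
res <- DT[, .(C = paste(C, collapse = "; ")), by = .(A, B)]
```

Because data.table keeps groups in order of first appearance, the rows of `res` come out in the same order as the expected result in the question.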

Answer 1: (score: 2)

Here is a tidyverse-based solution: group by A and B, then collapse the values of C with paste.

library(dplyr)
DF = cbind.data.frame(A = c(1, 1, 2, 3, 3), 
                      B = c("a", "b", "a", "c", "c"), 
                      C = c("M", "N", "X", "M", "N"))


DFE = cbind.data.frame(A = c(1, 1, 2, 3), 
                       B = c("a", "b", "a", "c"), 
                       C = c("M", "N", "X", "M; N"))


DF %>% 
  group_by(A,B) %>% 
  summarise(C = paste(C, collapse = "; "))
#> # A tibble: 4 x 3
#> # Groups:   A [3]
#>       A B     C    
#>   <dbl> <fct> <chr>
#> 1     1 a     M    
#> 2     1 b     N    
#> 3     2 a     X    
#> 4     3 c     M; N

Created on 2019-03-19 by the reprex package (v0.2.1)
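
The printed tibble above still shows `Groups: A [3]`, because summarise only drops the last grouping variable. If you need a plain, ungrouped result for later steps, a minimal sketch, assuming the same data as in the question, is to add ungroup() at the end. The name `res` is only illustrative.

```r
library(dplyr)

# Example data from the question
DF <- data.frame(A = c(1, 1, 2, 3, 3),
                 B = c("a", "b", "a", "c", "c"),
                 C = c("M", "N", "X", "M", "N"),
                 stringsAsFactors = FALSE)

res <- DF %>%
  group_by(A, B) %>%
  summarise(C = paste(C, collapse = "; ")) %>%  # one row per (A, B) pair
  ungroup()                                      # drop the remaining grouping by A
```

Without ungroup(), later verbs such as mutate or slice would still operate within each value of A, which is easy to overlook.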