Question

I want to concatenate a column of strings (separated by ",") when grouped by another column. Example of the raw data:

Column1 Column2
1       a
1       b
1       c
1       d
2       e
2       f
2       g
2       h
3       i
3       j
3       k
3       l

Results Needed:
Column1   Grouped_Value
1         "a,b,c,d"
2         "e,f,g,h"
3         "i,j,k,l"

I tried using dplyr, but I seem to be getting the following result instead:

Column1   Grouped_Value
1         "a,b,c,d,e,f,g,h,i,j,k,l"
2         "a,b,c,d,e,f,g,h,i,j,k,l"
3         "a,b,c,d,e,f,g,h,i,j,k,l"

summ_data <- 
  df_columns %>%
  group_by(df_columns$Column1) %>%
  summarise(Grouped_Value = paste(df_columns$Column2, collapse =","))

Answer 1

We can use aggregate:

aggregate(Column2 ~ Column1, df1, toString)

Or with dplyr:

library(dplyr)
df1 %>%
   group_by(Column1) %>%
   summarise(Grouped_value =toString(Column2))
# A tibble: 3 x 2
#  Column1 Grouped_value
#    <int> <chr>        
#1       1 a, b, c, d   
#2       2 e, f, g, h   
#3       3 i, j, k, l

Note: toString is a wrapper for paste(., collapse=', ')
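Because toString puts a space after each comma, its output ("a, b, c, d") differs slightly from the format requested in the question ("a,b,c,d"). A minimal base R sketch of one way to get the exact separator, assuming the same data as df1 (rebuilt here so the snippet runs on its own):

```r
# Same data as df1 in this answer, rebuilt so the snippet is self-contained
df1 <- data.frame(Column1 = rep(1:3, each = 4),
                  Column2 = letters[1:12],
                  stringsAsFactors = FALSE)

# toString() is equivalent to paste(..., collapse = ", ")
same <- identical(toString(c("a", "b", "c")),
                  paste(c("a", "b", "c"), collapse = ", "))

# Extra arguments to aggregate() are passed on to FUN,
# so collapse = "," gives a comma with no trailing space
res <- aggregate(Column2 ~ Column1, data = df1, FUN = paste, collapse = ",")
print(res)
```

The name res is only illustrative; the grouped strings end up in the Column2 column of the result.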

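The problem in the OP's solution is that it pastes the whole column (df1$Column2 or df1[['Column2']] breaks the grouping and selects the whole column) rather than the elements within each group.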
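One way to see this is to compare vector lengths inside summarise. This is a sketch that assumes dplyr is installed; the column names n_bare and n_dollar are made up for illustration:

```r
library(dplyr)

# Same data as df1 in this answer, rebuilt so the snippet is self-contained
df1 <- data.frame(Column1 = rep(1:3, each = 4),
                  Column2 = letters[1:12],
                  stringsAsFactors = FALSE)

lens <- df1 %>%
  group_by(Column1) %>%
  summarise(
    n_bare   = length(Column2),      # the bare name sees only the current group's rows
    n_dollar = length(df1$Column2)   # `$` goes back to the full data frame, ignoring groups
  )
print(lens)
```

With this data, n_bare should be 4 for every group and n_dollar should be 12, which is why every row of the OP's result contains all twelve letters.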

Data

df1 <- structure(list(Column1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L), Column2 = c("a", "b", "c", "d", "e", "f", "g", "h", 
"i", "j", "k", "l")), class = "data.frame", row.names = c(NA, 
-12L))

Answer 2

The first commandment of dplyr:

Do not use dollar signs in dplyr commands!

Use

group_by(Column1)

and

summarise(Grouped_Value = paste(Column2, collapse =","))

to concatenate by group.
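Put together, the corrected pipeline from the question might look like the sketch below. It assumes dplyr is installed and that df_columns has the layout shown in the question (rebuilt here so the snippet runs on its own):

```r
library(dplyr)

# Stand-in for the question's df_columns (assumed layout)
df_columns <- data.frame(Column1 = rep(1:3, each = 4),
                         Column2 = letters[1:12],
                         stringsAsFactors = FALSE)

summ_data <- df_columns %>%
  group_by(Column1) %>%                                        # bare column name, no `$`
  summarise(Grouped_Value = paste(Column2, collapse = ","))    # pasted within each group
print(summ_data)
```

Using paste with collapse = "," rather than toString keeps the separator exactly as requested in the question, with no space after the comma.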
