Melt a data.frame in R and store the unused columns in one column (comma-separated)

Time: 2018-09-25 13:07:54

Tags: r dataframe reshape2 dcast

Old question; for the new question, see below.

I have a data.frame:

df <- data.frame("name"  = c("A","A","B","C"),
                 "class" = c("ab","ab","cd","ef"),
                 "type"  = c("alpha","beta","gamma","delta"))

> df
  name class  type
1    A    ab alpha
2    A    ab  beta
3    B    cd gamma
4    C    ef delta

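So name A has the types alpha and beta, and both occur with class ab.

I want my data frame to look like this (the type column may end up holding a long comma-separated string):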

> df
  name class  type
1    A    ab alpha, beta
2    B    cd gamma
3    C    ef delta

What does not work is dcast(df, name~type).

Any suggestions?

New question

I want name to be the deciding selector. So A has type alpha in class ab, and types alpha and beta in class cd:

df<-data.frame("name"  = c("A","A","A","B","C"), 
               "class" = c("ab","cd","cd","cd","ef"),
               "type"  = c("alpha","alpha","beta","gamma","delta"))

> df
  name class  type
1    A    ab alpha
2    A    cd alpha
3    A    cd  beta
4    B    cd gamma
5    C    ef delta

Grouping by name and then calling dplyr::summarise(var = paste(type, collapse = ", ")) (see the solution below) returns

> df
  name var
1    A alpha, alpha, beta
2    B gamma
3    C delta

This produces a duplicate alpha in the first row. I am looking for a way to remove that duplicate. Goal:

> df
  name var
1    A alpha, beta
2    B gamma
3    C delta

Edit:

Solved by Gregor, see the comments.
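
The comment itself is not reproduced on this page. As a minimal sketch of one way to drop the repeated value (an assumption, not necessarily what the comment suggested), unique() can be applied to type before pasting. The variable name result is made up for illustration.

```r
library(dplyr)

# Data from the new question
df <- data.frame(
  name  = c("A", "A", "A", "B", "C"),
  class = c("ab", "cd", "cd", "cd", "ef"),
  type  = c("alpha", "alpha", "beta", "gamma", "delta")
)

# Group by name only, drop repeated types within each group,
# then collapse what is left into one comma-separated string.
result <- df %>%
  group_by(name) %>%
  summarise(var = paste(unique(type), collapse = ", "))

print(result)
```

Because unique() runs inside each group, a type that legitimately appears under two different names is still kept once per name.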

1 answer:

Answer 0 (score: 3)

Try this. We group by name and class, then collapse with a comma:

library(dplyr)

df %>%
  group_by(name, class) %>%
  summarise(type = paste(type, collapse = ","))
#> # A tibble: 3 x 3
#> # Groups:   name [?]
#>   name  class type      
#>   <fct> <fct> <chr>     
#> 1 A     ab    alpha,beta
#> 2 B     cd    gamma     
#> 3 C     ef    delta

Created on 2018-09-25 by the reprex package (v0.2.0).
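
For readers who would rather not load dplyr, the same grouping idea can probably be sketched in base R with aggregate() and toString(). This is an alternative and not part of the answer above. It assumes class is ab for both A rows, as in the printed table of the old question, and the name collapsed is hypothetical.

```r
# Data from the old question (class assumed to be "ab" for both A rows)
df <- data.frame(
  name  = c("A", "A", "B", "C"),
  class = c("ab", "ab", "cd", "ef"),
  type  = c("alpha", "beta", "gamma", "delta"),
  stringsAsFactors = FALSE
)

# One row per name/class pair; toString() joins the values with ", "
# and unique() guards against repeated types within a group.
collapsed <- aggregate(
  type ~ name + class,
  data = df,
  FUN  = function(x) toString(unique(x))
)

print(collapsed)
```

Using type ~ name as the formula instead would group by name alone, which is closer to what the new question asks for.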