Create a new column that takes a column name based on value

Time: 2020-08-04 14:43:16

Tags: r dplyr

I have a table of binary values that I want to summarize into a single column. Here is an example:

df <- data.frame("User" = c("User A", "User B", "User C"), "quality 1" = c(0,0,1), "quality 2" = c(1,0,0), "quality 3" = c(0,1,0))

I would like to run a function that produces a data frame like this:

summary <- data.frame("User" = c("User A", "User B", "User C"), "qualityNumber" = c("quality.2", "quality.3", "quality.1") )

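For each row, the new variable ("qualityNumber") should be assigned the name of the column in the original df that contains a 1.

I tried using dplyr and which(), but I couldn't figure it out.

My attempt: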

summary = df %>%
  mutate(
    qualityNumber =
      colnames(df[which(2:4 == 1)])
  )

2 answers:

Answer 0: (score: 1)

You can try this, which adds the result to the same df:

df$qualityNumber <- apply(df[,-1],1,function(x) names(x)[which(x==1)])

    User quality.1 quality.2 quality.3 qualityNumber
1 User A         0         1         0     quality.2
2 User B         0         0         1     quality.3
3 User C         1         0         0     quality.1

Or select just the columns you need after the assignment:

df[,c(1,5)]

    User qualityNumber
1 User A     quality.2
2 User B     quality.3
3 User C     quality.1
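One caveat: the apply() call above yields a plain character vector only when every row contains exactly one 1. If a row has no 1, or more than one, apply() returns a list instead. Here is a minimal sketch of one way to guard against that, assuming you want NA for rows with no match and a comma-separated string for rows with several. The helper name flag_names is hypothetical and not part of the original answer.

```r
# Example data from the question; check.names turns "quality 1" into "quality.1"
df <- data.frame("User" = c("User A", "User B", "User C"),
                 "quality 1" = c(0, 0, 1),
                 "quality 2" = c(1, 0, 0),
                 "quality 3" = c(0, 1, 0))

# Hypothetical helper: always returns exactly one string per row
flag_names <- function(x) {
  hits <- names(x)[which(x == 1)]
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = ",")
}

# Because the helper always returns length 1, apply() gives a character vector
df$qualityNumber <- apply(df[, -1], 1, flag_names)
```

On the example data this gives the same qualityNumber column as above, since each row has exactly one 1.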

Answer 1: (score: 0)

Another approach:

df$qualityNumber <- names(df)[max.col(df == 1, ties.method = "first")]

Result:

> df
    User quality.1 quality.2 quality.3 qualityNumber
1 User A         0         1         0     quality.2
2 User B         0         0         1     quality.3
3 User C         1         0         0     quality.1
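Note that max.col() always returns some column index, so a row with no 1 at all would silently be labelled with the first column ("User" in the call above). The following is a sketch of one way to handle that, assuming such rows should get NA. The extra "User D" row and the variable names flag_cols, is_one and idx are hypothetical additions for illustration.

```r
# Data from the question plus a hypothetical "User D" row with no 1 at all
df <- data.frame("User" = c("User A", "User B", "User C", "User D"),
                 "quality 1" = c(0, 0, 1, 0),
                 "quality 2" = c(1, 0, 0, 0),
                 "quality 3" = c(0, 1, 0, 0))

flag_cols <- names(df)[-1]            # only the quality.* columns
is_one <- df[flag_cols] == 1          # logical matrix, one column per flag
idx <- max.col(is_one, ties.method = "first")

# max.col() still returns column 1 for an all-FALSE row, so mask those rows
df$qualityNumber <- ifelse(rowSums(is_one) > 0, flag_cols[idx], NA_character_)
```

Restricting the comparison to the flag columns also keeps the "User" column out of the lookup, so the index from max.col() lines up with flag_cols rather than with names(df).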