Use column names as values when a condition is met in R

Time: 2018-04-19 19:44:08

Tags: r

Suppose I have a dataset with four observations, like this:

data <- data.frame(obs = c("a", "b", "c", "d"), 
                   test1 = c(5, 3, 2, 99), 
                   test2 = c(2, 3, 99, 2),
                   test3 = c(4, 2, 5, 99))

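Is there a way to create a new column named 'miss' that lists, for each row, the names of the columns whose value is 99? An example of the result I want: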

data_result <- data.frame(obs = c("a", "b", "c", "d"), 
                          test1 = c(5, 3, 2, 99), 
                          test2 = c(2, 3, 99, 2),
                          test3 = c(4, 2, 5, 99), 
                          miss = c(NA, NA, "test2", "test1, test3" ))
> data_result
  obs test1 test2 test3         miss
1   a     5     2     4         <NA>
2   b     3     3     2         <NA>
3   c     2    99     5        test2
4   d    99     2    99 test1, test3

2 answers:

Answer 0 (score: 2)

Using base R functions:

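# Split into one-row data frames by obs, then collect the names of the columns equal to 99.
# Relies on obs being unique and already sorted, because split() orders the groups by obs.
data$miss <- sapply(split(data, data$obs), function(x) {
  hit <- paste0(names(x)[x == 99], collapse = ",")
  # Rows without any 99 produce an empty string; turn that into NA
  if (hit == "") NA_character_ else hit
})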
> data
  obs test1 test2 test3        miss
1   a     5     2     4        <NA>
2   b     3     3     2        <NA>
3   c     2    99     5       test2
4   d    99     2    99 test1,test3
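
The split() approach above depends on obs being unique and sorted. As a rough alternative sketch (not part of the original answer), the same column can be built row by row with apply(), which keeps the original row order. The helper name test_cols and the choice to treat every column except obs as a test column are assumptions made for illustration.

```r
data <- data.frame(obs = c("a", "b", "c", "d"),
                   test1 = c(5, 3, 2, 99),
                   test2 = c(2, 3, 99, 2),
                   test3 = c(4, 2, 5, 99))

# Assumption: every column except the identifier `obs` should be checked
test_cols <- setdiff(names(data), "obs")

# Logical matrix with TRUE wherever a cell equals 99
hits <- data[test_cols] == 99

# For each row, join the names of the matching columns, or return NA if none match
data$miss <- apply(hits, 1, function(row_hits) {
  cols <- test_cols[which(row_hits)]  # which() also skips NA cells
  if (length(cols) == 0) NA_character_ else paste(cols, collapse = ", ")
})

print(data)
```

Here the separator is ", " to match the format requested in the question; change the collapse argument if a different separator is preferred.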

Answer 1 (score: 1)

With data.table (rows without a 99 get an empty string rather than NA):

  library(data.table)
  setDT(data)[,missing:=paste(names(data)[-1][.SD==99],collapse=","),by=obs][]
    obs test1 test2 test3     missing
1:   a     5     2     4            
2:   b     3     3     2            
3:   c     2    99     5       test2
4:   d    99     2    99 test1,test3
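
The question asked for NA in rows that have no 99, whereas the output above shows empty strings. A minimal sketch of one way to get NA with data.table follows, assuming the package is installed and that obs is unique per row. The names dt, test_cols and the column name miss are chosen here for illustration and do not come from the original answer.

```r
library(data.table)

dt <- data.table(obs = c("a", "b", "c", "d"),
                 test1 = c(5, 3, 2, 99),
                 test2 = c(2, 3, 99, 2),
                 test3 = c(4, 2, 5, 99))

# Assumption: every column except `obs` should be checked
test_cols <- setdiff(names(dt), "obs")

# Per observation, join the names of the columns equal to 99
dt[, miss := paste(test_cols[unlist(.SD) == 99], collapse = ","),
   by = obs, .SDcols = test_cols]

# Replace the empty strings left by rows without a 99 with NA
dt[miss == "", miss := NA_character_]

print(dt)
```

Doing the replacement as a second step keeps the grouped expression simple; the same effect could be obtained inside the first call with an if/else on the pasted string.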