Suppose I have a data set with four observations, like this:
data <- data.frame(obs = c("a", "b", "c", "d"),
                   test1 = c(5, 3, 2, 99),
                   test2 = c(2, 3, 99, 2),
                   test3 = c(4, 2, 5, 99))
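Is there a way to create a column named 'miss' that lists, for each row, the names of the columns whose value is 99? An example of the result I want: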
data_result <- data.frame(obs = c("a", "b", "c", "d"),
                          test1 = c(5, 3, 2, 99),
                          test2 = c(2, 3, 99, 2),
                          test3 = c(4, 2, 5, 99),
                          miss = c(NA, NA, "test2", "test1, test3"))
> data_result
  obs test1 test2 test3         miss
1   a     5     2     4         <NA>
2   b     3     3     2         <NA>
3   c     2    99     5        test2
4   d    99     2    99 test1, test3
Answer 0 (score: 2)
Using base R functions:
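data$miss <- sapply(split(data, data$obs), function(x) {
  # names of the columns whose value in this one-row group equals 99
  x <- paste0(names(x)[x == 99], collapse = ",")
  # an empty string means no 99 was found, so report NA instead
  x[x == ""] <- NA
  x
})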
> data
  obs test1 test2 test3        miss
1   a     5     2     4        <NA>
2   b     3     3     2        <NA>
3   c     2    99     5       test2
4   d    99     2    99 test1,test3
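One caveat with the split() approach: split() returns its groups in the sorted order of obs, so the result only lines up with the rows when obs is unique and already sorted, as it is in the example. A minimal row-wise sketch that does not depend on obs is shown below. The helper names test_cols and is_99 are made up for illustration, and it assumes every column other than obs should be checked.

```r
data <- data.frame(obs = c("a", "b", "c", "d"),
                   test1 = c(5, 3, 2, 99),
                   test2 = c(2, 3, 99, 2),
                   test3 = c(4, 2, 5, 99))

# Assumption: every column except obs is a test column to check
test_cols <- setdiff(names(data), "obs")

# Logical matrix with one row per observation and one column per test
is_99 <- data[test_cols] == 99

# For each row, join the names of the columns that hold 99, or return NA
data$miss <- vapply(seq_len(nrow(data)), function(i) {
  hit <- test_cols[which(is_99[i, ])]
  if (length(hit) > 0) paste(hit, collapse = ", ") else NA_character_
}, character(1))
```

Because the rows are visited by index, the order of obs and any duplicate values in it do not affect the result.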
Answer 1 (score: 1)
With data.table:
library(data.table)
setDT(data)[,missing:=paste(names(data)[-1][.SD==99],collapse=","),by=obs][]
   obs test1 test2 test3     missing
1:   a     5     2     4
2:   b     3     3     2
3:   c     2    99     5       test2
4:   d    99     2    99 test1,test3
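Note that this leaves an empty string, not NA, in rows without a 99, which differs from the result asked for in the question. A possible variant is sketched below, assuming obs is unique (one row per group) and that the columns to check are listed explicitly. The names dt and test_cols are illustrative only.

```r
library(data.table)

dt <- data.table(obs = c("a", "b", "c", "d"),
                 test1 = c(5, 3, 2, 99),
                 test2 = c(2, 3, 99, 2),
                 test3 = c(4, 2, 5, 99))

# Assumption: these are the only columns that should be checked for 99
test_cols <- c("test1", "test2", "test3")

# Per observation, join the names of the test columns equal to 99
dt[, missing := paste(names(.SD)[unlist(.SD) == 99], collapse = ","),
   by = obs, .SDcols = test_cols]

# Rows without any 99 end up as "", so turn those into NA
dt[missing == "", missing := NA_character_]
```

Restricting .SD with .SDcols keeps the comparison limited to the test columns even if other columns are added to the table later.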