First non-NA value across columns in R

Time: 2020-08-11 16:22:22

Tags: r data.table coalesce

df <- data.frame(ID=c(1,2,3,4,5,6),
                CO=c(-6,4,2,3,0,2),
                CATFOX=c(1,NA,NA,3,0,NA),
                DOGFOX=c(NA,NA,5,1,2,NA),
                RABFOX=c(NA,3,NA,5,3,NA),
                D=c(0,4,5,6,1,2),
                WANT=c(1,3,5,3,0,NA))

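I have a data frame, and I want the column WANT to hold the first value among 'CATFOX', 'DOGFOX' and 'RABFOX' that is not NA. Is there a data.table solution? I tried the following, but it did not produce the expected result: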

df$WANT=do.call(coalesce, data[grepl('FOX',names(data))])

3 answers:

Answer 0: (score: 2)

Your example uses coalesce, which comes from dplyr. Try data.table's fcoalesce instead:

library(data.table)

setDT(df)[, WANT2 := fcoalesce(CATFOX, DOGFOX, RABFOX)]

Output:

   ID CO CATFOX DOGFOX RABFOX D WANT WANT2
1:  1 -6      1     NA     NA 0    1     1
2:  2  4     NA     NA      3 4    3     3
3:  3  2     NA      5     NA 5    5     5
4:  4  3      3      1      5 6    3     3
5:  5  0      0      2      3 1    0     0
6:  6  2     NA     NA     NA 2   NA    NA
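
The question's attempt selected the columns by name pattern rather than listing them. A minimal sketch of the same idea with fcoalesce follows, assuming a data.table version that provides both fcoalesce and patterns() in .SDcols; the table dt and its reduced column set are made up for illustration.

```r
library(data.table)

# Illustrative table with the same FOX columns as the question
dt <- data.table(ID = 1:6,
                 CATFOX = c(1, NA, NA, 3, 0, NA),
                 DOGFOX = c(NA, NA, 5, 1, 2, NA),
                 RABFOX = c(NA, 3, NA, 5, 3, NA))

# .SDcols picks every column whose name ends in "FOX";
# fcoalesce accepts that list of columns and returns, row by row,
# the first value that is not NA (NA when all of them are NA).
dt[, WANT2 := fcoalesce(.SD), .SDcols = patterns("FOX$")]
dt$WANT2
```

The column order in the table decides which value wins when several are non-NA, so row 4 yields the CATFOX value.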

Answer 1: (score: 1)

You can try the following base R solution:

#Data
data=data.frame(ID=c(1,2,3,4,5),
                CO=c(-6,4,2,3,0),
                CATFOX=c(1,NA,NA,3,0),
                DOGFOX=c(NA,NA,5,1,2),
                RABFOX=c(NA,3,NA,5,3),
                D=c(0,4,5,6,1),
                WANT=c(1,3,5,3,0))
#Process
index <- which(names(data) %in% c('CATFOX','DOGFOX','RABFOX'))
data$WANT2 <- apply(data[,index],1,function(x) x[min(which(!is.na(x)))])

Output:

  ID CO CATFOX DOGFOX RABFOX D WANT WANT2
1  1 -6      1     NA     NA 0    1     1
2  2  4     NA     NA      3 4    3     3
3  3  2     NA      5     NA 5    5     5
4  4  3      3      1      5 6    3     3
5  5  0      0      2      3 1    0     0
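
Note that this data omits the sixth row from the question, where every FOX column is NA. On such a row, min(which(...)) is called on an empty vector, so it warns and returns Inf; indexing with Inf still gives NA. A possible variant that avoids the warning is sketched below; the names df6 and first_non_na are hypothetical.

```r
# FOX columns from the question, including the all-NA sixth row
df6 <- data.frame(CATFOX = c(1, NA, NA, 3, 0, NA),
                  DOGFOX = c(NA, NA, 5, 1, 2, NA),
                  RABFOX = c(NA, 3, NA, 5, 3, NA))

# Drop the NAs of a row, then take the first remaining element.
# Taking [1] of an empty vector yields NA, so an all-NA row
# needs no special case and raises no warning.
first_non_na <- function(x) x[!is.na(x)][1]

want2 <- apply(df6, 1, first_non_na)
want2
```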

Answer 2: (score: 1)

We can use a vectorized option in base R:

i1 <- endsWith(names(df), 'FOX')
df$WANT2 <-  df[i1][cbind(seq_len(nrow(df)), max.col(!is.na(df[i1]), 'first'))]
df$WANT2
#[1]  1  3  5  3  0 NA
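
To see why this works, the one-liner can be split into its two steps. The small matrix m below is made up for illustration and is not part of the original data.

```r
# Small illustrative matrix; the last row is entirely NA
m <- cbind(CATFOX = c(1, NA, NA),
           DOGFOX = c(NA, 5, NA),
           RABFOX = c(NA, 3, NA))

# Step 1: for each row, the position of the first TRUE in the non-NA mask.
# When a row has no TRUE at all, every entry ties and "first" picks column 1.
first_col <- max.col(!is.na(m), ties.method = "first")

# Step 2: a two-column (row, column) index matrix extracts one cell per row.
# For the all-NA row this reads column 1, which is NA, as desired.
picked <- m[cbind(seq_len(nrow(m)), first_col)]
picked
```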