Fill with NA until a number appears, then fill with 0

Date: 2020-02-12 20:21:19

Tags: r na

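What I need is to convert NA to 0 once a number has appeared in a row's timeline; NAs that come before the first number should stay NA. Here is an example: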

c1 <- c(1,NA,NA,NA,NA,1,2,NA,NA,NA,5,NA,NA)
c2 <- c(2,NA,NA,10,30,NA,NA,NA,NA,4,1,2,NA)
c3 <- c(3,NA,NA,NA,NA,NA,NA,NA,NA,1,NA,NA,NA)
x <- data.frame(rbind(c1,c2,c3))
colnames(x) <- c("ID","Jan01","Feb01","Mar01","Apr01","May01","Jun01","Jul01","Aug01","Sep01","Oct01","Nov01","Dec01")
x

#    ID Jan01 Feb01 Mar01 Apr01 May01 Jun01 Jul01 Aug01 Sep01 Oct01 Nov01 Dec01
# c1  1    NA    NA    NA    NA     1     2    NA    NA    NA     5    NA    NA
# c2  2    NA    NA    10    30    NA    NA    NA    NA     4     1     2    NA
# c3  3    NA    NA    NA    NA    NA    NA    NA    NA     1    NA    NA    NA

This is what I expect:

c11 <- c(1,NA,NA,NA,NA,1,2,0,0,0,5,0,0)
c22 <- c(2,NA,NA,10,30,0,0,0,0,4,1,2,0)
c33 <- c(3,NA,NA,NA,NA,NA,NA,NA,NA,1,0,0,0)
y <- data.frame(rbind(c11,c22,c33))
colnames(y) <- c("ID","Jan01","Feb01","Mar01","Apr01","May01","Jun01","Jul01","Aug01","Sep01","Oct01","Nov01","Dec01")
y

#     ID Jan01 Feb01 Mar01 Apr01 May01 Jun01 Jul01 Aug01 Sep01 Oct01 Nov01 Dec01
# c11  1    NA    NA    NA    NA     1     2     0     0     0     5     0     0
# c22  2    NA    NA    10    30     0     0     0     0     4     1     2     0
# c33  3    NA    NA    NA    NA    NA    NA    NA    NA     1     0     0     0

Does anyone know how to do this? Thanks!

3 answers:

Answer 0 (score: 4)

A base R option:

t(apply(x[,-1], 1, function(x) ifelse(is.na(x) & cumsum(!is.na(x)) >= 1, 0, x)))

Output:

   Jan01 Feb01 Mar01 Apr01 May01 Jun01 Jul01 Aug01 Sep01 Oct01 Nov01 Dec01
c1    NA    NA    NA    NA     1     2     0     0     0     5     0     0
c2    NA    NA    10    30     0     0     0     0     4     1     2     0
c3    NA    NA    NA    NA    NA    NA    NA    NA     1     0     0     0

As @markus pointed out, for better performance use replace instead of ifelse, e.g.:

t(apply(x[,-1], 1, function(x) replace(x, is.na(x) & cumsum(!is.na(x)) >= 1, 0)))
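To make the mask in this answer easier to follow, here is a minimal sketch that steps through it on a single row and then applies it to the whole data frame while keeping the ID column. The helper name fill_na_after_first and the result name res are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: turn NA into 0 only after the first non-NA value.
fill_na_after_first <- function(v) {
  seen <- cumsum(!is.na(v)) >= 1   # TRUE from the first non-NA value onwards
  replace(v, is.na(v) & seen, 0)
}

# Row c1 from the question, without its ID
v <- c(NA, NA, NA, NA, 1, 2, NA, NA, NA, 5, NA, NA)
cumsum(!is.na(v))        # 0 0 0 0 1 2 2 2 2 3 3 3
fill_na_after_first(v)   # NA NA NA NA 1 2 0 0 0 5 0 0

# Rebuild the example data so the block runs on its own
x <- data.frame(rbind(
  c1 = c(1, NA, NA, NA, NA, 1, 2, NA, NA, NA, 5, NA, NA),
  c2 = c(2, NA, NA, 10, 30, NA, NA, NA, NA, 4, 1, 2, NA),
  c3 = c(3, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA)
))
colnames(x) <- c("ID", "Jan01", "Feb01", "Mar01", "Apr01", "May01", "Jun01",
                 "Jul01", "Aug01", "Sep01", "Oct01", "Nov01", "Dec01")

# Apply row by row, transpose back, and write into every column except ID
res <- x
res[-1] <- t(apply(x[-1], 1, fill_na_after_first))
res
```

Unlike the one-liners above, which return a matrix without the ID column, this variant returns a data frame with the same shape as the desired output in the question.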

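Answer 1 (score: 3)

After replacing the NAs I pivot back to "wide" format to match your desired output, but note that it is probably better to store the data in long format anyway.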

library(dplyr)
library(tidyr)  # provides pivot_longer() and pivot_wider()

long <- 
  x %>% 
    pivot_longer(-ID) %>% 
    group_by(ID) %>% 
    mutate(value = ifelse(cummax(!is.na(value)), coalesce(value, 0), value))

long %>% 
  pivot_wider(names_from = name, values_from = value)


# # A tibble: 3 x 13
# # Groups:   ID [3]
#      ID Jan01 Feb01 Mar01 Apr01 May01 Jun01 Jul01 Aug01 Sep01 Oct01 Nov01
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1     1    NA    NA    NA    NA     1     2     0     0     0     5     0
# 2     2    NA    NA    10    30     0     0     0     0     4     1     2
# 3     3    NA    NA    NA    NA    NA    NA    NA    NA     1     0     0
# # ... with 1 more variable: Dec01 <dbl>
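For readers who want to try the long-format idea without the tidyverse, the same per-ID logic can be sketched in base R with ave(). This is only an illustration, not part of the original answer: the long data frame is typed in by hand here, and the sketch assumes the rows are already in time order within each ID.

```r
# Long format: one row per (ID, month) pair, built by hand for this sketch
months <- c("Jan01", "Feb01", "Mar01", "Apr01", "May01", "Jun01",
            "Jul01", "Aug01", "Sep01", "Oct01", "Nov01", "Dec01")
long <- data.frame(
  ID    = rep(1:3, each = 12),
  name  = rep(months, times = 3),
  value = c(
    NA, NA, NA, NA, 1, 2, NA, NA, NA, 5, NA, NA,    # ID 1
    NA, NA, 10, 30, NA, NA, NA, NA, 4, 1, 2, NA,    # ID 2
    NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA   # ID 3
  )
)

# ave() runs FUN separately within each ID and returns the results
# in the original row order, so the column can be overwritten in place.
long$value <- ave(long$value, long$ID, FUN = function(v) {
  replace(v, is.na(v) & cumsum(!is.na(v)) > 0, 0)
})

long[long$ID == 3, ]
```

In long format the replacement is a single grouped column operation, which is one reason that layout tends to be convenient for this kind of data.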

Answer 2 (score: 0)

Another base R solution, using aggregate + col + replace, i.e.:

idx <- aggregate(col ~ row, which(!is.na(x[-1]), arr.ind = TRUE), min)
xout <- cbind(x[1], replace(x[-1], col(x[-1]) >= idx$col & is.na(x[-1]), 0))

which gives:

> xout
   ID Jan01 Feb01 Mar01 Apr01 May01 Jun01 Jul01 Aug01 Sep01 Oct01 Nov01 Dec01
c1  1    NA    NA    NA    NA     1     2     0     0     0     5     0     0
c2  2    NA    NA    10    30     0     0     0     0     4     1     2     0
c3  3    NA    NA    NA    NA    NA    NA    NA    NA     1     0     0     0
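One caveat worth checking before using this approach: it appears to rely on every row containing at least one number. If a row is entirely NA, aggregate has no entry for it, so idx$col is shorter than the number of rows and gets recycled against the wrong rows. The sketch below is one possible variant that tolerates such rows; the extra row c4 and the names m, has_value and first are assumptions added for illustration.

```r
# Example data from the question plus a hypothetical all-NA row (c4)
x <- data.frame(rbind(
  c1 = c(1, NA, NA, NA, NA, 1, 2, NA, NA, NA, 5, NA, NA),
  c2 = c(2, NA, NA, 10, 30, NA, NA, NA, NA, 4, 1, 2, NA),
  c3 = c(3, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA),
  c4 = c(4, rep(NA, 12))
))
colnames(x) <- c("ID", "Jan01", "Feb01", "Mar01", "Apr01", "May01", "Jun01",
                 "Jul01", "Aug01", "Sep01", "Oct01", "Nov01", "Dec01")

m <- as.matrix(x[-1])
has_value <- rowSums(!is.na(m)) > 0

# Column index of the first non-NA value in each row.
# Rows without any value get an index past the last column, so nothing is replaced.
first <- ifelse(has_value,
                max.col((!is.na(m)) * 1, ties.method = "first"),
                ncol(m) + 1)

# 'first' has one entry per row, so it lines up with the rows of col(m)
m[is.na(m) & col(m) >= first] <- 0
xout <- cbind(x[1], m)
xout
```

Rows c1 to c3 come out as in the answer above, while c4 stays all NA.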