Fill values between intervals, grouped by ID

Time: 2019-09-03 12:24:56

Tags: r

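I have a dataset in which subjects have a value of 1 or 0 at different times. I need a function or a piece of code that, within each ID, replaces with 1 every 0 that lies between the first and the last 1.

I tried complete() and fill(), but they did not do what I want.

I have the following data: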

library(tibble)

dat = tibble(ID = c(1,1,1,1,1,1,1,1,1,1,
                    2,2,2,2,2,2,2,2,2,2,
                    3,3,3,3,3,3,3,3,3,3),
             TIME = c(1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10),
             DV = c(0,0,1,1,0,0,1,0,0,0,
                    0,1,0,0,0,0,0,0,0,1,
                    0,1,0,1,0,1,0,1,0,0))

# A tibble: 30 x 3
      ID  TIME    DV
   <dbl> <dbl> <dbl>
 1     1     1     0
 2     1     2     0
 3     1     3     1
 4     1     4     1
 5     1     5     0
 6     1     6     0
 7     1     7     1
 8     1     8     0
 9     1     9     0
10     1    10     0
# ... with 20 more rows

I need the following output, shown in the DV2 column:

dat = tibble(ID = c(1,1,1,1,1,1,1,1,1,1,
                    2,2,2,2,2,2,2,2,2,2,
                    3,3,3,3,3,3,3,3,3,3),
             TIME = c(1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10),
             DV = c(0,0,1,1,0,0,1,0,0,0,
                    0,1,0,0,0,0,0,0,0,1,
                    0,1,0,1,0,1,0,1,0,0),
             DV2 = c(0,0,1,1,1,1,1,0,0,0,
                    0,1,1,1,1,1,1,1,1,1,
                    0,1,1,1,1,1,1,1,0,0))

# A tibble: 30 x 4
      ID  TIME    DV   DV2
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     0     0
 2     1     2     0     0
 3     1     3     1     1
 4     1     4     1     1
 5     1     5     0     1
 6     1     6     0     1
 7     1     7     1     1
 8     1     8     0     0
 9     1     9     0     0
10     1    10     0     0
# ... with 20 more rows
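
To make the target rule concrete, here is a minimal base R sketch on a single vector. It uses the DV values of ID 1 from the data above; the names x, ones, pos and filled are only illustrative, and the sketch assumes the vector contains at least one 1.

```r
# DV values for ID 1, copied from the data above
x <- c(0, 0, 1, 1, 0, 0, 1, 0, 0, 0)

ones <- which(x == 1)   # positions of the 1s: 3, 4, 7
pos  <- seq_along(x)    # positions 1..10

# 1 from the first 1 through the last 1, 0 everywhere else
# (assumes x has at least one 1; otherwise min()/max() warn and return Inf/-Inf)
filled <- as.numeric(pos >= min(ones) & pos <= max(ones))
filled
#[1] 0 0 1 1 1 1 1 0 0 0
```

The same rule then has to be applied separately within each ID.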

2 answers:

Answer 0: (score: 2)

With dplyr, you can do the following:

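library(dplyr)
library(tibble)  # provides rowid_to_column()

dat %>%
 rowid_to_column() %>%   # add a running row index over the whole table
 group_by(ID) %>%
 # flag every row from the first to the last DV == 1 within each ID
 mutate(DV2 = if_else(rowid %in% min(rowid[DV == 1]):max(rowid[DV == 1]),
                      1, 0)) %>%
 ungroup() %>%
 select(-rowid)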

      ID  TIME    DV   DV2
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     0     0
 2     1     2     0     0
 3     1     3     1     1
 4     1     4     1     1
 5     1     5     0     1
 6     1     6     0     1
 7     1     7     1     1
 8     1     8     0     0
 9     1     9     0     0
10     1    10     0     0
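
As a possible variant (not part of the original answer), the same flag can be built from two cumulative sums, which avoids the row index and also copes with an ID that has no 1 at all. This is a sketch that assumes the rows are already sorted by TIME within each ID; res is just an illustrative name.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly with rep()
dat <- tibble(
  ID   = rep(1:3, each = 10),
  TIME = rep(1:10, times = 3),
  DV   = c(0,0,1,1,0,0,1,0,0,0,
           0,1,0,0,0,0,0,0,0,1,
           0,1,0,1,0,1,0,1,0,0)
)

res <- dat %>%
  group_by(ID) %>%
  # cumsum(DV) > 0            : TRUE from the first 1 onward
  # rev(cumsum(rev(DV))) > 0  : TRUE up to and including the last 1
  mutate(DV2 = as.numeric(cumsum(DV) > 0 & rev(cumsum(rev(DV))) > 0)) %>%
  ungroup()

res
```

A row is flagged only when both conditions hold, so a group containing only zeros simply stays all 0 instead of raising an error.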

Answer 1: (score: 1)

We can create a helper function and apply it to each group, i.e.

f1 <- function(x) {
    v1 <- which(x == 1)               # positions of all 1s in the group
    x[v1[1]:v1[length(v1)]] <- 1      # overwrite first-to-last 1 with 1
    return(x)
}

with(dat, ave(DV, ID, FUN = f1))
#[1] 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 0
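
Note that f1 assumes every group contains at least one 1; with none, v1[1] is NA and the range fails. Below is a sketch of a guarded version that also stores the result as a column. The name fill_between and the toy data frame are hypothetical and used only for illustration.

```r
# fill_between is a hypothetical name; it mirrors f1 but returns x
# unchanged when the group has no 1 at all
fill_between <- function(x) {
  ones <- which(x == 1)
  if (length(ones) == 0) return(x)
  x[ones[1]:ones[length(ones)]] <- 1
  x
}

# Toy data: group "a" has a gap to fill, group "b" has no 1s
toy <- data.frame(
  ID = rep(c("a", "b"), each = 5),
  DV = c(0, 1, 0, 1, 0,
         0, 0, 0, 0, 0)
)

# ave() applies the function per ID and keeps the original row order
toy$DV2 <- ave(toy$DV, toy$ID, FUN = fill_between)
toy$DV2
#[1] 0 1 1 1 0 0 0 0 0 0
```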