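Cumulative count of blocks of 1s with 0 separators in a binary vector in R

Time: 2016-02-05 15:06:46

Tags: r dataframe cumulative-frequency

I have a data frame with a binary vector that I want to count cumulatively. However, I want to count the blocks of 1s rather than each individual 1, and create a new vector of this count while keeping the 0 separators as 0. That is: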

df1 <- data.frame(bin = c(0,1,1,1,1,0,0,0,1,1,1,1,1,0,0,0,1,1,1))

n   bin
1    0
2    1
3    1
4    1
5    1
6    0
7    0
8    0
9    1
10   1
11   1
12   1
13   1
14   0
15   0
16   0
17   1
18   1
19   1 

becomes

n   bin cumul
1    0     0
2    1     1
3    1     1
4    1     1
5    1     1
6    0     0
7    0     0
8    0     0
9    1     2
10   1     2
11   1     2
12   1     2
13   1     2
14   0     0
15   0     0
16   0     0
17   1     3
18   1     3
19   1     3

How can I do this?

3 Answers:

Answer 0: (score: 3)

You can use the rleid function from the data.table package:

df1 <- data.frame(bin = c(0,1,1,1,1,0,0,0,1,1,1,1,1,0,0,0,1,1,1))
library(data.table)
setDT(df1)
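df1[, cumul := rleid(bin)]            # number every run of identical values
df1[bin == 0, cumul := 0L]            # the 0 separators get 0
df1[bin == 1, cumul := rleid(cumul)]  # renumber the remaining runs of 1s as 1, 2, 3, ...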
#    bin cumul
# 1:   0     0
# 2:   1     1
# 3:   1     1
# 4:   1     1
# 5:   1     1
# 6:   0     0
# 7:   0     0
# 8:   0     0
# 9:   1     2
#10:   1     2
#11:   1     2
#12:   1     2
#13:   1     2
#14:   0     0
#15:   0     0
#16:   0     0
#17:   1     3
#18:   1     3
#19:   1     3
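
For readers without data.table, the same two-pass idea can be sketched in base R. The helper `run_id` below is a hypothetical stand-in written for this illustration (it is not part of data.table); it assigns consecutive ids to runs of identical values in a single vector, which is the behaviour the steps above rely on.

```r
# Hypothetical helper: base R stand-in for rleid on a single vector.
run_id <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), times = r$lengths)
}

bin <- c(0,1,1,1,1,0,0,0,1,1,1,1,1,0,0,0,1,1,1)

cumul <- run_id(bin)                        # pass 1: number every run, 0s and 1s alike
cumul[bin == 0] <- 0                        # the 0 separators get 0
cumul[bin == 1] <- run_id(cumul[bin == 1])  # pass 2: renumber the runs of 1s consecutively
cumul
```

The second pass works because, once the 0 rows are excluded, each block of 1s still carries its own distinct id from the first pass, so renumbering those ids yields 1, 2, 3, and so on.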

Answer 1: (score: 2)

A somewhat manual approach, though:

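# Assumes the column is named c1, as in the output below
df1 <- data.frame(c1 = c(0,1,1,1,1,0,0,0,1,1,1,1,1,0,0,0,1,1,1))
l <- rle(df1$c1)$lengths    # length of each run
v <- rle(df1$c1)$values     # value (0 or 1) of each run
v2 <- cumsum(v)             # count of 1-runs so far; each 0-run repeats the previous count
v2[duplicated(v2)] <- 0     # those repeats are the 0-runs, so reset them to 0

df1$cumul <- rep(v2, times = l)  # expand back to one value per row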
df1
   c1 cumul
1   0     0
2   1     1
3   1     1
4   1     1
5   1     1
6   0     0
7   0     0
8   0     0
9   1     2
10  1     2
11  1     2
12  1     2
13  1     2
14  0     0
15  0     0
16  0     0
17  1     3
18  1     3
19  1     3
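
As a quick sanity check, the rle steps can be wrapped in a small function and tried on a vector that starts with 1 instead of 0. The name `count_blocks_rle` is made up for this sketch, and the approach assumes the input contains only 0s and 1s.

```r
# Hypothetical wrapper around the rle steps; assumes x holds only 0s and 1s.
count_blocks_rle <- function(x) {
  r <- rle(x)
  v2 <- cumsum(r$values)     # running count of 1-runs, repeated on each following 0-run
  v2[duplicated(v2)] <- 0    # the repeated counts mark the 0-runs
  rep(v2, times = r$lengths)
}

res <- count_blocks_rle(c(1,1,0,1,0,0,1))
res
```

Here the first run is a block of 1s, so its count of 1 is not a duplicate and is kept; only the 0-runs that follow each block are reset.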

Answer 2: (score: 1)

Yet another one:

x <- c(0,1,1,1,1,0,0,0,1,1,1,1,1,0,0,0,1,1,1)
d <- cumsum(diff(c(0, x)) > 0)  # increment at the first 1 of each block
d[x == 0] <- 0                  # reset the 0 separators
cbind(x, d)
      x d
 [1,] 0 0
 [2,] 1 1
 [3,] 1 1
 [4,] 1 1
 [5,] 1 1
 [6,] 0 0
 [7,] 0 0
 [8,] 0 0
 [9,] 1 2
[10,] 1 2
[11,] 1 2
[12,] 1 2
[13,] 1 2
[14,] 0 0
[15,] 0 0
[16,] 0 0
[17,] 1 3
[18,] 1 3
[19,] 1 3
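
To see why the one-liner works, it may help to split it into its intermediate steps on a shorter vector. The variable names `padded`, `steps` and `starts` are introduced here only for illustration.

```r
x <- c(0,1,1,0,0,1,0,1,1)

padded <- c(0, x)       # prepend 0 so a leading 1 would also count as a block start
steps  <- diff(padded)  # +1 where a block of 1s starts, -1 right after one ends, 0 elsewhere
starts <- steps > 0     # TRUE only at the first 1 of each block
d <- cumsum(starts)     # number of blocks started so far
d[x == 0] <- 0          # separators keep 0 instead of carrying the count forward

rbind(x, steps, d)
```

Each 0-to-1 transition bumps the running count by one, and the final assignment masks the positions where x is 0.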