Getting the lengths of runs in a sequence

Time: 2018-12-17 22:23:26

Tags: r sequence run-length-encoding

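I have a column of data (45,000 rows!) that shows when a filter was running or switched off (off is shown as zero, depending on the conditions). The log file records it as follows (there are actually two columns; the first is a count since start-up):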

col 1: 1,2,3,4,5,6,7,8,9,10.....45,000
col 2: 1,2,3,4,0,0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,0,0,0,0,0,0,0,0,1,2,3,0,0,0,0,0 etc.

What I want is 2 columns ("time on" and "time off"), which for the data above would be:

"time on" 4,0,12,0,3

(i.e. the lengths of the non-zero runs, in order) and the corresponding

"time off" 0,4,0,8,5 

(the lengths of the zero runs, in order).

Ultimately, I want to make a bar chart showing the number of days on and the number of days off.

1 answer:

Answer 0 (score: 2)

You can use rle ("run-length encoding") for this:

x = c(1,2,3,4,0,0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,0,0,
      0,0,0,0,0,0,1,2,3,0,0,0,0,0)
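# TRUE marks the non-zero (filter running) runs, FALSE marks the zero (filter off) runs
runs = rle(x != 0)
# "time on": keep the lengths of the non-zero runs, set the zero runs to 0
nonzero = runs$lengths
nonzero[! runs$values] = 0
nonzero
# Output:
# [1]  4  0 12  0  3  0
# "time off": keep the lengths of the zero runs, set the non-zero runs to 0
zeros = runs$lengths
zeros[runs$values] = 0
zeros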
# Output:
# [1] 0 4 0 8 0 5
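
To get from the two vectors above to the two columns and the bar chart mentioned in the question, something along these lines might work. It is only a sketch in base R: the column names time_on and time_off and the plot labels are illustrative choices, not anything from the original log file, and the 36-value example vector stands in for the real 45,000-row column.

```r
# Same example vector as above.
x <- c(1,2,3,4,0,0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,0,0,
       0,0,0,0,0,0,1,2,3,0,0,0,0,0)

# values is TRUE for a non-zero run and FALSE for a zero run.
runs <- rle(x != 0)

# One row per run; ifelse() puts 0 in the column that does not apply.
# The column names are assumptions made for this example.
durations <- data.frame(
  time_on  = ifelse(runs$values, runs$lengths, 0L),
  time_off = ifelse(runs$values, 0L, runs$lengths)
)
print(durations)

# Grouped bar chart with one pair of bars per run.
# Guarded so that it only draws in an interactive session.
if (interactive()) {
  barplot(t(as.matrix(durations)), beside = TRUE,
          legend.text = c("time on", "time off"),
          xlab = "run", ylab = "run length")
}
```

Each row of durations corresponds to one run, so exactly one of the two columns is non-zero in every row. If the zero placeholders are not wanted in the chart, runs$lengths[runs$values] and runs$lengths[!runs$values] give the on and off lengths separately.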