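I have a column of data (45,000 rows!) that records when a filter is running or off (off is shown as zero, depending on conditions). The log file looks like this (there are actually two columns; the first is a count since startup):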
col 1: 1,2,3,4,5,6,7,8,9,10.....45,000
col 2: 1,2,3,4,0,0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,0,0,0,0,0,0,0,0,1,2,3,0,0,0,0,0 etc.
What I want is 2 columns ("time on" and "time off"), which for the data above would be:
"time on" 4,0,12,0,3
(i.e. the lengths of the nonzero runs, in order) and the corresponding
"time off" 0,4,0,8,5
(the lengths of the zero runs, in order).
Ultimately, I'd like to make a bar chart showing the days on and days off.
Answer 0 (score: 2)
You can use rle ("run-length encoding") for this:
x = c(1,2,3,4,0,0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,0,0,
0,0,0,0,0,0,1,2,3,0,0,0,0,0)
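# rle() collapses x != 0 into runs: $values is TRUE for a nonzero run,
# and $lengths holds the length of each run
runs = rle(x != 0)
# Keep the lengths of the nonzero runs; set the zero runs to 0
nonzero = runs$lengths
nonzero[! runs$values] = 0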
nonzero
# Output:
# [1]  4  0 12  0  3  0
# Keep the lengths of the zero runs; set the nonzero runs to 0
zeros = runs$lengths
zeros[runs$values] = 0
zeros
# Output:
# [1] 0 4 0 8 0 5
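
For the bar chart mentioned in the question, one possible sketch is to stack the two vectors into a matrix and pass it to base R's barplot. The names time_on, time_off and durations, the colours, and the axis labels are illustrative choices, not something from the original post; it also assumes that one bar per run is the layout wanted.

```r
x <- c(1,2,3,4,0,0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,0,0,
       0,0,0,0,0,0,1,2,3,0,0,0,0,0)

# Same run-length encoding as above: TRUE runs are "on", FALSE runs are "off"
runs <- rle(x != 0)

# Length of each run on its own side, 0 on the other (illustrative names)
time_on  <- ifelse(runs$values, runs$lengths, 0L)
time_off <- ifelse(runs$values, 0L, runs$lengths)

# Rows are on/off, columns are the runs in log order
durations <- rbind(on = time_on, off = time_off)

# One bar per run; each bar is coloured by whether the filter was on or off
barplot(durations,
        col = c("steelblue", "grey70"),
        names.arg = seq_along(runs$lengths),
        xlab = "run number", ylab = "run length (rows)",
        legend.text = TRUE)
```

Since every run is either on or off, each bar has a single colour. If only the nonzero entries are needed, time_on[time_on > 0] and time_off[time_off > 0] drop the padding zeros.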