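I have a simple time series in which an operator can turn a switch on and off. My goal is to label each "on" phase with a distinct ID. For example, the result, with an eventID column, would look like this: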
val eventDF = sc.parallelize(List(("2016-05-01 10:00:00", 0, 0),
("2016-05-01 10:00:30", 0, 0),
("2016-05-01 10:01:00", 1, 1),
("2016-05-01 10:01:20", 1, 1),
("2016-05-01 10:02:10", 1, 1),
("2016-05-01 10:03:30", 0, 0),
("2016-05-01 10:04:00", 0, 0),
("2016-05-01 10:05:20", 0, 0),
("2016-05-01 10:06:10", 1, 2),
("2016-05-01 10:06:30", 1, 2),
("2016-05-01 10:07:00", 1, 2),
("2016-05-01 10:07:20", 0, 0),
("2016-05-01 10:08:10", 0, 0),
("2016-05-01 10:08:50", 0, 0)))
.toDF("timestamp", "switch", "eventID")
So far I have tried the rank, rangeBetween, and lag window functions without any luck, so any hints would be appreciated.
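One direction that may be worth trying, given here only as a minimal sketch and not a confirmed solution: flag each row where switch goes from 0 to 1 using lag, then take a running sum of those flags and zero it out while the switch is off. The names byTime, startsOn, and labeled are illustrative. The sketch assumes a spark-shell session where eventDF from the snippet above is already defined, and it recomputes eventID from timestamp and switch alone.

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{coalesce, col, lag, lit, sum, when}

// Assumption: eventDF is the DataFrame defined above. Only "timestamp" and
// "switch" are used as input; "eventID" is recomputed from scratch.

// Caveat: a window without partitionBy moves all rows into a single partition,
// so this only scales if the data can be partitioned (for example per device).
val byTime = Window.orderBy("timestamp")

// 1 on rows where the switch turns on (previous row was 0, or there is no
// previous row), 0 everywhere else.
val startsOn = when(
  col("switch") === 1 &&
    coalesce(lag(col("switch"), 1).over(byTime), lit(0)) === 0,
  1
).otherwise(0)

val labeled = eventDF
  .select("timestamp", "switch")
  .withColumn("startsOn", startsOn)
  .withColumn(
    "eventID",
    // Running count of "on" starts up to the current row; Long.MinValue means
    // "unbounded preceding" and 0 means "current row". Rows with the switch
    // off get 0 instead of the running count.
    when(
      col("switch") === 1,
      sum(col("startsOn")).over(byTime.rowsBetween(Long.MinValue, 0))
    ).otherwise(0)
  )
  .drop("startsOn")

labeled.orderBy("timestamp").show(false)
```

The flag has to be materialized as its own column first, because Spark does not allow one window function to be nested inside another. On the sample data this should reproduce the eventID column shown above, although the column type comes out as a long rather than an int.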