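I am trying to compute the time difference whenever one or both values in a set of columns change.
I looked at the rolling function, but as far as I can tell its window size is fixed. I then found the diff function, which could help detect the switches. I could then use it to compute start and end markers and do the rest in a brute-force way, as follows.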
Records  Time  Field1  Field2  Comments
1        1     a       1
2        2     a       1
3        4     a       2       switch
4        5     b       1       switch
5        10    b       1
6        12    b       1
7        15    b       3       switch
8        20    c       4       switch
9        21    c       4
10       22    c       4
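To make the switch markers in the Comments column concrete, here is a minimal sketch of how they could be derived. It assumes the column names from the table above, and it compares each row with shift() rather than using diff(), since diff() subtracts values and Field1 holds strings. The `keys`, `switch` and `Switch` names are only for illustration.

```python
import pandas as pd

# Small illustrative frame with the same columns as the table above.
df = pd.DataFrame({
    "Time":   [1, 2, 4, 5, 10, 12, 15, 20, 21, 22],
    "Field1": ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c"],
    "Field2": [1, 1, 2, 1, 1, 1, 3, 4, 4, 4],
})

keys = df[["Field1", "Field2"]]
# A row is a switch when either field differs from the previous row.
switch = keys.ne(keys.shift()).any(axis=1)
switch.iloc[0] = False  # the first row has no predecessor to compare with
df["Switch"] = switch
print(df)
```

With this flag, records 3, 4, 7 and 8 are marked as switches, which matches the Comments column.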
This is the next step of the computation:
Field1  Field2  Result  Computation Comments
a       1       1       diff of time for records 2 and 1
a       2       0       0 since only one record
b       1       7       diff of time for records 6 and 4
b       3       0       0 since only one record
c       4       2       diff of time for records 10 and 8
Here is a code snippet to build a data frame with the raw data above:
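import pandas as pd

# Each tuple is (Records, Time, Field1, Field2), matching the table above.
data = [
    (1, 1, "a", 1),
    (2, 2, "a", 1),
    (3, 4, "a", 2),
    (4, 5, "b", 1),
    (5, 10, "b", 1),
    (6, 12, "b", 1),
    (7, 15, "b", 3),
    (8, 20, "c", 4),
    (9, 21, "c", 4),
    (10, 22, "c", 4),
]
# Labels follow the tuple order, so the string column is Field1.
labels = ["Records", "Time", "Field1", "Field2"]
df = pd.DataFrame.from_records(data, columns=labels)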
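For reference, the brute-force idea described at the top (detect switches, then take end time minus start time for each run) can be sketched without explicit start and end markers. This is only one possible approach, not necessarily the best one. It assumes the columns are labelled Records, Time, Field1, Field2 as in the table header, and that Time is non-decreasing within each run. The `run_id` and `result` names are made up for the example.

```python
import pandas as pd

data = [
    (1, 1, "a", 1), (2, 2, "a", 1), (3, 4, "a", 2), (4, 5, "b", 1),
    (5, 10, "b", 1), (6, 12, "b", 1), (7, 15, "b", 3), (8, 20, "c", 4),
    (9, 21, "c", 4), (10, 22, "c", 4),
]
df = pd.DataFrame.from_records(
    data, columns=["Records", "Time", "Field1", "Field2"]
)

keys = df[["Field1", "Field2"]]
# A new run starts wherever either field differs from the previous row;
# the cumulative sum of those flags gives every run its own id.
run_id = keys.ne(keys.shift()).any(axis=1).cumsum()

# Per run: keep the field values and take last Time minus first Time.
result = (
    df.groupby(run_id)
      .agg(
          Field1=("Field1", "first"),
          Field2=("Field2", "first"),
          Result=("Time", lambda t: t.iloc[-1] - t.iloc[0]),
      )
      .reset_index(drop=True)
)
print(result)
```

On the sample data the Result column comes out as 1, 0, 7, 0, 2, which matches the expected table. Grouping on the run id rather than on Field1 and Field2 directly keeps two separate runs with the same values (for example a later second block of b, 1) from being merged.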