Parsing event sequences in an R data frame

Asked: 2016-04-20 19:35:04

Tags: r parsing sequence

I have the following data frame of mouse actions:

Timestamp1, Timestamp2, button, state, x, y
49.8709998131,49.7949999999,NoButton,Move,498,580
49.8709998131,49.8730000001,Left,Pressed,498,580
49.9659998417,49.983,Left,Released,498,580
50.1739997864,50.1850000001,NoButton,Move,497,580
50.7269999981,50.7310000001,NoButton,Move,495,581
51.8219997883,51.7140000002,NoButton,Move,569,617
51.8229999542,51.8390000002,NoButton,Move,633,642
52.0539999008,51.8390000002,NoButton,Move,654,650
52.0539999008,51.9329999997,NoButton,Move,719,666
52.0539999008,52.057,NoButton,Move,761,666
52.1819999218,52.1979999999,NoButton,Move,763,663
52.5659999847,52.5720000002,NoButton,Move,763,659
52.6779999733,52.6809999999,NoButton,Move,778,658
52.893999815,52.6809999999,NoButton,Move,783,656
52.8949999809,52.8999999999,NoButton,Move,799,650
53.0549998283,53.0559999999,NoButton,Move,800,649
53.2349998951,53.2429999998,NoButton,Move,805,645
53.2349998951,53.2429999998,Left,Pressed,805,645
53.3509998322,53.2590000001,NoButton,Drag,807,644
53.3509998322,53.352,Left,Released,807,644
53.8619999886,53.8670000001,NoButton,Move,808,644
53.9739999771,53.9759999998,NoButton,Move,809,645
54.0779998302,54.085,NoButton,Move,802,686
54.1899998188,54.085,NoButton,Move,802,691
54.1909999847,54.1949999998,NoButton,Move,796,728
54.3019998074,54.304,NoButton,Move,795,745
54.4069998264,54.4130000002,NoButton,Move,796,756
54.5629999638,54.5529999998,NoButton,Move,801,766
54.751999855,54.7250000001,NoButton,Move,803,766
54.8379998207,54.8500000001,NoButton,Move,807,766
54.8389999866,54.8500000001,Left,Pressed,807,766
54.9709999561,54.9750000001,NoButton,Drag,808,766
54.9709999561,54.9750000001,Left,Released,808,766
55.3819999695,55.3960000002,NoButton,Move,809,766
55.5979998112,55.4890000001,NoButton,Move,801,760
55.5989999771,55.6140000001,NoButton,Move,790,752

I want to parse out specific subsequences, for example a left click:

49.8709998131,49.8730000001,Left,Pressed,498,580
49.9659998417,49.983,Left,Released,498,580

or a drag and drop, such as:

53.2349998951,53.2429999998,Left,Pressed,805,645
53.3509998322,53.2590000001,NoButton,Drag,807,644
53.3509998322,53.352,Left,Released,807,644

or pure mouse movement not interrupted by clicks, such as:

52.5659999847,52.5720000002,NoButton,Move,763,659
52.6779999733,52.6809999999,NoButton,Move,778,658
52.893999815,52.6809999999,NoButton,Move,783,656
52.8949999809,52.8999999999,NoButton,Move,799,650
53.0549998283,53.0559999999,NoButton,Move,800,649
53.2349998951,53.2429999998,NoButton,Move,805,645

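My heuristic is to iterate over the whole sequence with a for loop and, for each element, check the elements before and after it against the details of the desired subsequence. On the one hand this seems very laborious; on the other hand it does not fit R's elegant, short apply-style solutions. Can anyone suggest a more professional approach?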

I think I should restate the question in a more general way. Given the following data frame:

Timestamp, State1, x, y
50.1739997864,a,497,580
50.7269999981,a,495,581
51.8219997883,a,569,617
51.8229999542,b,633,642
52.0539999008,b,654,650
52.0539999008,a,719,666
52.0539999008,a,761,666
52.1819999218,b,763,663
52.5659999847,c,763,659
52.6779999733,b,778,658
52.893999815,a,783,656
52.8949999809,a,799,650
53.0549998283,b,800,649
53.2349998951,a,805,645
53.2349998951,b,805,645
53.3509998322,b,807,644

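How can I get answers to the following questions?
- Which subsets consist of consecutive rows with State1 == "a"?
- Which subsets have a start row and an end row with State1 == "a" and at least one row with State1 != "a" between them?
- What is the elapsed time / Euclidean distance between two adjacent rows where the first row has State1 != "a" and the second has State1 == "b"?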

1 answer:

Answer 0 (score: 0)

A loop would work, but it may be slow. There is a faster, vectorized way, for example:

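df$button[-nrow(df)] == "Left" & df$state[-nrow(df)] == "Pressed" &
  df$button[-1] == "Left" & df$state[-1] == "Released"

This yields a logical vector that is TRUE wherever a Left/Pressed row is immediately followed by a Left/Released row (button and state are separate columns in the data, so each is compared on its own).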
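To make the shifted-vector comparison concrete, here is a minimal, self-contained sketch. It assumes the data frame is called df and has the column names from the question; the seven rows are copied from the question's data, and the names is_press, is_release, click_start and clicks are only illustrative.

```r
# A few rows copied from the question's data (the selection is arbitrary)
df <- read.csv(text = "Timestamp1,Timestamp2,button,state,x,y
49.8709998131,49.7949999999,NoButton,Move,498,580
49.8709998131,49.8730000001,Left,Pressed,498,580
49.9659998417,49.983,Left,Released,498,580
50.1739997864,50.1850000001,NoButton,Move,497,580
53.2349998951,53.2429999998,Left,Pressed,805,645
53.3509998322,53.2590000001,NoButton,Drag,807,644
53.3509998322,53.352,Left,Released,807,644",
  stringsAsFactors = FALSE)

n <- nrow(df)
is_press   <- df$button == "Left" & df$state == "Pressed"
is_release <- df$button == "Left" & df$state == "Released"

# Element i is TRUE when row i is a press and row i + 1 is a release
click_start <- is_press[-n] & is_release[-1]

# Collect each click as a (press row, release row) pair with its duration
press_rows <- which(click_start)
clicks <- data.frame(
  press_row   = press_rows,
  release_row = press_rows + 1L,
  duration    = df$Timestamp1[press_rows + 1L] - df$Timestamp1[press_rows]
)
print(clicks)
```

In this sample only rows 2 and 3 form a plain click. The press in row 5 is followed by a Drag row, so it is not matched.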

Similar code works for drag and drop. For mouse movements, rle() may be useful.
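One way the rle() idea could be sketched, assuming the same column names as in the question: paste button and state into a single event label, run-length encode it, and read off where each run starts and ends. The names event, runs, moves and drags are illustrative, and defining a drag and drop as "a run of Drag rows directly between a press and a release" is an assumption about what the asker wants.

```r
# A few rows copied from the question's data (the selection is arbitrary)
df <- read.csv(text = "Timestamp1,Timestamp2,button,state,x,y
52.8949999809,52.8999999999,NoButton,Move,799,650
53.0549998283,53.0559999999,NoButton,Move,800,649
53.2349998951,53.2429999998,NoButton,Move,805,645
53.2349998951,53.2429999998,Left,Pressed,805,645
53.3509998322,53.2590000001,NoButton,Drag,807,644
53.3509998322,53.352,Left,Released,807,644
53.8619999886,53.8670000001,NoButton,Move,808,644
53.9739999771,53.9759999998,NoButton,Move,809,645",
  stringsAsFactors = FALSE)

# One label per row, then run-length encode consecutive identical labels
event  <- paste(df$button, df$state, sep = ",")
runs   <- rle(event)
ends   <- cumsum(runs$lengths)          # last row of each run
starts <- ends - runs$lengths + 1L      # first row of each run

# Pure movement: every run of consecutive NoButton/Move rows
is_move <- runs$values == "NoButton,Move"
moves <- data.frame(start = starts[is_move], end = ends[is_move],
                    n_rows = runs$lengths[is_move])

# Drag and drop: a Drag run whose neighbouring runs are a press and a release
k        <- length(runs$values)
prev_val <- c(NA, runs$values[-k])
next_val <- c(runs$values[-1], NA)
is_drag  <- runs$values == "NoButton,Drag" &
  prev_val %in% "Left,Pressed" & next_val %in% "Left,Released"
drags <- data.frame(press_row   = starts[is_drag] - 1L,
                    release_row = ends[is_drag] + 1L)

# Elapsed time and Euclidean distance between each pair of adjacent rows
step_time <- diff(df$Timestamp1)
step_dist <- sqrt(diff(df$x)^2 + diff(df$y)^2)

print(moves)
print(drags)
```

The same pattern should carry over to the generalized State1 example: run rle() on df$State1 to find the runs of "a", and index step_time and step_dist with a logical vector built from the shifted comparison to pick out particular row pairs.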