Question

How can I search within chronologically ordered groups and count whether one event occurred before another?

My data looks like this:

id   | flow_nme | prod | RowFilter |
'20' | A2       | 1    | 1         |
'20' | A3       | 1    | 2         |
'30' | A3       | 1    | 1         |
'30' | A2       | 1    | 2         |
'40' | C1       | 1    | 1         |
'40' | C2       | 1    | 2         |
'40' | A3       | 1    | 3         |
'40' | A2       | 1    | 4         |

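RowFilter holds the chronological order within each id group. The names A2, A3, C1 and C2 have no real meaning; they are dummy names. RowFilter was created from a timestamp and indicates the order in which the events occurred. The only thing that determines the order in which the events "should" have occurred is the normal process flow. Essentially, I want to count the times the normal flow was not followed.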

So I want to count, across ids, the instances where A3 occurred before A2 and where C2 occurred before A3.

My expected output would look like this:

type  | count |
A3-A2 | 2     |
C2-A3 | 1     |

I have tried using OVER and PARTITION BY, but I must be doing something wrong.

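I will post code to mock the data as soon as I have time, so that the problem is easy to reproduce. I am more familiar with R than with SQL, so it is not immediately obvious to me what that code would look like.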

Answer 1

In SQL Server 2012+ (required for lead() and concat()).

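Use a common table expression (a derived table would work as well) with lead() to find next_flow_nme and concat() to join the two flow_nme values into type, filtered by where flow_nme > next_flow_nme. Then group by type and count(*). Note that the filter relies on the flow names sorting alphabetically in their normal order.

;with cte as (
select id, flow_nme, prod, rowfilter
  -- partition by id so the last row of one id is never paired with the first row of the next id
  , next_flow_nme = lead(flow_nme) over (partition by id order by rowfilter)
from t
)
select 
    type = concat(flow_nme,'-',next_flow_nme)
  , [count]=count(*)
from cte
where flow_nme > next_flow_nme
group by concat(flow_nme,'-',next_flow_nme)

rextester demo: http://rextester.com/NHW71132

For the sample data in the question, this returns:

type  | count |
A3-A2 | 2     |
C2-A3 | 1     |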
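For readers without a SQL Server instance, the lead() approach above can be tried with a minimal sketch using Python's built-in sqlite3 module. This is an illustration under assumptions, not the answer's own code: it assumes SQLite 3.25 or later (for window functions), uses the `||` operator in place of concat(), partitions by id so that rows from different ids are never paired, and the table name `t`, the function name `count_out_of_order` and the alias `cnt` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, flow_nme, prod, RowFilter).
ROWS = [
    ("20", "A2", 1, 1), ("20", "A3", 1, 2),
    ("30", "A3", 1, 1), ("30", "A2", 1, 2),
    ("40", "C1", 1, 1), ("40", "C2", 1, 2),
    ("40", "A3", 1, 3), ("40", "A2", 1, 4),
]

# Pair each event with the next event of the same id, then keep only
# pairs whose names are in descending alphabetical order.
QUERY = """
with cte as (
  select id, flow_nme,
         lead(flow_nme) over (partition by id order by RowFilter) as next_flow_nme
  from t
)
select flow_nme || '-' || next_flow_nme as type, count(*) as cnt
from cte
where flow_nme > next_flow_nme
group by flow_nme || '-' || next_flow_nme
order by type
"""

def count_out_of_order(rows):
    """Load rows into an in-memory table and return (type, count) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (id text, flow_nme text, prod integer, RowFilter integer)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

result = count_out_of_order(ROWS)
print(result)
```

Because the window is partitioned by id, the last event of one id is not compared with the first event of the next id.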

Answer 2

Here is one method, assuming each flow_nme appears at most once per id. This puts the values in separate columns:

-- the column is named RowFilter in the question's table
select sum(case when rf_a3 < rf_a2 then 1 else 0 end) as a3_a2,
       sum(case when rf_c2 < rf_a3 then 1 else 0 end) as c2_a3
from (select id,
             min(case when flow_nme = 'A3' then RowFilter end) as rf_a3,
             min(case when flow_nme = 'A2' then RowFilter end) as rf_a2,
             min(case when flow_nme = 'C2' then RowFilter end) as rf_c2
      from t
      group by id
     ) t;

You can unpivot this if you like. I like using apply for this purpose:

-- the column is named RowFilter in the question's table
select v.*
from (select sum(case when rf_a3 < rf_a2 then 1 else 0 end) as a3_a2,
             sum(case when rf_c2 < rf_a3 then 1 else 0 end) as c2_a3
      from (select id,
                   min(case when flow_nme = 'A3' then RowFilter end) as rf_a3,
                   min(case when flow_nme = 'A2' then RowFilter end) as rf_a2,
                   min(case when flow_nme = 'C2' then RowFilter end) as rf_c2
            from t
            group by id
           ) t
     ) t cross apply
     (values ('a3_a2', a3_a2), ('c2_a3', c2_a3)) v(type, count);
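The conditional-aggregation query can also be checked against the question's sample data with a small sketch using Python's built-in sqlite3 module. The assumptions here are mine rather than the answer's: SQLite stands in for SQL Server, the column is called RowFilter as in the question, cross apply is not available in SQLite so the unpivot step is done in Python, and the names `first_position_counts` and `per_id` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, flow_nme, prod, RowFilter).
ROWS = [
    ("20", "A2", 1, 1), ("20", "A3", 1, 2),
    ("30", "A3", 1, 1), ("30", "A2", 1, 2),
    ("40", "C1", 1, 1), ("40", "C2", 1, 2),
    ("40", "A3", 1, 3), ("40", "A2", 1, 4),
]

# Inner query: first position of each event per id (NULL if absent).
# Outer query: count the ids where the positions are in the wrong order.
QUERY = """
select sum(case when rf_a3 < rf_a2 then 1 else 0 end) as a3_a2,
       sum(case when rf_c2 < rf_a3 then 1 else 0 end) as c2_a3
from (select id,
             min(case when flow_nme = 'A3' then RowFilter end) as rf_a3,
             min(case when flow_nme = 'A2' then RowFilter end) as rf_a2,
             min(case when flow_nme = 'C2' then RowFilter end) as rf_c2
      from t
      group by id
     ) per_id
"""

def first_position_counts(rows):
    """Return the single result row as a {column_name: count} dict."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (id text, flow_nme text, prod integer, RowFilter integer)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    cur = conn.execute(QUERY)
    names = [col[0] for col in cur.description]
    values = cur.fetchone()
    conn.close()
    # Zipping column names with values plays the role of the unpivot step.
    return dict(zip(names, values))

counts = first_position_counts(ROWS)
print(counts)
```

This version compares the first position of each event within an id, so it also counts A3 before A2 when other events sit between them, whereas a lead()-based query only looks at adjacent rows.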

Counting the number of events that occurred in the wrong order
