Need a simple query to calculate sequence lengths in SQL Server

Time: 2018-11-13 15:51:28

Tags: sql sql-server

I have this view, which represents the status of each user's connections to a system, as shown below:

---------------------------------------
|id |   date     | User  |  Connexion |
|1  | 01/01/2018 |  A    |      1     |
|2  | 02/01/2018 |  A    |      0     |
|3  | 03/01/2018 |  A    |      1     |
|4  | 04/01/2018 |  A    |      1     |
|5  | 05/01/2018 |  A    |      0     |
|6  | 06/01/2018 |  A    |      0     |
|7  | 07/01/2018 |  A    |      0     |
|8  | 08/01/2018 |  A    |      1     |
|9  | 09/01/2018 |  A    |      1     |
|10 | 10/01/2018 |  A    |      1     |
|11 | 11/01/2018 |  A    |      1     |
---------------------------------------

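The target output is the length of each run of consecutive successful or failed connections, ordered by date, so it would look like this: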

---------------------------------------------------------------
|StartDate   |   EndDate     |   User   | Connexion |  Length |
|01/01/2018  |   01/01/2018  |     A    |    1      |      1  |
|02/01/2018  |   02/01/2018  |     A    |    0      |      1  |
|03/01/2018  |   04/01/2018  |     A    |    1      |      2  |
|05/01/2018  |   07/01/2018  |     A    |    0      |      3  |
|08/01/2018  |   11/01/2018  |     A    |    1      |      4  |
---------------------------------------------------------------

1 answer:

Answer 0: (score: 3)

This is known as a "gaps and islands" problem. The best solution for your version of it is a difference of row numbers:

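-- [user] and [date] are bracketed because user is a reserved word in SQL Server
select [user], min([date]) as StartDate, max([date]) as EndDate, connexion, count(*) as length
from (select t.*,
             -- position of each row among all of the user's rows
             row_number() over (partition by [user] order by [date]) as seqnum,
             -- position among the user's rows that share the same connexion value
             row_number() over (partition by [user], connexion order by [date]) as seqnum_uc
      from t
     ) t
group by [user], connexion, (seqnum - seqnum_uc)
order by [user], StartDate;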

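Why this works is a bit hard to explain. Generally, I find that if you stare at the results of the subquery, you will see how the difference is constant for the groups you care about. Within a run of consecutive rows that share the same connexion value, both row numbers increase by one on every row, so their difference does not change; when the value flips, only seqnum keeps advancing for the rows in between, so the next run of that value gets a different difference.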
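To make that concrete, here is a minimal sketch that runs only the subquery against the sample rows from the question. The inline `values` list, the helper column `d`, the second CTE `s`, and the output column `grp` are illustrative names that are not part of the original answer, and the dates are assumed to be day/month/year (1 to 11 January 2018).

```sql
-- Sample rows from the question, built inline so the query runs on its own.
-- Assumption: the dates are dd/mm/yyyy, i.e. 1 to 11 January 2018.
with t as (
    select id, cast(d as date) as [date], 'A' as [user], connexion
    from (values
        (1,  '20180101', 1), (2,  '20180102', 0), (3,  '20180103', 1),
        (4,  '20180104', 1), (5,  '20180105', 0), (6,  '20180106', 0),
        (7,  '20180107', 0), (8,  '20180108', 1), (9,  '20180109', 1),
        (10, '20180110', 1), (11, '20180111', 1)
    ) v (id, d, connexion)
),
s as (
    select t.*,
           row_number() over (partition by [user] order by [date]) as seqnum,
           row_number() over (partition by [user], connexion order by [date]) as seqnum_uc
    from t
)
select id, connexion, seqnum, seqnum_uc, seqnum - seqnum_uc as grp
from s
order by [date];

-- id  connexion  seqnum  seqnum_uc  grp
--  1      1         1        1       0
--  2      0         2        1       1
--  3      1         3        2       1
--  4      1         4        3       1
--  5      0         5        2       3
--  6      0         6        3       3
--  7      0         7        4       3
--  8      1         8        4       4
--  9      1         9        5       4
-- 10      1        10        6       4
-- 11      1        11        7       4
```

Note that row 2 (connexion 0) and rows 3 and 4 (connexion 1) all have a difference of 1. The difference alone does not identify a run, which is why the outer query groups by connexion as well as by the difference.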

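Note: avoid user and date as column names. Both are keywords in SQL (of one type or another). If you do use them, you have to clutter your SQL with escape characters, which only makes the code harder to write, read, and debug.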
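As a rough illustration of that advice, the same query might look like this if the view's columns were renamed. The names `user_name` and `conn_date` are hypothetical and do not exist in the original view.

```sql
-- Hypothetical column names: user_name and conn_date replace user and date,
-- so no escape characters are needed anywhere in the query.
select user_name,
       min(conn_date) as start_date,
       max(conn_date) as end_date,
       connexion,
       count(*) as length
from (select t.*,
             row_number() over (partition by user_name order by conn_date) as seqnum,
             row_number() over (partition by user_name, connexion order by conn_date) as seqnum_uc
      from t
     ) t
group by user_name, connexion, (seqnum - seqnum_uc)
order by user_name, start_date;
```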