SQL Server:为行分配相同的行号,直到满足条件

时间:2018-03-21 21:39:47

标签: sql sql-server tsql row-number

我有以下数据集:

    ukey          id          code                    create_date       
     1           1082         9053             2018-03-01 23:18:51.0000000
     2           1082         9035             2018-03-01 23:19:21.0000000
     3           1082         9053             2018-03-01 23:22:55.0000000
     4           1082         9535             2018-03-01 23:23:30.0000000
     5           1196         3145             2018-03-05 07:27:15.0000000
     6           1196         3162             2018-03-05 07:27:50.0000000
     7           1196         3175             2018-03-05 07:28:24.0000000
     8           1196         3235             2018-03-05 07:28:57.0000000
     9           1196         3295             2018-03-05 07:29:31.0000000
     10          1196         3448             2018-03-05 07:30:04.0000000
     11          1196         3465             2018-03-05 07:30:37.0000000
     12          1196         4265             2018-03-05 07:31:09.0000000
     13          1196         3495             2018-03-05 17:13:13.0000000
     14           473          551             2018-03-02 16:43:52.0000000
     15           473          590             2018-03-02 16:44:30.0000000
     16           473          835             2018-03-02 16:45:02.0000000

我希望此数据集中有另一列,例如RN。在满足以下两个条件之前,RN列应具有相同的值:

  • 如果id不同,或
  • 如果当前行与上一行之间的datediff超过60秒

所以最终的数据集应如下所示:

    ukey          id          code                    create_date                RN
     1           1082         9053             2018-03-01 23:18:51.0000000        1
     2           1082         9035             2018-03-01 23:19:21.0000000        1
     3           1082         9053             2018-03-01 23:22:55.0000000        1
     4           1082         9535             2018-03-01 23:23:30.0000000        1
     5           1196         3145             2018-03-05 07:27:15.0000000        2
     6           1196         3162             2018-03-05 07:27:50.0000000        2
     7           1196         3175             2018-03-05 07:28:24.0000000        2
     8           1196         3235             2018-03-05 07:28:57.0000000        2
     9           1196         3295             2018-03-05 07:29:31.0000000        2
     10          1196         3448             2018-03-05 07:30:04.0000000        2
     11          1196         3465             2018-03-05 07:30:37.0000000        2
     12          1196         4265             2018-03-05 07:31:09.0000000        2
     13          1196         3495             2018-03-05 17:13:13.0000000        3
     14           473          551             2018-03-02 16:43:52.0000000        4
     15           473          590             2018-03-02 16:44:30.0000000        4
     16           473          835             2018-03-02 16:45:02.0000000        4

请注意RN从2更改为3. id相同,但datediff超过60秒。

到目前为止,我已尝试过此查询。我的原始数据集位于dbo.myTable

select 
        a.id as A_ID,
        a.code as A_Code,
        a.create_date as A_CreateDate,
        a.RN as A_RN,
        _.id as _ID,
        _.code as _Code,
        _.create_date as _CreateDate,
        _.RN as _RN
    from myTable a
    cross apply
    (
        select * from myTable b
        where a.id=b.id and (a.ukey=b.ukey-1 or a.ukey=1) and datediff(ss,a.create_date,b.create_date)<60
    )_
    order by a.id, a.create_date

此查询未给出预期结果。我的想法是将每一行与前一行进行比较,并检查这两个条件。

1 个答案:

答案 0 :(得分:3)

这听起来像你想要lag()和累积总和:

select t.*,
       sum(case when prev_create_date > dateadd(second, -60, create_date)
                then 0 else 1
           end) over
           (order by ukey) as rn
from (select t.*,
             lag(create_date) over (partition by id order by create_date) as prev_create_date
      from t
     ) t;