Find the sequence number for a pattern

Time: 2017-02-01 06:27:03

Tags: sql sql-server tsql

Consider the following table:

ActivityId      Flag       Type
----------      -----      -----
1               N                              
2               N 
3               Y           EXT
4               Y
5               Y           
6               N
7               Y           INT
8               Y      
9               N
10              N
11              N
12              Y           EXT
13              N             
14              N
15              N
16              Y           EXT
17              Y           
18              Y           INT
19              Y           
20              Y           EXT
21              Y
22              N
23              N      

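The first record always has Flag = N. After that, the records can follow any sequence of Flag = Y or Flag = N. Every time the flag changes from N to Y, the Type field is either EXT or INT. The following Y records (before the next N) may have Type = EXT, INT, or NULL; that does not matter.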

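I want to compute a Cycle No for this N/Y sequence. The first cycle starts at Flag = N (the first record always has Flag = N) and ends when the flag changes to Y with Type = EXT. The next cycle then starts when the flag changes to N and ends when the flag becomes Y with Type = EXT. This repeats until all records are processed. For the table above, the result is: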

Result

ActivityId      Flag       Type      Cycle No
----------      -----      -----     --------
1               N                       1       
2               N                       1
3               Y           EXT         1
4               Y
5               Y           
6               N                       2
7               Y           INT         2
8               Y                       2
9               N                       2
10              N                       2
11              N                       2
12              Y           EXT         2
13              N                       3
14              N                       3 
15              N                       3
16              Y           EXT         3
17              Y           
18              Y           INT
19              Y           
20              Y           EXT
21              Y
22              N                       4
23              N                       4 

I am using SQL Server 2008 R2 (no LAG / LEAD). Can you help me find a SQL query that computes Cycle No?

2 answers:

Answer 0: (score: 3)

I have a solution. It is not pretty, but through stepwise refinement I got your result.

The solution has three steps:

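  1. Isolate the ActivityId of every row that could start or end a cycle.
  2. Filter out all the dud start events.
  3. Number the cycles and find the start and end ActivityId of each cycle.

First, I select all the cycle start and end events: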

    with tab as
    (select * from (values
    (1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
    ,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
    ,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
    ,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
    ,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
    ,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
    
    ,CTE1 as
    ( select
        ROW_NUMBER() over (order by t1.ActivityId) rn
        ,t1.ActivityId
        ,case when t1.Flag='N' then 'Start' else 'End' end Cycle
    from tab t1
    where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='EXT')
    )
    select * from cte1
    

This returns:

    rn  ActivityId  Cycle
    1   1   Start
    2   2   Start
    3   3   End
    4   6   Start
    5   9   Start
    6   10  Start
    7   11  Start
    8   12  End
    9   13  Start
    10  14  Start
    11  15  Start
    12  16  End
    13  20  End
    14  22  Start
    15  23  Start
    

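The problem now is that while we know when a cycle ends, namely when Flag is Y and Type is EXT, we do not yet know when a cycle starts. Rows 1 and 2 both represent possible start events. Fortunately, apart from the very first row, only a Start event that directly follows an End event counts. Since we have no LAG or LEAD, we have to join the CTE to itself: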

    ,CTE2 as
    (
    select ROW_NUMBER() over (order by a1.activityid) rn
           ,a1.ActivityId
           ,a1.Cycle
           ,a2.Cycle PrevCycle
    from  CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1 
    where 
    
        a2.Cycle is null -- First Cycle
        or
        ( a2.Cycle is not null
        and
        ( 
            (a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
            or 
            (a1.Cycle='Start'  
             and a2.Cycle='End') -- next cycles
        )
        )
        )
    select * from cte2
    

This returns:

    rn  ActivityId  Cycle   PrevCycle
    1   1   Start   NULL
    2   3   End Start
    3   6   Start   End
    4   12  End Start
    5   13  Start   End
    6   16  End Start
    7   22  Start   End
    

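I keep the first Start event, because we always begin with one, and then keep each End event that follows a Start event. Finally, of the remaining Start events, we keep only those whose previous event is an End event.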
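
The self-join on `rn` is the usual stand-in for LAG on versions before SQL Server 2012, where that function first appeared. As a minimal sketch on made-up values (the names `s`, `v` and `PrevV` are illustrative and not part of the solution above):

```sql
-- Emulate LAG(v): number the rows, then join each row to the one before it.
with s as
( select ROW_NUMBER() over (order by x.v) rn, x.v
  from (values (10),(20),(40)) x(v)
)
select cur.v
      ,prev.v PrevV            -- NULL for the first row, which has no predecessor
from s cur
left join s prev on prev.rn = cur.rn - 1
order by cur.rn
```

The first row has no predecessor, so its `PrevV` comes back as NULL. CTE2 relies on the same property when it tests `a2.Cycle is null` to detect the first cycle.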

Now we can find the start and end of each cycle and number them:

    ,cte3 as
    (
    select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
        ,b1.ActivityId StartId,b2.ActivityId EndId
    from cte2 b1 left join cte2 b2
    on b1.rn=b2.rn-1
    where b1.Cycle='Start'
    )
    select * from cte3
    

This gives us what we need:

    CycleNumber StartId EndId
    1   1   3
    2   6   12
    3   13  16
    4   22  NULL
    

Now we just join this back to our table:

    select
    a.ActivityId,a.Flag,a.[Type],CycleNumber
    from tab a
    left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
    

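This gives the result you are looking for.

This is only a quick and dirty solution; with a little TLC it could probably be tidied up and done in fewer steps.

The complete solution is here: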

    with tab as
    (select * from (values
    (1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
    ,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
    ,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
    ,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
    ,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
    ,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
    
    ,CTE1 as
    ( select
        ROW_NUMBER() over (order by t1.ActivityId) rn
        ,t1.ActivityId
        ,case when t1.Flag='N' then 'Start' else 'End' end Cycle
    from tab t1
    where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='EXT')
    )
    ,CTE2 as
    (
    select ROW_NUMBER() over (order by a1.activityid) rn
           ,a1.ActivityId
           ,a1.Cycle
           ,a2.Cycle PrevCycle
    from  CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1 
    where 
    
        a2.Cycle is null -- First Cycle
        or
        ( a2.Cycle is not null
        and
        ( 
            (a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
            or 
            (a1.Cycle='Start'  
             and a2.Cycle='End') -- next cycles
        )
        )
        )
    
    ,cte3 as
    (
    select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
        ,b1.ActivityId StartId,b2.ActivityId EndId
    from cte2 b1 left join cte2 b2
    on b1.rn=b2.rn-1
    where b1.Cycle='Start'
    )
    
    select
    a.ActivityId,a.Flag,a.[Type],CycleNumber
    from tab a
    left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
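
On the remark about reducing the number of steps: one possible shortening, offered as a sketch rather than a tested replacement, is to look up the previous start/end event with OUTER APPLY (available in SQL Server 2008 R2) and count the closed cycles directly. The names `ev`, `PrevFlag` and `CycleNo` are introduced here for illustration. The sketch assumes that ActivityId defines the row order and that a Y/EXT row closes a cycle only when the closest earlier event row is an N row.

```sql
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))

,ev as
( -- For every row, fetch the Flag of the closest earlier event row
  -- (an N row or a Y/EXT row). PrevFlag is NULL for the very first row.
  select t.ActivityId, t.Flag, t.[Type], p.PrevFlag
  from tab t
  outer apply
  ( select top (1) e.Flag PrevFlag
    from tab e
    where e.ActivityId < t.ActivityId
      and (e.Flag='N' or (e.Flag='Y' and e.[Type]='EXT'))
    order by e.ActivityId desc
  ) p
)
select r.ActivityId, r.Flag, r.[Type]
    -- A row is inside a cycle if it is an N row or the last event before it was an N row.
    ,case when r.Flag='N' or r.PrevFlag='N'
          then 1 + (select count(*)          -- cycles already closed before this row
                    from ev c
                    where c.ActivityId < r.ActivityId
                      and c.Flag='Y' and c.[Type]='EXT' and c.PrevFlag='N')
     end CycleNo                              -- NULL for rows outside any cycle
from ev r
order by r.ActivityId
```

Traced by hand against the sample data, this assigns cycles 1 to 4 to the same rows as the query above and leaves rows 4, 5 and 17 to 21 as NULL. The correlated lookups run once per row, so on a large table the multi-step version may well perform better.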
    

Answer 1: (score: 1)

If you are comfortable with recursion, and your ActivityId values are ordered, you can do this with some comparison logic against the previous row:

declare @t table(ActivityId int,Flag nvarchar(1),TypeValue nvarchar(3));
insert into @t values(1 ,'N',null),(2 ,'N',null),(3 ,'Y','EXT'),(4 ,'Y',null),(5 ,'Y',null),(6 ,'N',null),(7 ,'Y','INT'),(8 ,'Y',null),(9 ,'N',null),(10,'N',null),(11,'N',null),(12,'Y','EXT'),(13,'N',null),(14,'N',null),(15,'N',null),(16,'Y','EXT'),(17,'Y',null),(18,'Y','INT'),(19,'Y',null),(20,'Y','EXT'),(21,'Y',null),(22,'N',null),(23,'N',null);

with rn as    -- Derived table purely to guarantee incremental row number.  If you can guarantee your ActivityId values are incremental start to finish, this isn't required.
(   select row_number() over (order by ActivityId) as rn
            ,ActivityId
            ,Flag
            ,TypeValue
    from @t
),d as
(   select rn               -- Recursive CTE that compares the current row to the one previous.
            ,ActivityId
            ,Flag
            ,TypeValue
            ,cast(1 as decimal(10,5)) as CycleNo
    from rn
    where rn = 1

    union all

    select rn.rn
            ,rn.ActivityId
            ,rn.Flag
            ,rn.TypeValue
            ,cast(
                case when d.Flag = 'Y' and d.TypeValue = 'EXT' and d.CycleNo >= 1
                        then case when rn.Flag = 'N'
                                    then d.CycleNo + 1
                                    else (d.CycleNo + 1) * 0.0001    -- This part keeps track of the cycle number in fractional values, which can be removed by converting the final result to INT.
                                    end
                        else case when rn.Flag = 'N' and d.CycleNo < 1
                                    then d.CycleNo * 10000
                                    else d.CycleNo
                                    end
                        end
            as decimal(10,5)) as CycleNo
    from rn
        inner join d
            on d.rn = rn.rn - 1
)
select ActivityId
    ,Flag
    ,TypeValue
    ,cast(CycleNo as int) as CycleNo
from d
order by ActivityId;

Output:

+------------+------+-----------+---------+
| ActivityId | Flag | TypeValue | CycleNo |
+------------+------+-----------+---------+
|          1 | N    | NULL      |       1 |
|          2 | N    | NULL      |       1 |
|          3 | Y    | EXT       |       1 |
|          4 | Y    | NULL      |       0 |
|          5 | Y    | NULL      |       0 |
|          6 | N    | NULL      |       2 |
|          7 | Y    | INT       |       2 |
|          8 | Y    | NULL      |       2 |
|          9 | N    | NULL      |       2 |
|         10 | N    | NULL      |       2 |
|         11 | N    | NULL      |       2 |
|         12 | Y    | EXT       |       2 |
|         13 | N    | NULL      |       3 |
|         14 | N    | NULL      |       3 |
|         15 | N    | NULL      |       3 |
|         16 | Y    | EXT       |       3 |
|         17 | Y    | NULL      |       0 |
|         18 | Y    | INT       |       0 |
|         19 | Y    | NULL      |       0 |
|         20 | Y    | EXT       |       0 |
|         21 | Y    | NULL      |       0 |
|         22 | N    | NULL      |       4 |
|         23 | N    | NULL      |       4 |
+------------+------+-----------+---------+
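
If the fractional encoding is hard to follow, the same row-by-row walk can carry its state in two explicit columns instead. The following is a sketch along those lines, not the query from the answer; `InCycle` is a column name introduced here. It returns NULL rather than 0 for rows outside a cycle, and it adds `option (maxrecursion 0)` because a recursive CTE stops with an error after 100 levels by default, which matters once the table has more than about 100 rows.

```sql
declare @t table(ActivityId int,Flag nvarchar(1),TypeValue nvarchar(3));
insert into @t values(1 ,'N',null),(2 ,'N',null),(3 ,'Y','EXT'),(4 ,'Y',null),(5 ,'Y',null),(6 ,'N',null)
    ,(7 ,'Y','INT'),(8 ,'Y',null),(9 ,'N',null),(10,'N',null),(11,'N',null),(12,'Y','EXT')
    ,(13,'N',null),(14,'N',null),(15,'N',null),(16,'Y','EXT'),(17,'Y',null),(18,'Y','INT')
    ,(19,'Y',null),(20,'Y','EXT'),(21,'Y',null),(22,'N',null),(23,'N',null);

with rn as    -- Gap-free row numbers so each row can find its predecessor.
(   select row_number() over (order by ActivityId) as rn
            ,ActivityId
            ,Flag
            ,TypeValue
    from @t
),d as
(   select rn               -- Anchor: the first row always opens cycle 1.
            ,ActivityId
            ,Flag
            ,TypeValue
            ,1 as CycleNo
            ,cast(1 as bit) as InCycle
    from rn
    where rn = 1

    union all

    select rn.rn
            ,rn.ActivityId
            ,rn.Flag
            ,rn.TypeValue
            -- An N row opens a new cycle if the previous row was outside a cycle or closed one.
            ,case when rn.Flag = 'N' and (d.InCycle = 0 or (d.Flag = 'Y' and d.TypeValue = 'EXT'))
                    then d.CycleNo + 1
                    else d.CycleNo
                    end
            -- N rows are always in a cycle; a Y row drops out once the cycle has been closed.
            ,cast(case when rn.Flag = 'N' then 1
                       when d.InCycle = 0 or (d.Flag = 'Y' and d.TypeValue = 'EXT') then 0
                       else 1
                       end as bit)
    from rn
        inner join d
            on d.rn = rn.rn - 1
)
select ActivityId
    ,Flag
    ,TypeValue
    ,case when InCycle = 1 then CycleNo end as CycleNo    -- NULL outside a cycle
from d
order by ActivityId
option (maxrecursion 0);    -- lift the default cap of 100 recursion levels
```

Traced by hand against the sample rows, the cycle numbers match the output above, except that rows 4, 5 and 17 to 21 come back as NULL instead of 0, which is closer to the blank cells in the question.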