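Query to find ranges of consecutive rows

Time: 2015-03-26 17:50:11

Tags: sql gaps-and-islands set-based

My file contains a dump of a SQL table with two columns: an int ID (an auto-increment identity field) and a bit Flag. Flag = 0 means the record is good; Flag = 1 means the record is bad (contains errors). The goal is to find all blocks of consecutive bad records (Flag = 1) that contain 1,000 or more rows. The solution must not use cursors or WHILE loops; it should use only set-based queries (SELECT, JOIN, and so on). We would like to see the actual query used and the results in the following format: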

StartID – EndID    NumberOfErrorsInTheBlock
StartID – EndID    NumberOfErrorsInTheBlock
……………………….
StartID – EndID    NumberOfErrorsInTheBlock

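For example, if our data had only 30 records and we were looking for blocks of 5 or more records, the result would look like this (the original post included a screenshot highlighting the qualifying error blocks):

[ID range] ..... [Number of errors in the block]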

11-15 ..... 5

19-25 ..... 7

sql file containing sample rows, dropbox
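
The expected output for the 30-row example can be cross-checked with a small procedural reference. This is only a sketch for validating results, not a set-based answer to the question. The sample data is hypothetical: only the blocks 11-15 and 19-25 come from the example above, and the function name `error_blocks` is made up for illustration. It also assumes the rows are sorted by ID and that the IDs have no gaps.

```python
from itertools import groupby

def error_blocks(rows, min_len):
    """Return (start_id, end_id, count) for each run of flag == 1
    that has at least min_len rows. rows: (id, flag) pairs sorted by id."""
    blocks = []
    # groupby yields one group per run of consecutive rows with the same flag
    for flag, run in groupby(rows, key=lambda r: r[1]):
        run = list(run)
        if flag == 1 and len(run) >= min_len:
            blocks.append((run[0][0], run[-1][0], len(run)))
    return blocks

# Hypothetical 30-row sample: the short error block at IDs 3-4 is invented
# to show that blocks below the threshold are filtered out.
error_ids = set(range(3, 5)) | set(range(11, 16)) | set(range(19, 26))
sample = [(i, 1 if i in error_ids else 0) for i in range(1, 31)]

for start, end, count in error_blocks(sample, 5):
    print(f"{start}-{end} ..... {count}")
# prints:
# 11-15 ..... 5
# 19-25 ..... 7
```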

2 Answers:

Answer 0 (score: 0)

A T-SQL solution for SQL Server 2012 and later (it relies on LAG and LEAD)

IF OBJECT_ID('tempdb..#tbl_ranges') IS NOT NULL
    DROP TABLE #tbl_ranges;

CREATE TABLE #tbl_ranges
(
    row_num INT PRIMARY KEY,
    ID INT,
    Flag BIT,
    Label TINYINT
);

WITH cte_yourTable
AS
(
    SELECT  Id,
            Flag,
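            CASE
                -- 1 = first row of a block: flag differs from the previous row
                WHEN Flag != LAG(flag,1) OVER (ORDER BY ID) THEN 1
                -- 2 = interior row: same flag as both neighbours
                WHEN Flag = LAG(flag,1) OVER (ORDER BY ID) AND Flag = LEAD(flag,1) OVER (ORDER BY ID) THEN 2
                -- 3 = last row of a block: flag differs from the next row
                WHEN Flag = LAG(flag,1) OVER (ORDER BY ID) AND Flag != LEAD(flag,1) OVER (ORDER BY ID) THEN 3
                -- no branch matches when LAG is NULL, so the table's first row gets a NULL label
            END label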
    FROM yourTable
)

-- Keep only boundary rows and renumber them so each block start sits next to its end
INSERT INTO #tbl_ranges
    SELECT  ROW_NUMBER() OVER (ORDER BY ID) row_num,
            ID,
            Flag,
            label
    FROM cte_yourTable
    WHERE label != 2;

-- Pair each start row with the boundary row that follows it when both carry the same flag
SELECT  A.ID ID_start,
        B.ID ID_end,
        B.ID - A.ID range_cnt
FROM #tbl_ranges A
INNER JOIN #tbl_ranges B
ON A.row_num = B.row_num - 1
AND A.Flag = B.Flag;

IF OBJECT_ID('tempdb..#tbl_ranges') IS NOT NULL
    DROP TABLE #tbl_ranges;

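Abbreviated results (the query returns blocks for both flag values and applies no minimum size; range_cnt is ID_end - ID_start, which is one less than the number of rows when the IDs are contiguous):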

ID_start    ID_end      range_cnt
----------- ----------- -----------
2           3           1
5           8           3
9           10          1
11          35          24
36          356         320
357         358         1
359         406         47
...
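
A common variant of the boundary-marking idea uses LAG to flag the start of each block and a running SUM to number the blocks, after which a GROUP BY gives the start ID, end ID, and row count directly. The sketch below is not this answer's query: it swaps the temp table and self-join for a running total, restricts the output to error rows, and applies the minimum size. Python's built-in sqlite3 module stands in for SQL Server (window functions need SQLite 3.25 or later), and the table name `records`, the helper `find_blocks`, and the sample flags are all hypothetical.

```python
import sqlite3

QUERY = """
WITH marked AS (
    SELECT id, flag,
           -- 1 when the row starts a new block (first row, or flag changed)
           CASE WHEN flag = LAG(flag) OVER (ORDER BY id) THEN 0 ELSE 1 END AS is_start
    FROM records
),
numbered AS (
    SELECT id, flag,
           -- running total of block starts gives every row its block number
           SUM(is_start) OVER (ORDER BY id) AS block_no
    FROM marked
)
SELECT MIN(id) AS start_id, MAX(id) AS end_id, COUNT(*) AS row_count
FROM numbered
WHERE flag = 1
GROUP BY block_no
HAVING COUNT(*) >= ?
ORDER BY start_id
"""

def find_blocks(flags, min_len):
    """Load flags as rows with IDs 1..n and return the error blocks of at least min_len rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, flag INTEGER NOT NULL)")
    conn.executemany("INSERT INTO records (id, flag) VALUES (?, ?)",
                     enumerate(flags, start=1))
    rows = conn.execute(QUERY, (min_len,)).fetchall()
    conn.close()
    return rows

# Made-up sample with an error block that begins on the very first row
print(find_blocks([1, 1, 0, 1, 1, 1, 0, 0, 1], 2))
# prints: [(1, 2, 2), (4, 6, 3)]
```

Because the count comes from COUNT(*) rather than from subtracting IDs, a block's size is reported exactly, and a block that starts on the first row of the table is still picked up.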

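Answer 1 (score: 0)

This solution uses no temporary table; it chains two CTEs, the second reading from the first. Among the error rows, ID minus ROW_NUMBER() stays constant within a run of consecutive IDs, so grouping on that difference yields one row per block (this assumes the IDs themselves have no gaps).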

WITH Evaluation (ID, Flag, Evaluate)
AS
(
    -- Error rows only: ID minus its rank among error rows is constant within a consecutive run
    SELECT ID, Flag, Evaluate = ID - ROW_NUMBER() OVER (ORDER BY Flag, ID)
    FROM [dbo].[SqltestRecordsNew]
    WHERE Flag = 1
),
Evaluation_Final (StartingRecordID, EndRecordID, Flag, cnt)
AS
(
    -- One row per run: first ID, last ID and row count
    SELECT MIN(ID) AS StartingRecordID, MAX(ID) AS EndRecordID,
           Flag, cnt = COUNT(*)
    FROM Evaluation
    GROUP BY Evaluate, Flag
)
SELECT CONCAT(StartingRecordID, ' - ', EndRecordID) AS [StartingRecordID - EndRecordId],
       cnt AS GroupItemCnt
FROM Evaluation_Final
WHERE cnt >= 1000
-- Sort on the numeric start ID; sorting on the concatenated string would order the ranges as text
ORDER BY StartingRecordID;
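
To see why grouping on ID minus ROW_NUMBER() works, it helps to look at the intermediate values on a tiny table. The following is a minimal sketch using Python's built-in sqlite3 module in place of SQL Server (SQLite 3.25 or later for window functions); the table name `records`, the column alias `grp`, and the ten sample flags are hypothetical, and a threshold of 2 replaces the 1,000 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, flag INTEGER NOT NULL)")
flags = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]  # made-up data, IDs 1..10
conn.executemany("INSERT INTO records VALUES (?, ?)", enumerate(flags, start=1))

# Step 1: among error rows only, id - ROW_NUMBER() is the same for every
# row of a consecutive run and changes whenever a run is interrupted.
keyed = conn.execute("""
    SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
    FROM records
    WHERE flag = 1
    ORDER BY id
""").fetchall()
print(keyed)
# prints: [(2, 1), (3, 1), (4, 1), (7, 3), (8, 3), (10, 4)]

# Step 2: grouping on that difference collapses each run into one row.
blocks = conn.execute("""
    WITH keyed AS (
        SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
        FROM records
        WHERE flag = 1
    )
    SELECT MIN(id), MAX(id), COUNT(*)
    FROM keyed
    GROUP BY grp
    HAVING COUNT(*) >= 2
    ORDER BY MIN(id)
""").fetchall()
print(blocks)
# prints: [(2, 4, 3), (7, 8, 2)]
conn.close()
```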


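-- Spot checks on three ID ranges. 'Success' appears to mean the check passed
-- (the row is an error row, Flag = 1), not that the record itself is good.
-- Test results Case 1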
Select ID,Flag, 
Case when Flag=1 then 'Success'
     else 'Defect Data'
End as TestResults
from SqltestRecordsNew
where ID between 1494363 and 1495559

-- Test results Case 2
Select ID,Flag,
Case when Flag=1 then 'Success'
     else 'Defect Data'
End as TestResults from SqltestRecordsNew
where ID between 1498409 and 1503899

-- Test results Case 3
Select ID,Flag,
Case when Flag=1 then 'Success'
     else 'Defect Data'
End as TestResults from SqltestRecordsNew
where ID between 1548257 and 1550489
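
The spot checks above list every row of a range for visual inspection. The same check can be reduced to a yes/no answer, sketched here with Python's built-in sqlite3 module standing in for SQL Server. The helper `check_block`, the table name `records`, and the sample data are hypothetical, and the test that the neighbouring rows are not errors (so the block is not part of a longer one) is an addition that the original queries do not perform.

```python
import sqlite3

def check_block(conn, start_id, end_id):
    """Return True if every ID in [start_id, end_id] exists with flag = 1
    and the rows just outside the range (if present) are not error rows."""
    total, errors = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(flag = 1), 0) "
        "FROM records WHERE id BETWEEN ? AND ?",
        (start_id, end_id)).fetchone()
    edge_errors = conn.execute(
        "SELECT COUNT(*) FROM records WHERE id IN (?, ?) AND flag = 1",
        (start_id - 1, end_id + 1)).fetchone()[0]
    expected = end_id - start_id + 1
    return total == expected and errors == expected and edge_errors == 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, flag INTEGER NOT NULL)")
flags = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]  # made-up data, IDs 1..10
conn.executemany("INSERT INTO records VALUES (?, ?)", enumerate(flags, start=1))

print(check_block(conn, 2, 4))  # prints: True  (a complete error block)
print(check_block(conn, 2, 3))  # prints: False (row 4 is also an error, so the block is longer)
print(check_block(conn, 4, 7))  # prints: False (rows 5 and 6 are good records)
```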