Collapsing data ranges and overlapping data in SQL Server 2014

Time: 2015-06-18 21:13:11

Tags: tsql sql-server-2014

I have a table with ranges like the following:

ID  ActionCode  Group1  Type    Low         High
33  A           840     MM      000295800   000295899
34  A           840     MM      000295900   000295999

I need to collapse rows of contiguous data into a single row. For example, the two rows above should become:

ActionCode  Group1  Type    Low         High
A           840     MM      000295800   000295999   

for each combination of ActionCode, Group1, Type ...

The data may contain overlapping ranges, leading zeros, and so on.

Sample table:

IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable

CREATE TABLE #TestTable(
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ActionCode] [char](1) NOT NULL,
    [Group1] [varchar](50) NOT NULL,
    [Type] [varchar](2) NULL,
    [Low] [varchar](50) NOT NULL,
    [High] [varchar](50) NOT NULL,
    CONSTRAINT [PK_#TestTable] PRIMARY KEY CLUSTERED ([ID] ASC)  
) 

GO

INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299870','401299879')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','AA','401644000','401646999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401378000','401378999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401644000','401646999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299970','401299979')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','400424000','400424999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299990','401299996')
-- Ds
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','401198000','401198999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','401649000','401649999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401299997','401299997')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','401376000','401390999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401655000','401668999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','400411000','400411999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','400414000','400414999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401646000','401646999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('D','840','JJ','400413000','400413999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','JJ','401654000','401654999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892000','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892100','522892199')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892400','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892400','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892300','522892399')
-- Different Types overlap range
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892200','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892000','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892200','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892300','522892399')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892400','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','KK','522892100','522892199')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892200','522892999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','GG','522892300','522892399')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892100','522892199')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','356','AA','522892000','522892999')
-- Leading Zeros
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','MM','000295800','000295899')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','MM','000295900','000295999')
-- Overlap
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623295800','623295999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623295900','623295999')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623295900','623296099')
INSERT INTO #TestTable (ActionCode, Group1, Type, Low, High) VALUES ('A','840','NN','623296100','623296299')
GO


SELECT * FROM #TestTable ORDER BY Low

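I can do this on small tables with a recursive CTE, but this table has just under a million rows, and once I get past a certain size the query takes far too long to run. There is an index on the "grouping" columns.

There has to be a way to do this quickly; I've just hit a wall.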


2 answers:

Answer 0: (score: 0)

I think you're looking for something like this:

WITH CTE
AS
(
    SELECT  ActionCode,
            Group1,
            [Type],
            Low,
            High,
            next_low =  LEAD(low,1) OVER (PARTITION BY ActionCode,Group1,[Type] ORDER BY ID),
            next_high = LEAD(high,1) OVER (PARTITION BY ActionCode,Group1,[Type] ORDER BY ID)
    FROM #testTable
)

SELECT  ActionCode,
        Group1,
        [Type],
        Low,
        High
FROM CTE
WHERE       low != next_low 
        AND high!= next_high

Answer 1: (score: 0)

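I'm going to assume that your Low/High values are integers, so this may need some tweaking depending on what you consider high and low.

I'm also going to start by pretending I didn't see the "contiguous" part, in which case a simple GROUP BY handles it:

SELECT
    ActionCode
    ,Group1
    ,Type
    ,min(convert(int,Low)) AS Low
    ,max(convert(int,High)) AS High
FROM #TestTable
GROUP BY
    ActionCode
    ,Group1
    ,Type

Assuming you do identify each contiguous group based on ID, this becomes a typical "gaps and islands" problem, which can be solved by comparing a row number against ID.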
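
For the contiguous and overlapping case, here is one possible sketch. It is not the row-number-against-ID query this answer refers to; it swaps in a different, commonly used gaps-and-islands variant based on a running maximum, because that variant also copes with overlapping ranges. It assumes that Low and High contain only digits that fit in a bigint, that ranges are inclusive (so a range ending at N and one starting at N + 1 count as contiguous), and that numeric output is acceptable. The CTE and column names (Ranges, Flagged, Islands, LowN, HighN, PrevMaxHigh, IslandId) are made up for illustration.

```sql
-- Sketch only, under the assumptions stated above.
WITH Ranges AS (
    SELECT  ID, ActionCode, Group1, [Type],
            CONVERT(bigint, Low)  AS LowN,   -- conversion drops leading zeros
            CONVERT(bigint, High) AS HighN
    FROM #TestTable
),
Flagged AS (
    SELECT  r.*,
            -- Highest High among all earlier rows of the same key;
            -- NULL for the first row of each key (empty frame).
            MAX(r.HighN) OVER (PARTITION BY r.ActionCode, r.Group1, r.[Type]
                               ORDER BY r.LowN, r.HighN, r.ID
                               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PrevMaxHigh
    FROM Ranges AS r
),
Islands AS (
    SELECT  f.*,
            -- A row starts a new island when it neither overlaps nor touches
            -- anything before it; the running count of starts numbers the islands.
            SUM(CASE WHEN f.PrevMaxHigh IS NULL OR f.LowN > f.PrevMaxHigh + 1
                     THEN 1 ELSE 0 END)
                OVER (PARTITION BY f.ActionCode, f.Group1, f.[Type]
                      ORDER BY f.LowN, f.HighN, f.ID
                      ROWS UNBOUNDED PRECEDING) AS IslandId
    FROM Flagged AS f
)
SELECT  ActionCode,
        Group1,
        [Type],
        MIN(LowN)  AS Low,
        MAX(HighN) AS High
FROM Islands
GROUP BY ActionCode, Group1, [Type], IslandId
ORDER BY ActionCode, Group1, [Type], MIN(LowN);
```

On the sample data this should collapse the two A/840/MM rows into a single range 295800 to 295999 and the four A/840/NN rows into 623295800 to 623296299. ID is included in the ORDER BY only as a tiebreaker so that both window functions see duplicate ranges in the same order. If the zero-padded text form is needed, the numeric results can be padded back afterwards, for example with RIGHT(REPLICATE('0', 9) + CONVERT(varchar(20), value), 9) for the 9-digit values in the sample.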