Merging sets of interval data - SQL Server

Time: 2016-07-04 07:03:43

Tags: sql sql-server sql-server-2008

I have two sets of interval data, i.e.:

Start End Type1 Type2
0     2   L     NULL
2     5   L     NULL
5     7   L     NULL
7     10  L     NULL
2     3   NULL  S
3     5   NULL  S
5     8   NULL  S
11    12  NULL  S

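What I would like to do is merge these sets into one. This looks like it could be solved with a gaps-and-islands approach, but because the intervals are not contiguous I am not sure how to apply it... My expected output would be: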

Start End Type1 Type2
0     2   L     NULL
2     3   L     S
3     5   L     S
5     7   L     S
7     8   L     S
8     10  L     NULL
11    12  NULL  S

Has anyone done this before? Thanks!

Create script below:

CREATE TABLE Table1
    ([Start] int, [End] int, [Type1] varchar(4), [Type2] varchar(4))
;

INSERT INTO Table1
    ([Start], [End], [Type1], [Type2])
VALUES
    (0, 2, 'L', NULL),
    (2, 3, NULL, 'S'),
    (2, 5, 'L', NULL),
    (3, 5, NULL, 'S'),
    (5, 7, 'L', NULL),
    (5, 8, NULL, 'S'),
    (7, 10, 'L', NULL),
    (11, 12, NULL, 'S')
;

3 Answers:

Answer 0: (score: 1)

I assume that Start is inclusive, End is exclusive, and that the given intervals of the same type do not overlap.
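
The non-overlap assumption can be checked before running the query. This is a minimal sketch, assuming the Table1 from the question's create script and treating every interval as half-open [Start, End); the aliases a and b and the OtherStart/OtherEnd column names are illustrative only.

```sql
-- Lists pairs of same-type intervals that overlap.
-- Two half-open intervals overlap when each one starts before the other ends.
SELECT
    a.[Start], a.[End],
    b.[Start] AS OtherStart, b.[End] AS OtherEnd
FROM Table1 AS a
JOIN Table1 AS b
    -- with the default ANSI_NULLS setting NULL = NULL is not true,
    -- so an 'L' row is never paired with an 'S' row here
    ON (a.Type1 = b.Type1 OR a.Type2 = b.Type2)
   AND a.[Start] < b.[End]
   AND b.[Start] < a.[End]
   -- report each pair once and never pair a row with itself
   AND (a.[Start] < b.[Start]
        OR (a.[Start] = b.[Start] AND a.[End] < b.[End]));
```

An empty result means the assumption holds; for the sample data it should return no rows. Exact duplicate rows are not detected by this check, because the table has no key to tell them apart.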

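CTE_Numbers is a numbers table. Here it is generated on the fly; I keep it as a permanent table in my database.

CTE_T1 and CTE_T2 use the numbers table to expand each interval into the corresponding number of rows. For example, the interval [2,5) generates rows with these Values: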

2
3
4

This is done twice: once for Type1 and once for Type2.
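
The expansion step can be tried in isolation. The sketch below is only an illustration: the variables @Start and @End and the ten-row Numbers CTE are hypothetical stand-ins for one table row and for the numbers table.

```sql
DECLARE @Start int = 2, @End int = 5;   -- the example interval [2,5)

WITH Numbers(Number) AS
(
    -- hypothetical stand-in for the numbers table: the integers 1..10
    SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
    SELECT 9 UNION ALL SELECT 10
)
SELECT @Start + N.Number - 1 AS Value    -- same formula as in CTE_T1 / CTE_T2
FROM Numbers AS N
WHERE N.Number <= @End - @Start          -- one row per unit of interval length
ORDER BY Value;
-- Returns 2, 3, 4
```

The numbers table must hold at least as many rows as the longest interval, otherwise long intervals are silently truncated.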

The results for Type1 and Type2 are then FULL JOINed on Value.
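
As a small illustration of that join, here are two hand-written value lists (T1 and T2 are hypothetical stand-ins for the expanded CTE_T1 and CTE_T2 rows):

```sql
WITH T1(Value, Type1) AS
(
    SELECT 0, 'L' UNION ALL SELECT 1, 'L' UNION ALL SELECT 2, 'L'
),
T2(Value, Type2) AS
(
    SELECT 2, 'S' UNION ALL SELECT 3, 'S'
)
SELECT
    ISNULL(T1.Value, T2.Value) AS Value   -- take whichever side is present
    ,T1.Type1
    ,T2.Type2
FROM T1
FULL JOIN T2 ON T2.Value = T1.Value
ORDER BY ISNULL(T1.Value, T2.Value);
-- Value  Type1  Type2
-- 0      L      NULL
-- 1      L      NULL
-- 2      L      S
-- 3      NULL   S
```

A FULL JOIN keeps values that exist on only one side, which is why stretches covered by a single type survive with NULL in the other column.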

Finally, a gaps-and-islands pass groups and collapses the values back into intervals.
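
The idea behind that pass can be sketched on a plain list of values. This is the simplest form of the technique, not the exact expression used in the query below, which additionally separates rows by Type1 and Type2; the CTE names V and G are illustrative.

```sql
WITH V(Value) AS
(
    SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
    SELECT 5 UNION ALL SELECT 6
),
G AS
(
    -- Within a run of consecutive values, Value and ROW_NUMBER() grow together,
    -- so their difference stays constant and identifies the island.
    SELECT Value, Value - ROW_NUMBER() OVER (ORDER BY Value) AS grp
    FROM V
)
SELECT
    MIN(Value) AS [Start]
    ,MAX(Value) + 1 AS [End]   -- End is exclusive
FROM G
GROUP BY grp
ORDER BY [Start];
-- Start  End
-- 0      3
-- 5      7
```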

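Run the query step by step, CTE by CTE, and examine the intermediate results to understand how it works.

Sample data

I added a few rows to illustrate the case where there is a gap between values.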

DECLARE @Table1 TABLE
    ([Start] int, [End] int, [Type1] varchar(4), [Type2] varchar(4))
;

INSERT INTO @Table1 ([Start], [End], [Type1], [Type2]) VALUES
( 0,  2, 'L', NULL),
( 2,  3, NULL, 'S'),
( 2,  5, 'L', NULL),
( 3,  5, NULL, 'S'),
( 5,  7, 'L', NULL),
( 5,  8, NULL, 'S'),
( 7, 10, 'L', NULL),
(11, 12, NULL, 'S'),

(15, 20, 'L', NULL),
(15, 20, NULL, 'S');

Query

WITH 
e1(n) AS
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) -- 10
,e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b) -- 10*10
,e3(n) AS (SELECT 1 FROM e1 CROSS JOIN e2) -- 10*100
,CTE_Numbers
AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY n) AS Number
    FROM e3
)
,CTE_T1
AS
(
    SELECT
        T1.[Start] + CA.Number - 1 AS Value
        ,T1.Type1
    FROM
        @Table1 AS T1
        CROSS APPLY
        (
            SELECT TOP(T1.[End] - T1.[Start]) CTE_Numbers.Number
            FROM CTE_Numbers
            ORDER BY CTE_Numbers.Number
        ) AS CA
    WHERE
        T1.Type1 IS NOT NULL
)
,CTE_T2
AS
(
    SELECT
        T2.[Start] + CA.Number - 1 AS Value
        ,T2.Type2
    FROM
        @Table1 AS T2
        CROSS APPLY
        (
            SELECT TOP(T2.[End] - T2.[Start]) CTE_Numbers.Number
            FROM CTE_Numbers
            ORDER BY CTE_Numbers.Number
        ) AS CA
    WHERE
        T2.Type2 IS NOT NULL
)
,CTE_Values
AS
(
    SELECT
        ISNULL(CTE_T1.Value, CTE_T2.Value) AS Value
        ,CTE_T1.Type1
        ,CTE_T2.Type2
        ,ROW_NUMBER() OVER (ORDER BY ISNULL(CTE_T1.Value, CTE_T2.Value)) AS rn
    FROM
        CTE_T1
        FULL JOIN CTE_T2 ON CTE_T2.Value = CTE_T1.Value
)
,CTE_Groups
AS
(
    SELECT
        Value
        ,Type1
        ,Type2
        ,rn
        ,ROW_NUMBER() OVER 
            (PARTITION BY rn - Value, Type1, Type2 ORDER BY Value) AS rn2
    FROM CTE_Values
)
SELECT
    MIN(Value) AS [Start]
    ,MAX(Value) + 1 AS [End]
    ,Type1
    ,Type2
FROM CTE_Groups
GROUP BY rn-rn2, Type1, Type2
ORDER BY [Start];

Result

+-------+-----+-------+-------+
| Start | End | Type1 | Type2 |
+-------+-----+-------+-------+
|     0 |   2 | L     | NULL  |
|     2 |   8 | L     | S     |
|     8 |  10 | L     | NULL  |
|    11 |  12 | NULL  | S     |
|    15 |  20 | L     | S     |
+-------+-----+-------+-------+

Answer 1: (score: 0)

A step-by-step approach is:

-- Finding all break points
;WITH breaks AS (
    SELECT Start
    FROM yourTable
    UNION 
    SELECT [End]
    FROM yourTable
) -- Finding Possible Ends
, ends AS (
    SELECT Start
        , (SELECT Min([End]) FROM yourTable WHERE yourTable.Start = breaks.Start) End1
        , (SELECT Max([End]) FROM yourTable WHERE yourTable.Start < breaks.Start) End2
    FROM breaks
) -- Finding periods
, periods AS (
    SELECT Start, 
        CASE 
            WHEN End1 > End2 And End2 > Start THEN End2
            WHEN End1 IS NULL THEN End2
            ELSE End1
        END [End]
    FROM Ends
    WHERE NOT(End1 IS NULL AND Start = End2)
) -- Generating results
SELECT p.Start, p.[End], Max(Type1) Type1, Max(Type2) Type2
FROM periods p, yourTable t
WHERE p.start >= t.Start AND p.[End] <= t.[End]
GROUP BY p.Start, p.[End];

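The query above may not handle every case correctly; you can improve it as needed ;).

Answer 2: (score: 0)

First get all the Start and End numbers with a UNION, then join those numbers to the 'L' and 'S' records.

Tested using a table variable.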

DECLARE @Table1 TABLE (Start int, [End] int, Type1 varchar(4), Type2 varchar(4));

INSERT INTO @Table1 (Start, [End], Type1, Type2) 
VALUES (0, 2, 'L', NULL),(2, 3, NULL, 'S'),(2, 5, 'L', NULL),(3, 5, NULL, 'S'),
(5, 7, 'L', NULL),(5, 8, NULL, 'S'),(7, 10, 'L', NULL),(11, 12, NULL, 'S');

select 
n.Num as Start,
(case when s.[End] is null or l.[End] <= s.[End] then l.[End] else s.[End] end) as [End],
l.Type1, 
s.Type2
from
(select Start as Num from @Table1 union select [End] from @Table1) n
left join @Table1 l on (n.Num >= l.Start and n.Num < l.[End] and l.Type1 = 'L')
left join @Table1 s on (n.Num >= s.Start and n.Num < s.[End] and s.Type2 = 'S')
where (l.Start is not null or s.Start is not null)
order by Start, [End];

Output:

Start End Type1 Type2
0     2   L     NULL
2     3   L     S
3     5   L     S
5     7   L     S
7     8   L     S
8     10  L     NULL
11    12  NULL  S
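
The query above takes End from the matching 'L' or 'S' row. That fits the sample data, but with other layouts it may report an End that runs past the next breakpoint, for example with a hypothetical 'L' row 0-10 and an 'S' row 2-3. A possible variation, not taken from the answer, pairs each breakpoint with the next one instead. It is a sketch that assumes intervals of the same type do not overlap; the CTE names Points, Numbered and Segments are illustrative, and ROW_NUMBER is used because LEAD is not available in SQL Server 2008.

```sql
DECLARE @Table1 TABLE ([Start] int, [End] int, Type1 varchar(4), Type2 varchar(4));

INSERT INTO @Table1 ([Start], [End], Type1, Type2)
VALUES (0, 2, 'L', NULL),(2, 3, NULL, 'S'),(2, 5, 'L', NULL),(3, 5, NULL, 'S'),
(5, 7, 'L', NULL),(5, 8, NULL, 'S'),(7, 10, 'L', NULL),(11, 12, NULL, 'S');

WITH Points AS
(
    -- every distinct Start or End is a breakpoint
    SELECT [Start] AS Num FROM @Table1
    UNION
    SELECT [End] FROM @Table1
),
Numbered AS
(
    SELECT Num, ROW_NUMBER() OVER (ORDER BY Num) AS rn
    FROM Points
),
Segments AS
(
    -- each breakpoint paired with the next one forms an elementary segment
    SELECT a.Num AS [Start], b.Num AS [End]
    FROM Numbered AS a
    JOIN Numbered AS b ON b.rn = a.rn + 1
)
SELECT seg.[Start], seg.[End], l.Type1, s.Type2
FROM Segments AS seg
LEFT JOIN @Table1 AS l
    ON l.Type1 IS NOT NULL AND l.[Start] <= seg.[Start] AND seg.[End] <= l.[End]
LEFT JOIN @Table1 AS s
    ON s.Type2 IS NOT NULL AND s.[Start] <= seg.[Start] AND seg.[End] <= s.[End]
-- drop segments that no interval covers, such as 10-11 in the sample
WHERE l.[Start] IS NOT NULL OR s.[Start] IS NOT NULL
ORDER BY seg.[Start];
```

On the sample data this should return the same seven rows as the output listed above.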