I have a simple data set in SQL Server that looks like this:
| ROW | Start | End |
|-----|-------|-----|
| 0   | 1     | 2   |
| 1   | 3     | 5   |
| 2   | 4     | 6   |
| 3   | 8     | 9   |
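Plotted on a number line, rows 1 and 2 overlap (3 to 5 and 4 to 6), while rows 0 and 3 stand on their own.
What I want is to collapse the overlapping rows so that my query returns: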
| ROW | Start | End |
|-----|-------|-----|
| 0   | 1     | 2   |
| 1   | 3     | 6   |
| 2   | 8     | 9   |
Can this be written in SQL Server without resorting to a complicated procedure or statement?
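Answer 0 (score: 2)
Here is a SQL Fiddle with another option.
First, all limits (start and end values) are sorted in order. Then the "duplicate" limits inside overlapping ranges are removed: a "start" that directly follows another "start", or an "end" that is directly followed by another "end". At that point the ranges are collapsed, and the Start and End values are written out on the same row again.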
with temp_positions as --Select all limits as a single column along with the start / end flag (s / e)
(
    select startx limit, 's' as pos from t
    union
    select endx, 'e' as pos from t
)
, ordered_positions as --Rank all limits
(
    select limit, pos, RANK() OVER (ORDER BY limit) AS Rank
    from temp_positions
)
, collapsed_positions as --Collapse ranges (select the first limit, if s is preceded or followed by e, and the last limit) and rank limits again
(
    select op1.*, RANK() OVER (ORDER BY op1.Rank) AS New_Rank
    from ordered_positions op1
    inner join ordered_positions op2
        on (op1.Rank = op2.Rank and op1.Rank = 1 and op1.pos = 's')
        or (op2.Rank = op1.Rank-1 and op2.pos = 'e' and op1.pos = 's')
        or (op2.Rank = op1.Rank+1 and op2.pos = 's' and op1.pos = 'e')
        or (op2.Rank = op1.Rank and op1.pos = 'e' and op1.Rank = (select max(Rank) from ordered_positions))
)
, final_positions as --Now each s is followed by e. So, select s limits and corresponding e limits. Rank ranges
(
    select cp1.limit as cp1_limit, cp2.limit as cp2_limit, RANK() OVER (ORDER BY cp1.limit) AS Final_Rank
    from collapsed_positions cp1
    inner join collapsed_positions cp2
        on cp1.pos = 's' and cp2.New_Rank = cp1.New_Rank+1
)
--Finally, subtract 1 from Rank to start Range #'s from 0
select fp.Final_Rank-1 seq_no, fp.cp1_limit as starty, fp.cp2_limit as endy
from final_positions fp;
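You can inspect the result of each CTE to follow the progression. To do that, drop the CTEs that come after the one you are interested in and select from it directly, for example: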
with temp_positions as --Select all limits as a single column along with the start / end flag (s / e)
(
    select startx limit, 's' as pos from t
    union
    select endx, 'e' as pos from t
)
, ordered_positions as --Rank all limits
(
    select limit, pos, RANK() OVER (ORDER BY limit) AS Rank
    from temp_positions
)
select *
from ordered_positions;
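To make the "sort the limits, then drop the inner ones" idea easier to follow outside SQL, here is a minimal Python sketch. It is not the query above: instead of comparing each limit with its neighbour, it keeps a depth counter while sweeping the sorted limits, which as far as I can tell also copes with a range nested entirely inside a longer one. The function name `collapse_by_sweep` is made up for illustration, and treating touching ranges (one ends exactly where the next starts) as overlapping is an assumption.

```python
def collapse_by_sweep(ranges):
    """Merge overlapping (start, end) pairs by sweeping over sorted limits."""
    # Turn every range into two events: a start (+1) and an end (-1).
    # The middle element orders starts before ends at the same position,
    # so touching ranges are merged (an assumption, not taken from the thread).
    events = []
    for start, end in ranges:
        events.append((start, 0, +1))
        events.append((end, 1, -1))
    events.sort()

    merged = []
    depth = 0             # number of ranges currently open
    current_start = None
    for position, _, delta in events:
        if delta == +1 and depth == 0:
            current_start = position  # outermost start opens a new range
        depth += delta
        if depth == 0:
            merged.append((current_start, position))  # outermost end closes it
    return merged


rows = [(1, 2), (3, 5), (4, 6), (8, 9)]
for seq_no, (start, end) in enumerate(collapse_by_sweep(rows)):
    print(seq_no, start, end)
```

With the four rows from the question this prints `0 1 2`, `1 3 6` and `2 8 9`, one per line, which matches the requested result.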
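Answer 1 (score: 1)
The key is to assign a "grouping" value to overlapping segments. You can then aggregate by that column to get the information you want. A segment starts a new group when it does not overlap any segment that begins before it.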
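-- [table] stands in for your table name; [start] and [end] are bracketed because END is a reserved word
with starts as (
    select t.*,
           -- 0 when an earlier-starting segment reaches this one, 1 when this segment opens a new group
           (case when exists (select 1 from [table] t2
                              where t2.[start] < t.[start] and t2.[end] >= t.[start])
                 then 0
                 else 1
            end) as isstart
    from [table] t
),
groups as (
    select s.*,
           -- running count of group openers up to this segment = group number
           (select sum(isstart)
            from starts s2
            where s2.[start] <= s.[start]
           ) as grp
    from starts s
)
select row_number() over (order by min([start])) as [row],
       min([start]) as [start], max([end]) as [end]
from groups
group by grp;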
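The same flag-then-running-sum idea can be sketched in Python, which may help when checking the query against your own data. This is only an illustration under a few assumptions: the name `collapse_by_groups` is invented, the input is a list of (start, end) tuples, and "overlap" includes the case where an earlier segment ends exactly at the next start, as the `>=` comparison in the query suggests.

```python
from itertools import accumulate


def collapse_by_groups(ranges):
    """Merge overlapping (start, end) pairs via an isstart flag and a running sum."""
    ordered = sorted(ranges)

    # Step 1: flag each segment with 1 if nothing that starts earlier reaches it.
    flags = []
    furthest_end = None  # largest end seen among earlier segments
    for start, end in ordered:
        opens_group = furthest_end is None or furthest_end < start
        flags.append(1 if opens_group else 0)
        furthest_end = end if furthest_end is None else max(furthest_end, end)

    # Step 2: the running sum of the flags is the group number.
    group_ids = list(accumulate(flags))

    # Step 3: aggregate min(start) and max(end) per group.
    groups = {}
    for (start, end), group_id in zip(ordered, group_ids):
        low, high = groups.get(group_id, (start, end))
        groups[group_id] = (min(low, start), max(high, end))
    return [groups[group_id] for group_id in sorted(groups)]


rows = [(1, 2), (3, 5), (4, 6), (8, 9)]
for row_no, (start, end) in enumerate(collapse_by_groups(rows), start=1):
    print(row_no, start, end)
```

For the question's data the flags come out as 1, 1, 0, 1, so the running sum gives groups 1, 2, 2, 3 and rows (3, 5) and (4, 6) collapse into (3, 6). Row numbers start at 1 here, like `row_number()` in the query.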
Answer 2 (score: 0)
I would create a table-valued function that returns the segments. You would then call it like this:
select *
from dbo.getCollapsedSegments(2, 9)
Here is an example (I replaced END with FIN, because END is a reserved word):
CREATE FUNCTION dbo.getCollapsedSegments(@Start int, @Fin int)
RETURNS @CollapsedSegments TABLE
(
    -- Columns returned by the function
    start int,
    fin int
)
AS
BEGIN
    -- Move @Start to the first segment that begins at or after the requested start
    SELECT @Start = (SELECT MIN(Start) FROM data WHERE @Start <= Start)
    WHILE (@Start IS NOT NULL AND @Start < @Fin)
    BEGIN
        -- Collapse the segment(s) covering @Start with any overlapping segment that starts before @Fin
        INSERT INTO @CollapsedSegments
        SELECT MIN(s1.Start), MAX(ISNULL(s2.Fin, s1.Fin))
        FROM data s1
        LEFT JOIN data s2
            ON s1.Start < s2.Fin
            AND s2.Start <= s1.Fin
            AND @Fin > s2.start
        WHERE s1.Start <= @Start
            AND @Start < s1.Fin
        -- Continue with the first segment that starts after everything collected so far
        SELECT @Start = (SELECT MAX(Fin) FROM @CollapsedSegments)
        SELECT @Start = MIN(Start)
        FROM data
        WHERE Start > @Start
    END
    RETURN;
END
My test data:
create table data
(start int,
fin int)
insert into data
select 1, 2
union all
select 3, 5
union all
select 4, 6
union all
select 8, 9
union all
select 10, 11