I have the following table:
Row Column Type
1 1 =
1 2 =
1 3 O
1 4 =
1 5 =
1 6 O
2 1 =
I need to get something like this:
Row Start_Column End_Column Type
1 1 2 =
1 3 3 O
1 4 5 =
1 6 6 O
2 1 1 =
I tried grouping the rows and working with ROW_NUMBER and RANK, but had no luck.
Does anyone know how to do this?
Answer 0 (score: 2)
You can use the LAG() and SUM() window functions to create the groups you want and then aggregate:
SELECT [Row],
       MIN([Column]) AS Start_Column,
       MAX([Column]) AS End_Column,
       MAX([Type]) AS [Type]
FROM (
    -- Running total of the flags numbers the runs within each [Row]
    SELECT *, SUM(flag) OVER (PARTITION BY [Row] ORDER BY [Column]) AS grp
    FROM (
        -- flag = 1 where [Type] differs from the previous row (or there is no previous row)
        SELECT *,
               CASE WHEN [Type] = LAG([Type]) OVER (PARTITION BY [Row] ORDER BY [Column]) THEN 0 ELSE 1 END AS flag
        FROM tablename
    ) t
) t
GROUP BY [Row], grp
ORDER BY [Row], grp
See the demo.
Result:
Row | Start_Column | End_Column | Type
--- | ------------ | ---------- | ----
1 | 1 | 2 | =
1 | 3 | 3 | O
1 | 4 | 5 | =
1 | 6 | 6 | O
2 | 1 | 1 | =
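To make the flag-and-running-sum idea easier to follow, here is a minimal Python sketch that mirrors the query step by step on the sample data from the question. It is an illustration only, not part of the answer's SQL: the function name islands_by_flag and the tuple layout (row, column, type) are assumptions made for this example.

```python
from itertools import groupby

# Sample data from the question: (row, column, type)
rows = [
    (1, 1, "="), (1, 2, "="), (1, 3, "O"),
    (1, 4, "="), (1, 5, "="), (1, 6, "O"),
    (2, 1, "="),
]

def islands_by_flag(rows):
    """Mirror LAG() + running SUM(): flag the start of each run, then cumulate."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    tagged = []
    prev_row, prev_type, grp = None, None, 0
    for row, col, typ in ordered:
        if row != prev_row:
            grp = 0            # PARTITION BY [Row]: the running sum restarts
            prev_type = None   # LAG() has no previous value on the first row
        flag = 0 if typ == prev_type else 1
        grp += flag            # SUM(flag) OVER (... ORDER BY [Column])
        tagged.append((row, grp, col, typ))
        prev_row, prev_type = row, typ

    result = []
    # GROUP BY [Row], grp (tagged is already ordered by row, then column)
    for (row, _grp), members in groupby(tagged, key=lambda t: (t[0], t[1])):
        members = list(members)
        cols = [m[2] for m in members]
        result.append((row, min(cols), max(cols), members[0][3]))
    return result

for island in islands_by_flag(rows):
    print(island)
```

Each printed tuple is (Row, Start_Column, End_Column, Type), in the same order as the result table above.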
Answer 1 (score: 2)
This is a type of gaps-and-islands problem. In this case, the simplest method is probably the difference of row numbers:
select [row], [type], min([column]) as start_column, max([column]) as end_column
from (select t.*,
             row_number() over (partition by [row], [type] order by [column]) as seqnum_2,
             row_number() over (partition by [row] order by [column]) as seqnum
      from t
     ) t
group by [row], [type], (seqnum - seqnum_2)
order by [row], min([column]);
If column is sequential with no gaps, you can simplify further:
select [row], [type], min([column]) as start_column, max([column]) as end_column
from (select t.*,
             row_number() over (partition by [row], [type] order by [column]) as seqnum_2
      from t
     ) t
group by [row], [type], ([column] - seqnum_2)
order by [row], min([column]);
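Why does this work? Within a run of adjacent rows that share the same type, both column and the per-type row number increase by one at each step, so their difference stays constant. When a different type interrupts the run, column moves ahead while the per-type row number does not, so the next run of that type gets a different difference.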
Here is a db<>fiddle.
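The row-number difference can also be traced by hand. The Python sketch below is an illustration under assumptions, not the answer's own code: the function name islands_by_rownum_diff and the tuple layout (row, column, type) are made up for this example. It computes the two row numbers for the question's sample data and groups on their difference.

```python
from collections import defaultdict

# Sample data from the question: (row, column, type)
rows = [
    (1, 1, "="), (1, 2, "="), (1, 3, "O"),
    (1, 4, "="), (1, 5, "="), (1, 6, "O"),
    (2, 1, "="),
]

def islands_by_rownum_diff(rows):
    """Group rows on (row, type, seqnum - seqnum_2), as in the first query."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    seqnum = defaultdict(int)    # row_number() over (partition by row)
    seqnum_2 = defaultdict(int)  # row_number() over (partition by row, type)
    groups = defaultdict(list)
    for row, col, typ in ordered:
        seqnum[row] += 1
        seqnum_2[(row, typ)] += 1
        diff = seqnum[row] - seqnum_2[(row, typ)]
        # The difference stays constant inside a run of one type
        groups[(row, typ, diff)].append(col)

    result = [
        (row, min(cols), max(cols), typ)
        for (row, typ, _diff), cols in groups.items()
    ]
    # order by row, min(column)
    return sorted(result, key=lambda r: (r[0], r[1]))

for island in islands_by_rownum_diff(rows):
    print(island)
```

For row 1 the differences come out as 0, 0, 2, 1, 1, 4 for columns 1 through 6, so the two "=" runs fall into separate groups (0 and 1) even though they share a type.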