I have a table in Oracle with three columns, c1, c2 and c3, like this:
c1 c2 c3
1 34 2
2 34 2
3 34 2
4 24 2
5 24 2
6 34 2
7 34 2
8 34 1
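I need to group consecutive rows (in c1 order) that share the same c2 and c3 values, and get the minimum and maximum c1 for each group.
That is, I need the following result: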
c1_min c1_max c2 c3
1 3 34 2
4 5 24 2
6 7 34 2
8 8 34 1
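Answer 0 (score: 3)
There are several ways to approach a gaps-and-islands problem. As an alternative to Sylvain's lag version (not better, just different), you can use a trick that subtracts two analytic rankings: the row's position within its c2/c3 partition minus its position in the whole table, both ordered by c1. This adds a "chain" pseudocolumn to the table values. It stays constant within each run of consecutive rows that share a c2/c3 pair and changes between separate runs of the same pair: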
select c1, c2, c3,
  dense_rank() over (partition by c2, c3 order by c1)
    - dense_rank() over (partition by null order by c1) as chain
from t42
order by c1, c2, c3;
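(I can't take any credit for this; I first saw it here.) You can then use that as an inline view to calculate the minimum and maximum c1 of each chain: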
select min(c1) as c1_min, max(c1) as c1_max, c2, c3
from (
  select c1, c2, c3,
    dense_rank() over (partition by c2, c3 order by c1)
      - dense_rank() over (partition by null order by c1) as chain
  from t42
)
group by c2, c3, chain
order by c1_min;
    C1_MIN     C1_MAX         C2         C3
---------- ---------- ---------- ----------
         1          3         34          2
         4          5         24          2
         6          7         34          2
         8          8         34          1
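To see why grouping by chain works, it can help to trace the intermediate values for the sample rows. The sketch below is self-contained: it assumes the question's data is supplied through a WITH clause named t42 (the table name used in this answer) rather than a real table, and the extra columns rk_pair and rk_all are illustrative names that are not part of the original query. The values in the comments were worked out by hand from the sample data.

```sql
-- Assumption: the sample rows are inlined here instead of read from a real table.
with t42 as (
  select 1 as c1, 34 as c2, 2 as c3 from dual union all
  select 2, 34, 2 from dual union all
  select 3, 34, 2 from dual union all
  select 4, 24, 2 from dual union all
  select 5, 24, 2 from dual union all
  select 6, 34, 2 from dual union all
  select 7, 34, 2 from dual union all
  select 8, 34, 1 from dual
)
select c1, c2, c3,
  -- position of the row among rows with the same (c2, c3) pair
  dense_rank() over (partition by c2, c3 order by c1) as rk_pair,
  -- position of the row in the whole set (same effect as "partition by null")
  dense_rank() over (order by c1) as rk_all,
  dense_rank() over (partition by c2, c3 order by c1)
    - dense_rank() over (order by c1) as chain
from t42
order by c1;

-- C1 C2 C3 RK_PAIR RK_ALL CHAIN
--  1 34  2       1      1     0
--  2 34  2       2      2     0
--  3 34  2       3      3     0
--  4 24  2       1      4    -3
--  5 24  2       2      5    -3
--  6 34  2       4      6    -2
--  7 34  2       5      7    -2
--  8 34  1       1      8    -7
```

The two (34, 2) runs get different chain values (0 and -2) because rows 4 and 5 advance the overall rank without advancing the rank inside the (34, 2) partition. In other data a chain value can coincide across different pairs, which is why the outer query groups by c2 and c3 as well as chain.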
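The SQL Fiddle also shows the intermediate stage.
You can use other analytic functions, such as row_number() instead of dense_rank(); for some data (for example, when c1 contains duplicate values) they may give slightly different results, but you get the same result with this sample.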
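A minimal sketch of the row_number() variant, under the assumption that the sample rows are supplied through a WITH clause named t42 rather than a real table. Because c1 is unique in the sample, row_number() and dense_rank() number the rows identically here; that would not necessarily hold if c1 had ties.

```sql
-- Assumption: the sample rows are inlined here instead of read from a real table.
with t42 as (
  select 1 as c1, 34 as c2, 2 as c3 from dual union all
  select 2, 34, 2 from dual union all
  select 3, 34, 2 from dual union all
  select 4, 24, 2 from dual union all
  select 5, 24, 2 from dual union all
  select 6, 34, 2 from dual union all
  select 7, 34, 2 from dual union all
  select 8, 34, 1 from dual
)
select min(c1) as c1_min, max(c1) as c1_max, c2, c3
from (
  select c1, c2, c3,
    -- same subtraction trick, with row_number() in place of dense_rank()
    row_number() over (partition by c2, c3 order by c1)
      - row_number() over (order by c1) as chain
  from t42
)
group by c2, c3, chain
order by c1_min;

-- With the sample data this returns the same four rows as the dense_rank() version:
-- (1, 3, 34, 2), (4, 5, 24, 2), (6, 7, 34, 2), (8, 8, 34, 1)
```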
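Answer 1 (score: 2)
If I understand correctly, you want to group consecutive rows together. That is far from trivial, or at least I can't find a simple way to do it right now. To make it easier to follow, I'll break the query down into several steps:
The first step is to identify the boundaries of your "groups". The LAG analytic function can help with that: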
SELECT
    CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
          AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
         THEN 0
         ELSE 1
    END CLK,
    T.* FROM T
ORDER BY "c1";
The second step is to number each group. A simple running SUM of CLK over the rows ordered by "c1" will do. This leads to:
SELECT SUM(CLK) OVER (ORDER BY "c1"
                      ROWS BETWEEN UNBOUNDED PRECEDING
                               AND CURRENT ROW) GRP,
       V.*
FROM (
    SELECT
        CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
              AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
             THEN 0
             ELSE 1
        END CLK,
        T.* FROM T
) V
ORDER BY "c1";
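To make the two steps concrete, here is a sketch that runs the same query against the question's sample rows. It assumes the data is supplied through a WITH clause named t with quoted lowercase column names, matching the identifiers used in this answer; the values in the comments were worked out by hand.

```sql
-- Assumption: the sample rows are inlined here instead of read from a real table T.
with t as (
  select 1 as "c1", 34 as "c2", 2 as "c3" from dual union all
  select 2, 34, 2 from dual union all
  select 3, 34, 2 from dual union all
  select 4, 24, 2 from dual union all
  select 5, 24, 2 from dual union all
  select 6, 34, 2 from dual union all
  select 7, 34, 2 from dual union all
  select 8, 34, 1 from dual
)
select sum(clk) over (order by "c1"
                      rows between unbounded preceding
                               and current row) grp,
       v.*
from (
  select
    -- clk is 1 on the first row of each run, 0 otherwise
    case when lag("c2", 1) over (order by "c1") = "c2"
          and lag("c3", 1) over (order by "c1") = "c3"
         then 0
         else 1
    end clk,
    t.* from t
) v
order by "c1";

-- GRP CLK c1 c2 c3
--   1   1  1 34  2
--   1   0  2 34  2
--   1   0  3 34  2
--   2   1  4 24  2
--   2   0  5 24  2
--   3   1  6 34  2
--   3   0  7 34  2
--   4   1  8 34  1
```

The first row has no predecessor, so LAG returns NULL, the comparison is not true, and the CASE falls through to ELSE 1. That starts the running total at 1.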
Finally, you can wrap that in a simple GROUP BY query to get the desired output:
SELECT MIN("c1") AS c1_min, MAX("c1") AS c1_max, "c2", "c3" FROM
(
    SELECT SUM(CLK) OVER (ORDER BY "c1"
                          ROWS BETWEEN UNBOUNDED PRECEDING
                                   AND CURRENT ROW) GRP,
           V.*
    FROM (
        SELECT
            CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
                  AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
                 THEN 0
                 ELSE 1
            END CLK,
            T.* FROM T
    ) V
)
GROUP BY GRP, "c2", "c3"
ORDER BY GRP;