Suppose I have this table:
select * from window_test;
k | v
---+---
a | 1
a | 2
b | 3
a | 4
Ultimately I want to get:
k | min_v | max_v
---+-------+-------
a | 1 | 2
b | 3 | 3
a | 4 | 4
But I would be just as happy to get this (since I can easily filter it with distinct):
k | min_v | max_v
---+-------+-------
a | 1 | 2
a | 1 | 2
b | 3 | 3
a | 4 | 4
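Is this possible with PostgreSQL 9.1+ window functions? I'm trying to work out whether I can use separate partitions for the first and last occurrence of k=a in this example (ordered by v).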
Answer 0 (score: 11)
This returns the result you want with the sample data. I'm not sure whether it works on real-world data:
select k,
       min(v) over (partition by group_nr) as min_v,
       max(v) over (partition by group_nr) as max_v
from (
  select *,
         sum(group_flag) over (order by v, k) as group_nr
  from (
    select *,
           case
             when lag(k) over (order by v) = k then null
             else 1
           end as group_flag
    from window_test
  ) t1
) t2
order by min_v;
I left out the DISTINCT.
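To make the three steps of the query above easier to follow, here is a minimal Python sketch of the same logic: flag each row whose k differs from the previous row's k (in v order), turn the flags into a running group number, then take the min and max of v per group. The function name `islands` and the in-memory row list are illustrative assumptions and not part of the original answer. The sketch returns one row per group, which is what adding DISTINCT to the query would leave behind.

```python
from itertools import accumulate

# Sample rows from window_test as (k, v) pairs.
rows = [("a", 1), ("a", 2), ("b", 3), ("a", 4)]


def islands(rows):
    """Return one (k, min_v, max_v) tuple per run of equal k, ordered by v."""
    ordered = sorted(rows, key=lambda r: r[1])
    # Step 1: flag is 1 when k differs from the previous row's k (the lag),
    # otherwise 0. The SQL uses NULL instead of 0, which sum() ignores.
    flags = [
        0 if i > 0 and ordered[i - 1][0] == k else 1
        for i, (k, _) in enumerate(ordered)
    ]
    # Step 2: the running sum of the flags gives a group number per row.
    group_nrs = list(accumulate(flags))
    # Step 3: min and max of v within each group number.
    bounds = {}
    for (k, v), nr in zip(ordered, group_nrs):
        _, lo, hi = bounds.get(nr, (k, v, v))
        bounds[nr] = (k, min(lo, v), max(hi, v))
    # One row per group, in group order.
    return [bounds[nr] for nr in sorted(bounds)]


print(islands(rows))
```

On the sample data the group numbers come out as 1, 1, 2, 3, so the second run of k=a gets its own group instead of being merged with the first.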
Answer 1 (score: 1)
EDIT: I've come up with the following query, which uses no window functions at all:
WITH RECURSIVE tree AS (
    SELECT k, v, ''::text AS next_k, 0 AS next_v, 0 AS level
    FROM window_test
  UNION ALL
    SELECT c.k, c.v, t.k, t.v + level, t.level + 1
    FROM tree t
    JOIN window_test c ON c.k = t.k AND c.v + 1 = t.v),
partitions AS (
    SELECT t.k, t.v, t.next_k,
           coalesce(nullif(t.next_v, 0), t.v) AS next_v, t.level
    FROM tree t
    WHERE NOT EXISTS (SELECT 1 FROM tree WHERE next_k = t.k AND next_v = t.v))
SELECT min(k) AS k, v AS min_v, max(next_v) AS max_v
FROM partitions p
GROUP BY v
ORDER BY 2;
I have now provided two working queries; I hope one of them helps you.
For this variant, another way to achieve the result is to use a supporting sequence.
Create the supporting sequence:
CREATE SEQUENCE wt_rank START WITH 1;
Query:
WITH source AS (
    SELECT k, v,
           coalesce(lag(k) OVER (ORDER BY v), k) AS prev_k
    FROM window_test
    CROSS JOIN (SELECT setval('wt_rank', 1)) AS ri),
ranking AS (
    SELECT k, v, prev_k,
           CASE WHEN k = prev_k THEN currval('wt_rank')
                ELSE nextval('wt_rank') END AS rank
    FROM source)
SELECT r.k, min(s.v) AS min_v, max(s.v) AS max_v
FROM ranking r
JOIN source s ON r.v = s.v
GROUP BY r.rank, r.k
ORDER BY 2;
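Answer 2 (score: 0)
Wouldn't this do the job for you, with no need for windows, partitions, or coalescing? It simply uses a traditional SQL trick: find the nearest tuple through a self join, then take the minimum of the difference: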
SELECT k, min(v), max(v) FROM (
    SELECT k, v, v + min(d) lim FROM (
        SELECT x.*, y.k n, y.v - x.v d
        FROM window_test x
        LEFT JOIN window_test y ON x.k <> y.k AND y.v - x.v > 0) z
    GROUP BY k, v, n) w
GROUP BY k, lim
ORDER BY 2;
I think this may be a more conventional SQL solution, but I'm not sure how efficient it is.
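The idea behind the self join can be sketched in a few lines of Python: for every row, find the nearest larger v that belongs to a different k (the `lim` column in the query), then group rows by (k, lim). The function name `ranges_by_limit` is a hypothetical helper, and the sketch takes the minimum over all other keys at once, whereas the query also groups by `n`. The two agree on the sample data, where each row only ever sees one other key; with three or more distinct keys they may differ.

```python
# Sample rows from window_test as (k, v) pairs.
rows = [("a", 1), ("a", 2), ("b", 3), ("a", 4)]


def ranges_by_limit(rows):
    """Group rows by (k, nearest later v with a different k)."""
    groups = {}
    for k, v in rows:
        # Self join: all later rows whose key differs from this row's key.
        later = [v2 for k2, v2 in rows if k2 != k and v2 > v]
        # v + min(d) in the query; None plays the role of SQL NULL.
        lim = min(later) if later else None
        lo, hi = groups.get((k, lim), (v, v))
        groups[(k, lim)] = (min(lo, v), max(hi, v))
    # ORDER BY 2: sort the result by min_v.
    result = [(k, lo, hi) for (k, _), (lo, hi) in groups.items()]
    return sorted(result, key=lambda r: r[1])


print(ranges_by_limit(rows))
```

Every row is compared with every other row, so this sketch does quadratic work, which is one way to see why efficiency could become a concern on larger tables.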