我的数据以环形结构(或circular buffer)排列,也就是说它可以表示为循环的序列:...- 1-2-3-4-5-1-2 -3 -....见图片,了解5部分戒指:
我想创建一个窗口查询,可以将滞后和铅项目组合成一个三点数组,但我无法理解。例如,在5部分环的第1部分,滞后/超前序列是5-1-2,或者部分4是3-4-5。
以下是两个具有不同数量部件的环的示例表(每个环总是多于三个):
create table rp (ring int, part int);
insert into rp(ring, part) values(1, generate_series(1, 5));
insert into rp(ring, part) values(2, generate_series(1, 7));
这是一个近乎成功的查询:
SELECT ring, part, array[
lag(part, 1, NULL) over (partition by ring),
part,
lead(part, 1, 1) over (partition by ring)
] AS neighbours
FROM rp;
ring | part | neighbours
------+------+------------
1 | 1 | {NULL,1,2}
1 | 2 | {1,2,3}
1 | 3 | {2,3,4}
1 | 4 | {3,4,5}
1 | 5 | {4,5,1}
2 | 1 | {NULL,1,2}
2 | 2 | {1,2,3}
2 | 3 | {2,3,4}
2 | 4 | {3,4,5}
2 | 5 | {4,5,6}
2 | 6 | {5,6,7}
2 | 7 | {6,7,1}
(12 rows)
我唯一需要做的就是用每个环的终点替换NULL
,这是最后一个值。现在,与lag
and lead
window functions, there is a last_value
function一起,这将是理想的。但是,这些不能嵌套:
SELECT ring, part, array[
lag(part, 1, last_value(part) over (partition by ring)) over (partition by ring),
part,
lead(part, 1, 1) over (partition by ring)
] AS neighbours
FROM rp;
ERROR: window function calls cannot be nested
LINE 2: lag(part, 1, last_value(part) over (partition by ring)) ...
更新即可。感谢@ Justin建议使用coalesce
来避免嵌套窗口函数。此外,许多人已经指出,第一个/最后一个值需要在环序列上显式order by
,对于此示例恰好是part
。所以稍微随机化输入数据:
create table rp (ring int, part int);
insert into rp(ring, part) select 1, generate_series(1, 5) order by random();
insert into rp(ring, part) select 2, generate_series(1, 7) order by random();
答案 0 :(得分:3)
查询:
<强> SQLFIDDLEExample 强>
SELECT ring, part, array[
coalesce(lag(part, 1, NULL) over (partition by ring),
max(part) over (partition by ring)),
part,
lead(part, 1, 1) over (partition by ring)
] AS neighbours
FROM rp;
结果:
| RING | PART | NEIGHBOURS |
|------|------|------------|
| 1 | 1 | 5,1,2 |
| 1 | 2 | 1,2,3 |
| 1 | 3 | 2,3,4 |
| 1 | 4 | 3,4,5 |
| 1 | 5 | 4,5,1 |
| 2 | 1 | 7,1,2 |
| 2 | 2 | 1,2,3 |
| 2 | 3 | 2,3,4 |
| 2 | 4 | 3,4,5 |
| 2 | 5 | 4,5,6 |
| 2 | 6 | 5,6,7 |
| 2 | 7 | 6,7,1 |
答案 1 :(得分:3)
COALESCE
之类的@Justin provided。使用first_value()
/ last_value()
您需要在窗口定义或订单中添加 ORDER BY
子句未定义。你刚才在这个例子中很幸运,因为在创建虚拟表之后,这些行恰好按顺序排列
添加ORDER BY
后,默认窗口框架将在当前行结束,您需要特殊情况last_value()
调用 - 或者在窗口框架中恢复排序顺序在我的第一个例子中证明了这一点。
多次重复使用窗口定义时,明确的 WINDOW
子句会大大简化语法:
SELECT ring, part, ARRAY[
coalesce(
lag(part) OVER w
,first_value(part) OVER (PARTITION BY ring ORDER BY part DESC))
,part
,coalesce(
lead(part) OVER w
,first_value(part) OVER w)
] AS neighbours
FROM rp
WINDOW w AS (PARTITION BY ring ORDER BY part);
更好,重复使用相同的窗口定义,因此Postgres可以在单次扫描中计算所有值。为此,我们需要定义自定义窗口框架:
SELECT ring, part, ARRAY[
coalesce(
lag(part) OVER w
,last_value(part) OVER w)
,part
,coalesce(
lead(part) OVER w
,first_value(part) OVER w)
] AS neighbours
FROM rp
WINDOW w AS (PARTITION BY ring
ORDER BY part
RANGE BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
ORDER BY 1,2;
您甚至可以调整每个窗口函数调用的帧定义:
SELECT ring, part, ARRAY[
coalesce(
lag(part) OVER w
,last_value(part) OVER (w RANGE BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING))
,part
,coalesce(
lead(part) OVER w
,first_value(part) OVER w)
] AS neighbours
FROM rp
WINDOW w AS (PARTITION BY ring ORDER BY part)
ORDER BY 1,2;
对于有许多零件的戒指,可能会更快。你必须测试。
SQL Fiddle使用改进的测试用例展示了所有三个。考虑查询计划。
有关窗框定义的更多信息: