How to use a SQL window function to calculate a percentage of an aggregate

Asked: 2011-12-15 04:33:50

Tags: sql postgresql aggregate-functions window-functions greenplum

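I need to calculate percentages along various dimensions of a table. I would like to simplify things by using a window function to compute the denominator, but I am running into a problem because the numerator has to be an aggregate as well.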

As a simple example, take the following table:

create temp table test (d1 text, d2 text, v numeric);
insert into test values ('a','x',5), ('a','y',5), ('a','y',10), ('b','x',20);

If I only want the share of each individual row within its d1 group, a window function works fine:

select d1, d2, v/sum(v) over (partition by d1)
from test;

"b";"x";1.00
"a";"x";0.25
"a";"y";0.25
"a";"y";0.50

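However, what I actually need is the share of each d2 total within its d1 group, that is, sum(v) per (d1, d2) divided by sum(v) per d1. The output I am looking for is: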

"b";"x";1.00
"a";"x";0.25
"a";"y";0.75

So I tried this:

select d1, d2, sum(v)/sum(v) over (partition by d1)
from test
group by d1, d2;

But now I get an error:

ERROR:  column "test.v" must appear in the GROUP BY clause or be used in an aggregate function

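I assume it is complaining that the window function is not accounted for in the GROUP BY clause, but window functions cannot be placed in the GROUP BY clause anyway.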

This is on Greenplum 4.1, which is a fork of PostgreSQL 8.4 and shares the same window functions. Note that Greenplum cannot execute correlated subqueries.

2 answers:

Answer 0: (score: 24)

I think what you are actually looking for is this:

SELECT d1, d2, sum(v)/sum(sum(v)) OVER (PARTITION BY d1) AS share
FROM   test
GROUP  BY d1, d2;

This produces the requested result.

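Window functions are applied after aggregate functions. In sum(sum(v)), the outer sum() is the window function in this example, attached to the OVER ... clause, while the inner sum() is the aggregate. The original query failed because its window function was given the bare column v, which is neither grouped nor aggregated by the time window functions are evaluated.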
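To make the two evaluation stages visible, the aggregate and the window total can be selected as separate columns. This is only an illustrative sketch against the test table from the question; the column aliases group_sum and d1_total are made up for this example.

```sql
-- Sketch: expose each stage of sum(v) / sum(sum(v)) OVER (...) as its own column.
-- Assumes the temp table "test" from the question; aliases are illustrative only.
SELECT d1,
       d2,
       sum(v)                                      AS group_sum, -- aggregate per (d1, d2)
       sum(sum(v)) OVER (PARTITION BY d1)          AS d1_total,  -- window sum over the per-group aggregates
       sum(v) / sum(sum(v)) OVER (PARTITION BY d1) AS share
FROM   test
GROUP  BY d1, d2
ORDER  BY d1, d2;
```

With the sample data, the group sums are 5 and 15 for d1 = 'a' (total 20) and 20 for d1 = 'b' (total 20), which gives the shares 0.25, 0.75 and 1.00.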

This is effectively the same as:

WITH x AS (
    SELECT d1, d2, sum(v) AS sv
    FROM   test
    GROUP  BY d1, d2
    )
SELECT d1, d2, sv/sum(sv) OVER (PARTITION BY d1) AS share
FROM   x;

Or (without a CTE):

SELECT d1, d2, sv/sum(sv) OVER (PARTITION BY d1) AS share
FROM   (
    SELECT d1, d2, sum(v) AS sv
    FROM   test
    GROUP  BY d1, d2
    ) x;

Or @Mu's variant.

Aside: Greenplum introduced correlated subqueries in version 4.2. See release notes.

Answer 1: (score: 2)

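Do you need to do everything with window functions? It sounds like you just need to group the result you already have by d1 and d2 and then add up the row-level shares: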

select d1, d2, sum(p)
from (
    select d1, d2, v/sum(v) over (partition by d1) as p
    from test
) as dt
group by d1, d2

That gives me this:

 d1 | d2 |          sum           
----+----+------------------------
 a  | x  | 0.25000000000000000000
 a  | y  | 0.75000000000000000000
 b  | x  | 1.00000000000000000000
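If the shares are wanted as percentages rather than fractions, the same query can be scaled and rounded. The following is a minimal sketch, assuming the test table from the question; the alias pct and the choice of two decimal places are arbitrary assumptions, not part of the original answer.

```sql
-- Sketch: same grouping as above, expressed as a percentage.
-- "pct" and the rounding to 2 decimals are illustrative choices.
SELECT d1, d2, round(100 * sum(p), 2) AS pct
FROM (
    SELECT d1, d2, v / sum(v) OVER (PARTITION BY d1) AS p  -- per-row share within d1
    FROM   test
) AS dt
GROUP  BY d1, d2
ORDER  BY d1, d2;
```

For the sample data this yields 25.00 and 75.00 for d1 = 'a' and 100.00 for d1 = 'b'. Summing the row-level shares works because every row in a d1 partition is divided by the same total.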