Postgres: get the top n values within each group

Date: 2013-09-05 09:28:02

Tags: sql postgresql group-by sql-order-by greatest-n-per-group

I have a simple table like this:

user    letter
--------------
1       A
1       A
1       B
1       B
1       B
1       C

2       A
2       B
2       B
2       C
2       C
2       C

For each user, I would like to get the two most frequent letters, like this:

user    letter  rank(within user group)
--------------------
1       B       1
1       A       2

2       C       1
2       B       2

Or even better, collapsed into columns:

user    1st-most-occurrence  2nd-most-occurrence
1       B                   A
2       C                   B

How can I accomplish this in Postgres?

3 Answers:

Answer 0: (score: 2)

Something like this:

select *
from (
    select userid, 
           letter, 
           dense_rank() over (partition by userid order by count(*) desc) as rnk
    from letters
    group by userid, letter
) t
where rnk <= 2
order by userid, rnk;

Note that I replaced user with userid, because using reserved words as column names is a bad habit.

Here is an SQLFiddle: http://sqlfiddle.com/#!12/ec3ec/1
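One thing worth checking with the query above is how ties are handled: dense_rank() gives letters with the same count the same rank, so a user may come back with more than two rows. The following is only a sketch to illustrate that. The temp table letters_demo and its rows are made up for this example and are not part of the question's data.

```sql
-- Hypothetical demo data: for user 1, A and B are tied at two occurrences each.
create temp table letters_demo (userid int, letter text);
insert into letters_demo values
    (1, 'A'), (1, 'A'), (1, 'B'), (1, 'B'), (1, 'C');

-- Compare both ranking functions side by side.
select userid,
       letter,
       count(*) as cnt,
       -- tied counts share a rank
       dense_rank() over (partition by userid order by count(*) desc) as dense_rnk,
       -- ties are broken by letter, so every row gets a distinct number
       row_number() over (partition by userid order by count(*) desc, letter) as row_num
from letters_demo
group by userid, letter
order by userid, row_num;
```

Here A and B both get dense_rnk 1 and C gets dense_rnk 2, so filtering on dense_rnk <= 2 keeps all three letters, while filtering on row_num <= 2 keeps exactly two. Which behaviour is wanted depends on whether tied letters should all be reported.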

Answer 1: (score: 1)

with cte as (
    select 
        t.user_id, t.letter,
        row_number() over(partition by t.user_id order by count(*) desc) as row_num
    from Table1 as t
    group by t.user_id, t.letter
)
select
    c.user_id,
    max(case when c.row_num = 1 then c.letter end) as "1st-most-occurrence",
    max(case when c.row_num = 2 then c.letter end) as "2nd-most-occurrence"
from cte as c
where c.row_num <= 2
group by c.user_id

=> sql fiddle demo
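As a side note, on PostgreSQL 9.4 or later the same pivot can be written with the aggregate FILTER clause instead of CASE. This is a sketch of that variant, not the answer's original query. The temp table below reuses the answer's Table1 name with the question's sample rows, and the extra tie-breaker on letter is an assumption added to make the result deterministic.

```sql
-- Sample data from the question, loaded into a temp table named like the answer's Table1.
create temp table Table1 (user_id int, letter text);
insert into Table1 values
    (1, 'A'), (1, 'A'), (1, 'B'), (1, 'B'), (1, 'B'), (1, 'C'),
    (2, 'A'), (2, 'B'), (2, 'B'), (2, 'C'), (2, 'C'), (2, 'C');

with cte as (
    select t.user_id,
           t.letter,
           -- number the letters per user, most frequent first
           row_number() over (partition by t.user_id
                              order by count(*) desc, t.letter) as row_num
    from Table1 as t
    group by t.user_id, t.letter
)
select c.user_id,
       -- FILTER restricts each aggregate to the row with the wanted position
       max(c.letter) filter (where c.row_num = 1) as "1st-most-occurrence",
       max(c.letter) filter (where c.row_num = 2) as "2nd-most-occurrence"
from cte as c
where c.row_num <= 2
group by c.user_id
order by c.user_id;
```

For the sample data this yields B and A for user 1, and C and B for user 2, matching the layout asked for in the question.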

Answer 2: (score: 0)

Required function (the arguments are the array, the offset, and the limit, in that order):

CREATE OR REPLACE FUNCTION sortCountLimitOffset(anyarray, int, int)
  RETURNS anyarray AS 'select array_agg(x) from (select x from (select unnest($1) as x) as t group by x order by count(*) desc offset $2 limit $3) t;'
  LANGUAGE sql VOLATILE
  COST 100;

Solution 1: (returns all letters concatenated into one string)

select
    usr,
    array_to_string(sortCountLimitOffset(array_agg(letter), 0, 5), ',')
from ttt
group by usr;

Output:

 usr | array_to_string
-----+-----------------
   1 | B,A,C
   2 | C,B,A
(2 Zeilen)

Solution 2: (returns each nth letter in a separate column)

select
    usr,
    array_to_string(sortCountLimitOffset(array_agg(letter), 0, 1), ',') letter1,
    array_to_string(sortCountLimitOffset(array_agg(letter), 1, 1), ',') letter2,
    array_to_string(sortCountLimitOffset(array_agg(letter), 2, 1), ',') letter3,
    array_to_string(sortCountLimitOffset(array_agg(letter), 3, 1), ',') letter4,
    array_to_string(sortCountLimitOffset(array_agg(letter), 4, 1), ',') letter5
from ttt
group by usr;

Output:

 usr | letter1 | letter2 | letter3 | letter4 | letter5
-----+---------+---------+---------+---------+---------
   1 | B       | A       | C       |         |
   2 | C       | B       | A       |         |
(2 Zeilen)

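The SELECT inside the function could also be inlined into the calling query. The way it is now, though, the code is easier to reuse and maintain.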
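For comparison, an inlined version of Solution 1 might look roughly like the sketch below. It is not taken from the answer: it assumes the same ttt(usr, letter) table, loaded here with the question's sample rows, and uses an ARRAY(subquery) constructor in place of the helper function. The tie-breaker on letter is also an assumption.

```sql
-- Sample data from the question in the answer's table layout.
create temp table ttt (usr int, letter text);
insert into ttt values
    (1, 'A'), (1, 'A'), (1, 'B'), (1, 'B'), (1, 'B'), (1, 'C'),
    (2, 'A'), (2, 'B'), (2, 'B'), (2, 'C'), (2, 'C'), (2, 'C');

select u.usr,
       array_to_string(
           -- correlated subquery: this user's letters, most frequent first
           array(
               select t.letter
               from ttt as t
               where t.usr = u.usr
               group by t.letter
               order by count(*) desc, t.letter
               limit 5
           ),
           ','
       ) as letters
from (select distinct usr from ttt) as u
order by u.usr;
```

With the sample data this returns B,A,C for user 1 and C,B,A for user 2, the same as Solution 1. The trade-off is the one the answer mentions: the ranking logic now has to be repeated in every query that needs it.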