Find the top 3 users for each location

Time: 2013-04-14 02:54:20

Tags: sql postgresql greatest-n-per-group window-functions top-n

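I'm new to SQL and working through some practice problems. I have a sample Twitter database, and I'm trying to find the top 3 users in each location based on their number of followers.

Here are the tables I am using: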

id_follower_location

        id       | followers | location 
-----------------+-----------+----------
 id28929238      |         1 | Toronto
 id289292338     |         1 | California
 id2892923838    |         2 | Rome
 .
 .

locations

           location       
----------------------
 Bay Area, California
 London
 Nashville, TN
.
.

I have been able to find the "top 1" user per location with the following:

create view top1id as 
  select location, 
    (select id_follower_location.id from id_follower_location 
      where id_follower_location.location = locations.location 
      order by followers desc limit 1
    ) as id 
  from locations;

create view top1 as 
  select location, id, 
    (select followers from id_follower_location 
      where id_follower_location.id = top1id.id
    ) as followers 
  from top1id;

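The only approach I have been able to come up with is to find the "top 1st", "top 2nd" and "top 3rd" separately and then combine them with union. Is this the right / only way to do it, or is there a better way?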

2 answers:

Answer 0 (score: 4)

Top n

With rank() you get at least 3 rows per location (fewer if fewer exist). If there are ties among the top 3 ranks, more rows may be returned.
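To make the tie behavior concrete, here is a minimal sketch that runs against an inline VALUES list rather than the real table. The ids and follower counts are made up for illustration and are not taken from the question's data.

```sql
-- Hypothetical sample data: five users in one location, two of them tied.
WITH sample (id, followers, location) AS (
   VALUES ('id_a', 50, 'Rome')
        , ('id_b', 40, 'Rome')
        , ('id_c', 30, 'Rome')
        , ('id_d', 30, 'Rome')   -- same follower count as id_c
        , ('id_e', 10, 'Rome')
   )
SELECT id, location, followers, rnk
FROM  (
   SELECT id, location, followers
        , rank() OVER (PARTITION BY location ORDER BY followers DESC) AS rnk
   FROM   sample
   ) r
WHERE  rnk <= 3
ORDER  BY location, rnk, id;
```

With this sample, four rows come back for Rome, because id_c and id_d share rank 3. Note that rank() leaves gaps after ties, so id_e would get rank 5, not 4.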

If you want exactly 3 rows per location (fewer if fewer exist), you have to break ties. One way is to use row_number() instead of rank():

SELECT *
FROM (
   SELECT id, location
         ,row_number() OVER (PARTITION BY location ORDER BY followers DESC) AS rn
   FROM   id_follower_location
   ) r
WHERE  rn <= 3
ORDER  BY location, rn;

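You probably want an ORDER BY in the outer query, as in the query above, to guarantee sorted output. If there are more than three valid candidates, you get an arbitrary pick from the ties, unless you add more ORDER BY items in the OVER clause to break ties.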
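As a sketch of such a tie-breaker, the variant below adds id as a second sort key inside the OVER clause. It assumes that id is unique in id_follower_location, which the question does not state explicitly; any other column (or combination of columns) that makes the ordering deterministic would do.

```sql
SELECT id, location, followers
FROM  (
   SELECT id, location, followers
        , row_number() OVER (PARTITION BY location
                             ORDER BY followers DESC, id) AS rn  -- id breaks ties (assumed unique)
   FROM   id_follower_location
   ) r
WHERE  rn <= 3
ORDER  BY location, rn;
```

With the extra sort key, repeated runs return the same three users per location even when follower counts are equal.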

Top 1

As for your query to get the top 1 row: there is a much simpler and faster way in PostgreSQL:

SELECT DISTINCT ON (location)
       id, location           -- add additional columns freely
FROM   id_follower_location
ORDER  BY location, followers DESC;

More details on this query technique can be found in a closely related answer.
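As an illustration of adding columns to the DISTINCT ON query, the sketch below also returns followers, so a single statement covers what the question's two views (top1id and top1) produce. The trailing id in ORDER BY is an assumed tie-breaker and not part of the original answer.

```sql
SELECT DISTINCT ON (location)
       location, id, followers
FROM   id_follower_location
ORDER  BY location, followers DESC, id;   -- id as tie-breaker (assumption)
```

Two caveats apply. Unlike the views in the question, this only returns locations that actually occur in id_follower_location. And if followers can be NULL, consider `followers DESC NULLS LAST`, because PostgreSQL sorts NULL values first in descending order by default.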

Answer 1 (score: 2)

You can do this with window functions: http://www.postgresql.org/docs/9.1/static/tutorial-window.html

For example (untested, may need minor syntax fixes):

SELECT follower_ranks.id, follower_ranks.location 
FROM (
    SELECT id, location, 
      RANK() OVER (PARTITION BY location ORDER BY followers DESC) AS rank
    FROM id_follower_location
) follower_ranks 
WHERE follower_ranks.rank <= 3;