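I'm new to SQL and working through some practice problems. I have a sample Twitter database, and I'm trying to find the top 3 users in each location by number of followers.
Here are the tables I'm working with: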
id_follower_location
id | followers | location
-----------------+-----------+----------
id28929238 | 1 | Toronto
id289292338 | 1 | California
id2892923838 | 2 | Rome
.
.
locations
location
----------------------
Bay Area, California
London
Nashville, TN
.
.
I've been able to find the "top" user for each location like this:
-- top1id: for each location, the id of the user with the most followers
create view top1id as
select location,
  (select id_follower_location.id from id_follower_location
   where id_follower_location.location = locations.location
   order by followers desc limit 1
  ) as id
from locations;
-- top1: look up the follower count for each of those top users
create view top1 as
select location, id,
  (select followers from id_follower_location
   where id_follower_location.id = top1id.id
  ) as followers
from top1id;
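The only approach I could come up with is to find the "top 1st", "top 2nd", and "top 3rd" separately and then combine them with union. Is this the right/only way to do it, or is there a better way?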
Answer 0 (score: 4)
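With rank() you get at least 3 rows per location (fewer if fewer exist). If there are ties among the top 3 ranks, more rows may be returned.
If you want exactly 3 rows per location (fewer if fewer exist), you have to break ties. One way is to use row_number() instead of rank().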
SELECT *
FROM (
SELECT id, location
,row_number() OVER (PARTITION BY location ORDER BY followers DESC) AS rn
FROM id_follower_location
) r
WHERE rn <= 3
ORDER BY location, rn;
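The ORDER BY in the outer query is there to guarantee sorted output.
If there are more than three valid candidates, you get an arbitrary pick among the ties, unless you add more ORDER BY items in the OVER clause to break them.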
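To make the tie-breaking idea concrete, here is a minimal sketch. It assumes that id is unique and that ordering tied users by id is acceptable; the question does not say how ties should be resolved, so treat that choice as a placeholder for whatever rule you actually want.

```sql
-- Sketch only: "id" as the tie-breaker is an assumption, not part of the question.
SELECT id, location, followers, rn
FROM  (
   SELECT id, location, followers
        , row_number() OVER (PARTITION BY location
                             ORDER BY followers DESC, id) AS rn  -- id decides between equal follower counts
   FROM   id_follower_location
   ) r
WHERE  rn <= 3
ORDER  BY location, rn;
```

With a unique last sort key the numbering is deterministic, so repeated runs return the same three users per location.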
As for your query to get the top 1 row: PostgreSQL has a much simpler and faster way:
SELECT DISTINCT ON (location)
id, location -- add additional columns freely
FROM id_follower_location
ORDER BY location, followers DESC;
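As a rough sketch of how DISTINCT ON could stand in for the two views in the question: the view name top1_distinct is hypothetical, and the extra id sort key is an assumed tie-breaker.

```sql
-- Sketch only: view name and the "id" tie-breaker are assumptions.
CREATE VIEW top1_distinct AS
SELECT DISTINCT ON (location)
       location, id, followers
FROM   id_follower_location
ORDER  BY location, followers DESC, id;  -- first row per location is kept
```

One difference to be aware of: the views in the question start from the locations table, so a location with no users still shows up with a NULL id. This version reads only id_follower_location, so such locations are simply absent.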
Details on this query technique are in this closely related answer:
Answer 1 (score: 2)
You can do this with window functions: http://www.postgresql.org/docs/9.1/static/tutorial-window.html
For example (untested, may need minor syntax fixes):
SELECT follower_ranks.id, follower_ranks.location
FROM (
  SELECT id, location,
         -- explicit alias so the outer query does not rely on the default column name
         RANK() OVER (PARTITION BY location ORDER BY followers DESC) AS follower_rank
  FROM id_follower_location
) follower_ranks
WHERE follower_ranks.follower_rank <= 3;
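To see how RANK() behaves when follower counts are tied, here is a small self-contained sketch that runs on inline data instead of the real table. The sample rows are made up for illustration, and the id tie-breaker in the row_number() window is an assumption.

```sql
-- Made-up sample data: three users tied at 3 followers in one location.
WITH sample(id, followers, location) AS (
   VALUES ('a', 5, 'Rome')
        , ('b', 3, 'Rome')
        , ('c', 3, 'Rome')
        , ('d', 3, 'Rome')
        , ('e', 1, 'Rome')
)
SELECT id, followers
     , rank()       OVER (PARTITION BY location ORDER BY followers DESC)     AS rnk
     , row_number() OVER (PARTITION BY location ORDER BY followers DESC, id) AS rn
FROM   sample
ORDER  BY followers DESC, id;
-- rnk comes out as 1, 2, 2, 2, 5 and rn as 1, 2, 3, 4, 5,
-- so "rnk <= 3" keeps four rows (a, b, c, d) while "rn <= 3" keeps three (a, b, c).
```

Which behavior is right depends on whether tied users should all be listed or the list should be capped at three per location.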