Aggregate function with LIMIT and without a full table scan

Asked: 2015-11-24 14:23:58

Tags: sql postgresql

I have a photo table:

create table photo(
    id integer,
    ...
    id_user integer,
    created_at date
);

I want to get the same result as:

select 
    json_agg(photo), 
    created_at,
    id_user
from photo
group by created_at, id_user
order by created_at desc, id_user 
limit 5;

but without a full table scan on photo.

Is that possible? I was thinking of a recursive CTE, but I could not manage to build one.

3 answers:

Answer 0: (score: 3)

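Assuming you have an index on photo(id_user, created_at), you can use a subquery to select the five (created_at, id_user) pairs you need, then use a join or a correlated subquery to fetch the rest of the information: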

select cu.created_at, cu.id_user,
       (select json_agg(p)  -- aggregate whole rows; the alias p hides the table name photo
        from photo p
        where cu.created_at = p.created_at and cu.id_user = p.id_user
       )
from (select distinct created_at, id_user
      from photo p
      order by created_at desc, id_user
      limit 5
     ) cu
order by cu.created_at desc, cu.id_user ;
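
The inner query can only stop early if an index returns rows in the requested order. A minimal sketch of such an index follows; the index name is hypothetical, and the column order is an assumption chosen to mirror the ORDER BY of the subquery rather than something stated in the answer:

```sql
-- Hypothetical index name. The columns follow the ORDER BY of the inner
-- query (created_at DESC, id_user), so the planner can read the first few
-- distinct pairs in index order instead of sorting the whole table.
CREATE INDEX photo_created_at_id_user_idx
    ON photo (created_at DESC, id_user);
```

The same index can also serve the equality lookups on created_at and id_user in the correlated subquery. Whether the planner actually uses it depends on table statistics and the PostgreSQL version.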

Answer 1: (score: 1)

Without recursion, you can try a single CTE and check whether it returns the top 5 without a full scan:

WITH cte as (
  SELECT DISTINCT created_at, id_user
  FROM photo
  ORDER BY created_at DESC, id_user
  LIMIT 5
)
SELECT p.created_at, p.id_user, json_agg(p)
FROM photo p
JOIN cte c
  ON p.created_at = c.created_at 
 AND p.id_user = c.id_user
GROUP BY p.created_at, p.id_user
ORDER BY p.created_at DESC, p.id_user;
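
One way to check whether the full scan is really avoided is to look at the execution plan. The sketch below wraps the same query in EXPLAIN; note that the ANALYZE option actually executes the query, so drop it if you only want the estimated plan:

```sql
-- json_agg(p) aggregates whole photo rows.
-- In the output, a "Seq Scan on photo" node means the table is still read
-- in full; an "Index Scan" or "Index Only Scan" means an index is used.
EXPLAIN (ANALYZE, BUFFERS)
WITH cte AS (
  SELECT DISTINCT created_at, id_user
  FROM photo
  ORDER BY created_at DESC, id_user
  LIMIT 5
)
SELECT p.created_at, p.id_user, json_agg(p)
FROM photo p
JOIN cte c
  ON p.created_at = c.created_at
 AND p.id_user = c.id_user
GROUP BY p.created_at, p.id_user
ORDER BY p.created_at DESC, p.id_user;
```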

Answer 2: (score: 0)

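If there is an index on created_at, and you can assume that there are at least 5 photos in the last 24 hours (or 48, or whatever), then you can avoid the full scan: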

select 
    json_agg(photo), 
    created_at,
    id_user
from photo
where created_at > (select max(created_at) from photo) - interval '24 hours'
group by created_at, id_user
order by created_at desc, id_user 
limit 5;

The shorter the interval, the shorter the scan.
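
For the result to match the original query, the window has to contain at least five distinct (created_at, id_user) groups, not just five photos, since several photos can fall into one group. A rough sketch of the supporting index and a sanity check follows; the index name and the groups_in_window alias are hypothetical:

```sql
-- Hypothetical index name; supports both max(created_at) and the range filter.
CREATE INDEX photo_created_at_idx ON photo (created_at);

-- Count the groups inside the window. If this returns fewer than 5,
-- the interval is too short for the windowed query to match the original.
SELECT count(*) AS groups_in_window
FROM (
    SELECT DISTINCT created_at, id_user
    FROM photo
    WHERE created_at > (SELECT max(created_at) FROM photo) - interval '24 hours'
) g;
```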