One-to-many query while also limiting based on a different primary key

Date: 2014-05-20 22:01:02

Tags: java sql postgresql left-join sql-limit

I have a table like this:

create table images (
    image_id serial primary key,
    user_id int references users(user_id),
    date_created timestamp with time zone
);

Then I have a tags table for the tags an image can have:

create table images_tags (
    images_tag_id serial primary key,
    image_id int references images(image_id),
    tag_id int references tags(tag_id)
);

To get the results I want, I run a query like this:

select image_id,user_id,tag_id from images left join images_tags using(image_id)
where (?=-1 or user_id=?)
and (?=-1 or tag_id in (?, ?, ?, ?)) --have up to 4 tag_ids to search for
order by date_created desc limit 100;

The problem is that I want the limit to be based on the number of unique image_ids, because my output will look like this:

{"images":[
    {"image_id":1, "tag_ids":[1, 2, 3]},
    ....
]}

Note how I group the tag_ids into an array for output, even though the SQL returns one row per tag_id/image_id combination.
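
The grouping step described above might look roughly like the following on the Java side. This is only a sketch: the class name `TagGrouper`, the method `groupTags`, and the representation of each row as an `Integer[]` of `{image_id, tag_id}` are assumptions made for illustration, not code from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TagGrouper {
    // Each row is {image_id, tag_id}; tag_id is null when the LEFT JOIN found no tag.
    // (Hypothetical row shape, chosen to keep the sketch independent of JDBC.)
    static Map<Integer, List<Integer>> groupTags(List<Integer[]> rows) {
        // LinkedHashMap keeps image_ids in the order they first appear,
        // so the ordering produced by the query is preserved.
        Map<Integer, List<Integer>> byImage = new LinkedHashMap<>();
        for (Integer[] row : rows) {
            List<Integer> tags = byImage.computeIfAbsent(row[0], k -> new ArrayList<>());
            if (row[1] != null) {
                tags.add(row[1]);
            }
        }
        return byImage;
    }

    public static void main(String[] args) {
        List<Integer[]> rows = new ArrayList<>();
        rows.add(new Integer[] {1, 1});
        rows.add(new Integer[] {1, 2});
        rows.add(new Integer[] {1, 3});
        rows.add(new Integer[] {2, null}); // image without tags
        System.out.println(groupTags(rows)); // prints {1=[1, 2, 3], 2=[]}
    }
}
```

Four joined rows collapse into two images here, which is why a plain `limit 100` on the joined rows can yield fewer than 100 images.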

So when I say limit 100, I want it to apply to 100 unique image_ids.

2 Answers:

Answer 0: (score: 2)

Perhaps you should put one image on each row? If that works for you, you can do this:

select image_id, user_id, string_agg(cast(tag_id as varchar(2000)), ',') as tags
from images left join
     images_tags
     using (image_id)
where (?=-1 or user_id=?) and
      (?=-1 or tag_id in (?, ?, ?, ?)) --have up to 4 tag_ids to search for
group by image_id, user_id
order by date_created desc
limit 100;
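
With this query the tags arrive as a single comma-separated string per image, and string_agg yields NULL when an image has no non-null tag_id. A minimal sketch of turning that column back into a list on the Java side is shown here; `TagListParser` and `parseTags` are hypothetical names, and the code assumes the string contains only integers separated by commas.

```java
import java.util.ArrayList;
import java.util.List;

class TagListParser {
    // Converts the "tags" column (e.g. "1,2,3") into a list of tag ids.
    // A null or empty string maps to an empty list (image without tags).
    static List<Integer> parseTags(String tags) {
        List<Integer> result = new ArrayList<>();
        if (tags == null || tags.isEmpty()) {
            return result;
        }
        for (String part : tags.split(",")) {
            result.add(Integer.parseInt(part.trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseTags("1,2,3")); // prints [1, 2, 3]
        System.out.println(parseTags(null));    // prints []
    }
}
```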

If that doesn't work, use a CTE:

with cte as (
      select image_id, user_id, tag_id,
             dense_rank() over (order by date_created desc) as seqnum
      from images left join
           images_tags
           using (image_id)
      where (?=-1 or user_id=?) and
            (?=-1 or tag_id in (?, ?, ?, ?)) --have up to 4 tag_ids to search for
    )
select *
from cte
where seqnum <= 100
order by seqnum;
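
To see why dense_rank() fits here, the sketch below reproduces its numbering in plain Java: all joined rows of one image share the same date_created and therefore the same rank, so `seqnum <= 100` keeps whole images rather than individual rows. The class `DenseRankSketch` and the use of `long` values as stand-ins for timestamps are assumptions for illustration only. One caveat the sketch also makes visible: two different images with an identical date_created share a rank as well, so the query can return more than 100 images; ordering by `date_created desc, image_id` would be one way to break such ties.

```java
import java.util.ArrayList;
import java.util.List;

class DenseRankSketch {
    // Input: one non-null date_created value per joined row, already sorted descending.
    // Output: the dense rank of each row (equal values share a rank, no gaps).
    static List<Integer> denseRank(List<Long> sortedDesc) {
        List<Integer> ranks = new ArrayList<>();
        int rank = 0;
        Long previous = null;
        for (Long value : sortedDesc) {
            // The rank only advances when the ordering value changes.
            if (previous == null || !value.equals(previous)) {
                rank++;
            }
            ranks.add(rank);
            previous = value;
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Two rows for the newest image, one for the next, three for the oldest.
        System.out.println(denseRank(List.of(30L, 30L, 20L, 10L, 10L, 10L)));
        // prints [1, 1, 2, 3, 3, 3]
    }
}
```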

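Answer 1: (score: 1)

First select the 100 qualifying images, then join to images_tags. Use an EXISTS semi-join to satisfy the condition on images_tags, and take care to get the parentheses right.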

SELECT i.*, t.tag_id
FROM  (
   SELECT i.image_id, i.user_id
   FROM   images i
   WHERE (? = -1 OR i.user_id = ?)
   AND   (? = -1 OR EXISTS (
      SELECT 1
      FROM   images_tags t
      WHERE  t.image_id = i.image_id
      AND    t.tag_id IN (?, ?, ?, ?)
      ))
   ORDER  BY i.date_created DESC
   LIMIT  100
   ) i
LEFT   JOIN images_tags t
            ON t.image_id = i.image_id
           AND (? = -1 OR t.tag_id in (?, ?, ?, ?)) -- repeat condition
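
Because the tag condition is repeated, this query has 12 `?` placeholders that must be supplied in order. A sketch of a helper that assembles them is given below. Everything in it is an assumption rather than part of the answer: the names `QueryParams` and `build`, the convention that the flag placeholder receives -1 to disable a filter and 0 to enable it, and the padding of unused tag slots with -1 (which relies on no real tag_id being -1).

```java
import java.util.ArrayList;
import java.util.List;

class QueryParams {
    static final int NO_FILTER = -1; // value that makes "? = -1" true
    static final int TAG_SLOTS = 4;  // the query has four tag placeholders per block

    // Hypothetical helper: returns the 12 placeholder values in query order.
    // Pass userId = -1 to skip the user filter and an empty list to skip the tag filter.
    static List<Integer> build(int userId, List<Integer> tagIds) {
        if (tagIds.size() > TAG_SLOTS) {
            throw new IllegalArgumentException("at most " + TAG_SLOTS + " tag ids");
        }
        List<Integer> params = new ArrayList<>();
        params.add(userId);           // ? = -1
        params.add(userId);           // i.user_id = ?
        addTagBlock(params, tagIds);  // condition inside EXISTS
        addTagBlock(params, tagIds);  // repeated condition in the LEFT JOIN
        return params;
    }

    private static void addTagBlock(List<Integer> params, List<Integer> tagIds) {
        // Flag: any value other than -1 switches the tag filter on (assumed convention).
        params.add(tagIds.isEmpty() ? NO_FILTER : 0);
        for (int i = 0; i < TAG_SLOTS; i++) {
            // Unused slots are padded with -1, assumed never to be a real tag_id.
            params.add(i < tagIds.size() ? tagIds.get(i) : NO_FILTER);
        }
    }

    public static void main(String[] args) {
        System.out.println(build(7, List.of(3, 5)));
        // prints [7, 7, 0, 3, 5, -1, -1, 0, 3, 5, -1, -1]
    }
}
```

The resulting values could then be bound one by one with `PreparedStatement.setInt(index + 1, value)`, since JDBC parameter indexes start at 1.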

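This should be faster than the solution with a window function and a CTE. Test performance with EXPLAIN ANALYZE. As always, run it a few times to warm up the cache.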