PostgreSQL: Select random records that satisfy limit conditions

Time: 2016-03-27 15:23:34

Tags: sql postgresql common-table-expression

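Suppose I have an Item table of item records, where each item can belong to one or more categories. Each category contains one or more items.

How can I select a random list of unique items that satisfies conditions such as 5 items from category A, 3 items from category B, 4 items from category C, and so on, while preserving the category order, i.e. A -> B -> C?

The sort_order for the query and the item_count for each category are stored in another table.

The items table is fairly large (~1 million rows), and there may be fairly large gaps between the items that satisfy the conditions.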

2 answers:

Answer 0: (score: 1)

You can try something like this:

SELECT item_id FROM (
    (SELECT t.category,t.item_id from items t where t.category ='A' order by random() limit 5)
    UNION
    (SELECT t.category,t.item_id from items t where t.category ='B' order by random() limit 3)
    UNION
    (SELECT t.category,t.item_id from items t where t.category ='C' order by random() limit 4)
) AS picked  -- a subquery in FROM needs an alias
ORDER BY category;

I can't promise it will be fast, but it should work.
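
Because the question says the per-category counts and the sort order live in another table, the hard-coded branches above could be driven from that table instead. The following is only a sketch: the table name category_limits is hypothetical, and its columns (category, sort_order, item_count) are assumed from the question's description. It relies on LATERAL, which is available in PostgreSQL 9.3 and later.

```sql
-- Assumed (hypothetical) limits table:
--   category_limits(category, sort_order, item_count)
-- Assumed items table, as in the answer above:
--   items(item_id, category)
SELECT s.item_id
FROM category_limits l
CROSS JOIN LATERAL (
    -- one random sample per category, sized by that category's item_count
    SELECT i.item_id
    FROM items i
    WHERE i.category = l.category
    ORDER BY random()
    LIMIT l.item_count
) s
ORDER BY l.sort_order;
```

Like the UNION version, this does not prevent an item that belongs to two categories from being picked once for each, and ORDER BY random() still has to sort every candidate row in each category.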

Answer 1: (score: 1)

I would be inclined to do this with a join, like this:

select i.*
from (select i.*, row_number() over (partition by category order by random()) as seqnum
      from items i
     ) i join
     (select 'A' as category, 5 as num union all
      select 'B' as category, 3 as num union all
      select 'C' as category, 4 as num 
     ) l
     on i.category = l.category
where i.seqnum <= l.num;

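However, this does not address the problem of unique items, so the same item could appear in the list more than once. Assuming there are enough items for the request, I would first choose a random category for each item and then follow the same logic: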

select i.*
from (select i.itemid, min(category) as category,
             row_number() over (partition by min(category)
                                order by random()
                               ) as seqnum
      from items i
      group by i.itemid
     ) i join
     (select 'A' as category, 5 as num union all
      select 'B' as category, 3 as num union all
      select 'C' as category, 4 as num 
     ) l
     on i.category = l.category
where i.seqnum <= l.num;

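The use of min() is a hack to get exactly one category per item. Note that it picks the smallest category value for each item rather than a truly random one.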
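
If a genuinely random category per item is wanted, and the limits should come from a table rather than an inline list, one possible variant looks like the sketch below. The table name category_limits and its columns (category, sort_order, item_count) are assumptions based on the question, not something either answer defines; itemid follows the column name used in the query above.

```sql
-- Assumed (hypothetical) limits table:
--   category_limits(category, sort_order, item_count)
WITH one_category AS (
    -- keep exactly one randomly chosen category row per item
    SELECT DISTINCT ON (i.itemid) i.itemid, i.category
    FROM items i
    JOIN category_limits l ON l.category = i.category
    ORDER BY i.itemid, random()
),
ranked AS (
    -- number the items of each category in random order
    SELECT o.itemid, o.category,
           row_number() OVER (PARTITION BY o.category ORDER BY random()) AS seqnum
    FROM one_category o
)
SELECT r.itemid, r.category
FROM ranked r
JOIN category_limits l ON l.category = r.category
WHERE r.seqnum <= l.item_count
ORDER BY l.sort_order, r.seqnum;
```

As with the min() version, each item ends up in only one category, so a category can return fewer rows than requested if too many of its items were assigned elsewhere.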