Find the top N most frequent categories and the top N most frequent subcategories within each category

Date: 2018-12-10 10:09:14

Tags: sql postgresql

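I am trying to write a single query that retrieves:

The top 3 most popular brands in, for example, a list of cars. For each of those top three brands, I then want to retrieve the five most popular models.

I have tried a rank/partition strategy and a DISTINCT ON strategy, but I can't figure out how to get it working with either query.

Here is some sample data: http://sqlfiddle.com/#!15/1e81d5/1

Given the sample data, I would expect output like this from the ranking query (row order does not matter):

brand       car_model   count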
'Audi'      'A4'        3
'Audi'      'A1'        3
'Audi'      'Q7'        2
'Audi'      'Q5'        2
'Audi'      'A3'        2
'VW'        'Passat'    3
'VW'        'Beetle'    3
'VW'        'Caravelle' 2
'VW'        'Golf'      2
'VW'        'Fox'       2
'Volvo'     'V70'       3
'Volvo'     'V40'       3
'Volvo'     'S60'       2
'Volvo'     'XC70'      2
'Volvo'     'V50'       2

3 answers:

Answer 0 (score: 1)

It turns out I can use a LATERAL join, as suggested in the comments. Thanks.

SELECT brand, car_model, the_count
FROM
  (
    SELECT brand FROM cars GROUP BY brand ORDER BY COUNT(*) DESC LIMIT 3 
  ) o1
INNER JOIN LATERAL
  (
    SELECT car_model, count(*) as the_count
    FROM cars
    WHERE brand = o1.brand
    GROUP BY brand, car_model
    ORDER BY count(*) DESC LIMIT 5
  ) o2 ON true;

http://sqlfiddle.com/#!15/1e81d5/9
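
To make the shape of the LATERAL join easier to see, here is a minimal Python sketch of the same two-step logic: pick the top brands first, then, for each of those brands, pick its top models. The function name and the sample rows are hypothetical and are not taken from the fiddle; this only models what the query does and is not a replacement for it.

```python
from collections import Counter

def top_brands_then_models(rows, n_brands=3, n_models=5):
    """rows is an iterable of (brand, car_model) pairs (hypothetical layout)."""
    rows = list(rows)
    # Outer subquery (o1): the n_brands brands with the most rows.
    brand_counts = Counter(brand for brand, _ in rows)
    result = []
    for brand, _ in brand_counts.most_common(n_brands):
        # Lateral subquery (o2): evaluated once per outer brand,
        # it counts models for that brand only.
        model_counts = Counter(model for b, model in rows if b == brand)
        for model, count in model_counts.most_common(n_models):
            result.append((brand, model, count))
    return result

# Hypothetical sample rows, chosen without ties.
sample = [
    ("VW", "Golf"), ("VW", "Golf"), ("VW", "Fox"),
    ("Audi", "A4"), ("Audi", "A4"), ("Audi", "A4"), ("Audi", "A4"),
    ("Fiat", "500"),
]
```

The inner loop plays the role of the subquery that references o1.brand: it is re-run for every brand the outer step produces, which is what LATERAL allows.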

Answer 1 (score: 0)

You can try a CTE together with the window function row_number():

with cte as
(
select brand,car_model,count(*) as cnt from cars group by brand,car_model
 ) , cte2 as
 (
     select * ,row_number() over(partition by brand order by cnt desc) rn from cte
 )
select brand,car_model,cnt from cte2 where rn<=5
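
The query above ranks models within each brand but, as written, does not limit the result to the top three brands. One possible extension is sketched below. So that it can be run without a server, it is wrapped in Python and executed against an in-memory SQLite database, which is an assumption made for illustration only (the question targets PostgreSQL, and SQLite needs version 3.25 or later for window functions). The extra CTE, the column names brand_cnt and brand_rn, the tie-breaking sort keys, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows (brand, car_model); not the data from the fiddle.
ROWS = [
    ("Audi", "A4"), ("Audi", "A4"), ("Audi", "A1"),
    ("VW", "Golf"), ("VW", "Golf"), ("VW", "Golf"), ("VW", "Fox"),
    ("Volvo", "V70"), ("Volvo", "V70"),
    ("Fiat", "500"),
]

QUERY = """
with cte as (
    select brand, car_model, count(*) as cnt
    from cars
    group by brand, car_model
), cte2 as (
    -- rank models inside each brand and carry the brand total along
    select cte.*,
           row_number() over (partition by brand order by cnt desc, car_model) as rn,
           sum(cnt) over (partition by brand) as brand_cnt
    from cte
), cte3 as (
    -- rank brands by their total; every row of a brand gets the same rank
    select cte2.*,
           dense_rank() over (order by brand_cnt desc, brand) as brand_rn
    from cte2
)
select brand, car_model, cnt
from cte3
where rn <= :n_models and brand_rn <= :n_brands
order by brand_cnt desc, brand, cnt desc, car_model
"""

def top_models_per_brand(rows, n_brands=3, n_models=5):
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table cars (brand text, car_model text)")
        conn.executemany("insert into cars values (?, ?)", rows)
        params = {"n_brands": n_brands, "n_models": n_models}
        return conn.execute(QUERY, params).fetchall()
    finally:
        conn.close()
```

With the sample rows, the brand Fiat has the fewest cars and is therefore dropped when n_brands is 3.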


Answer 2 (score: 0)

You can use window functions for this:

select brand, car_model, cnt_car
from (select c.*,
             -- rank brands by total count, most popular first
             dense_rank() over (order by cnt_brand desc, brand) as seqnum_b
      from (select brand, car_model, count(*) as cnt_car,
                   -- rank models within each brand by their count
                   row_number() over (partition by brand order by count(*) desc) as seqnum_bc,
                   -- total number of cars per brand
                   sum(count(*)) over (partition by brand) as cnt_brand
            from cars c
            group by brand, car_model
           ) c
     ) c
where seqnum_bc <= 5 and seqnum_b <= 3
order by cnt_brand desc, brand, cnt_car desc;

If you know that every brand (or at least every top brand) has at least five car models, you can simplify the query to:

select brand, car_model, cnt_car
from (select brand, car_model, count(*) as cnt_car,
              row_number() over (partition by brand order by count(*) desc) as seqnum_bc,
              sum(count(*)) over (partition by brand) as cnt_brand
      from cars c
      group by brand, car_model
     ) c
where seqnum_bc <= 5 
order by cnt_brand desc, brand, cnt_car desc
limit 15