Equivalent of MODE() WITHIN GROUP in Presto

Date: 2019-04-12 13:02:03

Tags: sql aggregate-functions presto

In Postgres, the following query returns the most frequently purchased cheese for each customer:

SELECT
    customer,
    MODE() WITHIN GROUP (ORDER BY "subcategory") AS "fav_cheese"
FROM dft
WHERE category = 'CHEESE'
GROUP BY
    customer

This returns:

customer   fav_cheese
       1      cheddar    # customer1's most-frequently-purchased cheese is cheddar
       2         blue    # customer2's most-frequently-purchased cheese is blue
       3     shredded    # customer3's most-frequently-purchased cheese is shredded

How can I get the same output in Presto?

So far I have tried several different approaches, but none of them worked.

2 answers:

Answer 0: (score: 2)

As a workaround, you can use a histogram-based approach:

SELECT customer, 
MAP_KEYS(hist)[
    ARRAY_POSITION(MAP_VALUES(hist), ARRAY_MAX(MAP_VALUES(hist)))
] as fav_cheese 
FROM (
   SELECT customer, histogram(subcategory) as hist
   FROM dft
   WHERE category = 'CHEESE'
   GROUP BY customer
) as f
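
To try the histogram approach without access to the original table, the same query can be run against a few inline rows. This is only a sketch: the sample rows are made up for illustration, and the inline `dft` CTE is an assumption that stands in for the real table.

```sql
-- Hypothetical sample rows; this CTE stands in for the real dft table.
WITH dft AS (
    SELECT *
    FROM (VALUES
        (1, 'CHEESE', 'cheddar'),
        (1, 'CHEESE', 'cheddar'),
        (1, 'CHEESE', 'blue'),
        (2, 'CHEESE', 'blue'),
        (2, 'MILK',   'whole')
    ) AS t (customer, category, subcategory)
)
SELECT customer,
       -- histogram() builds a map of subcategory -> purchase count;
       -- pick the key at the position of the largest count.
       MAP_KEYS(hist)[
           ARRAY_POSITION(MAP_VALUES(hist), ARRAY_MAX(MAP_VALUES(hist)))
       ] AS fav_cheese
FROM (
    SELECT customer, histogram(subcategory) AS hist
    FROM dft
    WHERE category = 'CHEESE'
    GROUP BY customer
) AS f
ORDER BY customer
```

With these rows, customer 1 should come out as cheddar and customer 2 as blue. If two subcategories tie for the highest count, ARRAY_POSITION returns the first matching position, so the winner depends on the order of the map's keys.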

Answer 1: (score: 1)

You can use a window function:

SELECT customer, subcategory AS fav_cheese
FROM (SELECT customer, category, subcategory, COUNT(*) as cnt,
             ROW_NUMBER() OVER (PARTITION BY customer ORDER BY COUNT(*) DESC) as seqnum
      FROM dft
      WHERE category = 'CHEESE'
      GROUP BY customer, category, subcategory
     ) t
WHERE seqnum = 1;
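
One caveat with ROW_NUMBER(): when two subcategories have the same count, the row that gets seqnum = 1 is not deterministic. A possible variant, sketched below, adds subcategory as a secondary sort key so that ties are resolved alphabetically. The sample rows and the inline `dft` CTE are assumptions made up for illustration, not part of the original data.

```sql
-- Hypothetical sample rows with a tie: customer 3 bought each cheese once.
WITH dft AS (
    SELECT *
    FROM (VALUES
        (3, 'CHEESE', 'shredded'),
        (3, 'CHEESE', 'cheddar')
    ) AS t (customer, category, subcategory)
)
SELECT customer, subcategory AS fav_cheese
FROM (SELECT customer, subcategory, COUNT(*) AS cnt,
             -- Highest count first; equal counts fall back to subcategory order.
             ROW_NUMBER() OVER (PARTITION BY customer
                                ORDER BY COUNT(*) DESC, subcategory) AS seqnum
      FROM dft
      WHERE category = 'CHEESE'
      GROUP BY customer, subcategory
     ) t
WHERE seqnum = 1
```

With the tie above, the secondary sort key makes the query return cheddar for customer 3 on every run.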