An elegant way to aggregate and compute percentages in SQL?

Time: 2019-02-20 17:35:11

Tags: sql aggregate presto

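I have a table of user, product, count that tells me which products a user bought and how many times ("count").

I want to know the "average spend" of a user, that is, what percentage of a user's purchases each product represents.

For example: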

user1,fruits,4
user1,water,2
user2,fruits,3
user2,food,9

so I would get

user1,fruits,0.6667  // = 4 / (4+2)
user1,water,0.3333  // = 2 / (4+2)
user2,fruits,0.25  // = 3 / (3+9)
user2,food,0.75  // = 9 / (3+9)

and then

fruits,0.4583  // = (0.6667+0.25) / 2
water,0.1667  // = 0.3333 / 2
food,0.3750  // = 0.75 / 2
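To make the arithmetic above concrete, here is a small Python sketch (not part of the original question) that reproduces both result tables from the four sample rows. The function name `basket_shares` is made up for illustration, and the sketch assumes one row per (user, product) pair. As in the expected output above, each product's shares are summed and divided by the total number of users, so a user who never bought a product contributes 0.

```python
from collections import defaultdict

# Sample rows from the question: (user, product, count)
ROWS = [
    ("user1", "fruits", 4),
    ("user1", "water", 2),
    ("user2", "fruits", 3),
    ("user2", "food", 9),
]


def basket_shares(rows):
    """Return (per_user_share, per_product_average) for (user, product, count) rows.

    Hypothetical helper; assumes each (user, product) pair appears only once.
    """
    # Total purchases per user, e.g. user1 -> 4 + 2
    totals = defaultdict(int)
    for user, _product, count in rows:
        totals[user] += count

    # Share of each product within one user's purchases
    per_user = {(user, product): count / totals[user] for user, product, count in rows}

    # Sum the shares per product, then divide by the number of all users
    sums = defaultdict(float)
    for (_user, product), share in per_user.items():
        sums[product] += share
    per_product = {product: total / len(totals) for product, total in sums.items()}
    return per_user, per_product


per_user, per_product = basket_shares(ROWS)
for (user, product), share in per_user.items():
    print(f"{user},{product},{share:.4f}")
for product, avg in per_product.items():
    print(f"{product},{avg:.4f}")
```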

I used

select t1.user as user, t1.product as product,
       cast(max(t1.c) as double) / max(t2.c) as ratio  -- cast avoids integer division
from (
  -- purchases per user and product
  select user, product, count(*) as c
  from table
  group by user, product
) t1
join (
  -- total purchases per user
  select user, count(*) as c
  from table
  group by user
) t2
on t1.user = t2.user
group by t1.user, t1.product

to get the first table, and then ran select product, avg(ratio) ... group by product on top of that result.

It all works, but I wonder whether there is a more efficient or better way?

3 Answers:

Answer 0 (score: 1)

I always use window functions to calculate percentages:

Reference: http://www.mysqltutorial.org/mysql-window-functions/

Example: http://sqlfiddle.com/#!17/66373/6

SELECT
  usr,
  product,
  c,
  sum(c) over(partition by usr) sc,
  c * 1.0 / sum(c) over(partition by usr) per  -- * 1.0 avoids integer division
FROM (
  SELECT usr, product, count(*) c
  FROM tablex
  GROUP BY usr, product
) t


CREATE TABLE tablex (
  usr varchar(32),
  product varchar(32)
);

INSERT INTO tablex VALUES ('a', 'x');
INSERT INTO tablex VALUES ('a', 'y');
INSERT INTO tablex VALUES ('a', 'y');
INSERT INTO tablex VALUES ('a', 'y');
INSERT INTO tablex VALUES ('a', 'z');
INSERT INTO tablex VALUES ('a', 'z');
INSERT INTO tablex VALUES ('a', 'z');
INSERT INTO tablex VALUES ('a', 'z');
INSERT INTO tablex VALUES ('a', 'z');

INSERT INTO tablex VALUES ('b', 'x');
INSERT INTO tablex VALUES ('b', 'x');
INSERT INTO tablex VALUES ('b', 'x');
INSERT INTO tablex VALUES ('b', 'y');
INSERT INTO tablex VALUES ('b', 'y');
INSERT INTO tablex VALUES ('b', 'y');
INSERT INTO tablex VALUES ('b', 'y');
INSERT INTO tablex VALUES ('b', 'y');
INSERT INTO tablex VALUES ('b', 'y');
INSERT INTO tablex VALUES ('b', 'z');
INSERT INTO tablex VALUES ('b', 'z');
INSERT INTO tablex VALUES ('b', 'z');
INSERT INTO tablex VALUES ('b', 'z');
INSERT INTO tablex VALUES ('b', 'z');
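As a quick way to try the query above without a database server, the following sketch loads the same sample rows into an in-memory SQLite database from Python and runs the window-function query. Using SQLite is an assumption made here purely for convenience (window functions need SQLite 3.25 or later), and the `ORDER BY` is added only to make the output stable. The `* 1.0` forces floating-point division; without it SQLite would divide two integers and truncate every ratio to 0.

```python
import sqlite3

# Purchase counts matching the INSERT statements above:
# user 'a' bought x once, y 3 times, z 5 times; user 'b' bought x 3, y 6, z 5 times.
SAMPLE = {"a": {"x": 1, "y": 3, "z": 5}, "b": {"x": 3, "y": 6, "z": 5}}

QUERY = """
SELECT usr,
       product,
       c,
       sum(c) OVER (PARTITION BY usr) AS sc,
       c * 1.0 / sum(c) OVER (PARTITION BY usr) AS per
FROM (
  SELECT usr, product, count(*) AS c
  FROM tablex
  GROUP BY usr, product
) t
ORDER BY usr, product
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablex (usr varchar(32), product varchar(32))")

# Expand the counts into one row per purchase, as in the original sample data
rows = [
    (usr, product)
    for usr, products in SAMPLE.items()
    for product, n in products.items()
    for _ in range(n)
]
conn.executemany("INSERT INTO tablex VALUES (?, ?)", rows)

result = conn.execute(QUERY).fetchall()
for usr, product, c, sc, per in result:
    print(usr, product, c, sc, round(per, 4))
conn.close()
```

Within each user the `per` values add up to 1, which is a handy sanity check for this kind of query.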

Answer 1 (score: 0)

You can use this code and check the execution plan; I am confident that performance is improved.

select user, product,
       cast(count(*) as decimal(18,4))
         / (select count(*) from table t2 where t2.user = t1.user) as ratio
from table t1
group by user, product

Answer 2 (score: 0)

I would write it like this:

select user, product, count(*) as c,
       count(*) * 1.0 / sum(count(*)) over (partition by user) as ratio
from table
group by user, product;
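The sketch below runs this single-pass form against the question's sample data and then applies the per-product averaging step. It is only an illustration: it uses SQLite from Python (3.25 or later for window functions), and the names `purchases` and `usr` are hypothetical, chosen to avoid reserved words. It also assumes one row per purchase, which is what `count(*)` implies. Two averages are shown because they differ: `avg(ratio)` averages only over users who bought the product, while the expected output in the question divides by the number of all users.

```python
import sqlite3

# Sample rows from the question as (user, product, count)
COUNTS = [
    ("user1", "fruits", 4),
    ("user1", "water", 2),
    ("user2", "fruits", 3),
    ("user2", "food", 9),
]

QUERY = """
WITH ratios AS (
  -- share of each product within a user's purchases, computed in one pass
  SELECT usr, product,
         count(*) * 1.0 / sum(count(*)) OVER (PARTITION BY usr) AS ratio
  FROM purchases
  GROUP BY usr, product
)
SELECT product,
       avg(ratio) AS avg_among_buyers,
       sum(ratio) / (SELECT count(DISTINCT usr) FROM purchases) AS avg_over_all_users
FROM ratios
GROUP BY product
ORDER BY product
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (usr TEXT, product TEXT)")

# Expand each count into that many single-purchase rows
rows = [(usr, product) for usr, product, n in COUNTS for _ in range(n)]
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)

result = conn.execute(QUERY).fetchall()
for product, among_buyers, over_all in result:
    print(product, round(among_buyers, 4), round(over_all, 4))
conn.close()
```

If the table already stores a count column per (user, product), as described in the question, the same idea would presumably apply with `sum(count)` in place of `count(*)`.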