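I have a table with 3 GB of data (and it will keep growing), and I need to show the total sales and the top product for each category (the product that occurs most often in the column). The following query gives me that result: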
select t.category,
       sum(t.sale) sales,
       (select product
        from demo
        where category = t.category
        group by product
        order by count(*) desc
        limit 1) top_product
from demo t
group by t.category
The query above takes about 2 minutes 25 seconds. I can't find any way to optimize it. Can anyone suggest another approach?
Sample table:
category product sale
C1 P1 10
C2 P2 12
C3 P1 14
C1 P2 15
C1 P1 02
C2 P2 10
C2 P3 22
C3 P1 01
C3 P2 27
C3 P3 02
Output:
category Top product Total sales
C1 P1 27
C2 P2 44
C3 P1 44
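To reproduce the example locally, the sample table could be set up roughly as follows. This is only a sketch: the table and column names come from the query above, but the column types are assumptions, since the question does not show the real schema.

```sql
-- Hypothetical schema: the column types are assumed, not taken from the question.
CREATE TABLE demo (
    category VARCHAR(10) NOT NULL,
    product  VARCHAR(10) NOT NULL,
    sale     INT         NOT NULL
);

-- The ten sample rows from the question.
INSERT INTO demo (category, product, sale) VALUES
    ('C1', 'P1', 10),
    ('C2', 'P2', 12),
    ('C3', 'P1', 14),
    ('C1', 'P2', 15),
    ('C1', 'P1',  2),
    ('C2', 'P2', 10),
    ('C2', 'P3', 22),
    ('C3', 'P1',  1),
    ('C3', 'P2', 27),
    ('C3', 'P3',  2);
```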
Answer 0: (score: 1)
Your query can be written like this:
SELECT g1.category, g1.sum_sale, g2.product
FROM (
    SELECT category, SUM(sale) AS sum_sale
    FROM demo
    GROUP BY category
) AS g1
INNER JOIN (
    SELECT category, product, COUNT(*) AS product_count
    FROM demo
    GROUP BY category, product
) AS g2 ON g1.category = g2.category
INNER JOIN (
    SELECT category, MAX(product_count) AS product_count_max
    FROM (
        SELECT category, product, COUNT(*) AS product_count
        FROM demo
        GROUP BY category, product
    ) AS x
    GROUP BY category
) AS g3 ON g2.category = g3.category AND g2.product_count = g3.product_count_max
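Basically, it finds the maximum COUNT(*) for each category and then works out the matching product from that. It can benefit from appropriate indexes.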
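As one possible reading of "appropriate indexes", a composite index on the grouping columns would let MySQL aggregate from the index instead of reading full rows. The index name below is made up for illustration, and whether it actually helps on the real 3 GB table would have to be confirmed with EXPLAIN.

```sql
-- Hypothetical index name; the leading columns match GROUP BY category, product.
-- Adding sale makes the index covering for the SUM(sale) subquery as well.
CREATE INDEX idx_demo_category_product_sale
    ON demo (category, product, sale);

-- Inspect the plan: "Using index" in the Extra column indicates
-- that the query is answered from the index alone.
EXPLAIN
SELECT category, product, COUNT(*) AS product_count
FROM demo
GROUP BY category, product;
```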
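Answer 1: (score: 1)
A hacky, MySQL-only solution combines GROUP_CONCAT with nested SUBSTRING_INDEX calls to extract the first element of an ordered, comma-separated string.
This is not an ideal approach, but it reduces the number of subqueries needed and may work well for your particular case.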
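To show how the nested SUBSTRING_INDEX pattern picks an element out of a comma-separated string, here is a small standalone sketch. The literal 'P1,P2,P3' is just an illustrative stand-in for a GROUP_CONCAT result.

```sql
-- Positive count: everything before the 1st comma, i.e. the first element.
SELECT SUBSTRING_INDEX('P1,P2,P3', ',', 1);   -- 'P1'

-- General pattern for the Nth element (here N = 2):
-- the inner call keeps the first two elements ('P1,P2'),
-- the outer call with -1 keeps the last of those.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('P1,P2,P3', ',', 2), ',', -1);   -- 'P2'
```

For the first element the outer call is redundant, but it keeps the expression in the same shape as the general Nth-element form. The trick breaks if a product name itself contains a comma.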
You will also need to run SET SESSION group_concat_max_len = @@max_allowed_packet; beforehand, so that the concatenated string is not truncated.
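To see why the setting matters, the current values can be checked before changing anything. This is only a quick inspection sketch; the actual numbers depend on the server configuration.

```sql
-- Current GROUP_CONCAT result limit in bytes for this session;
-- the MySQL default is 1024, and longer results are truncated.
SELECT @@SESSION.group_concat_max_len;

-- The value the limit is raised to by the SET SESSION statement.
SELECT @@max_allowed_packet;
```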
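We basically determine the sales and the number of occurrences for each combination of category and product. This result set is then used as a derived table, and we apply the GROUP_CONCAT() hack to find the product with the highest count within each category.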
SET SESSION group_concat_max_len = @@max_allowed_packet;

SELECT
    dt.category,
    SUM(dt.sale_per_category_product) AS total_sales,
    SUBSTRING_INDEX(
        SUBSTRING_INDEX(
            GROUP_CONCAT(dt.product ORDER BY dt.product_count_per_category DESC)
            , ','
            , 1
        )
        , ','
        , -1
    ) AS top_product
FROM
(
    SELECT
        category,
        product,
        SUM(sale) AS sale_per_category_product,
        COUNT(*) AS product_count_per_category
    FROM demo
    GROUP BY category, product
) AS dt
GROUP BY dt.category;
Result (MySQL v5.7):
| category | total_sales | top_product |
| -------- | ----------- | ------------|
| C1 | 27 | P1 |
| C2 | 44 | P2 |
| C3 | 44 | P1 |