Question

I want to optimize this query:

select  location_id, dept_id,
        round(sum(sales),0), sum(qty),
        count(distinct tran_id),
        now()
    from  tran_sales
    where  tran_date <= '2016-12-24'
    group by  location_id, dept_id;

Currently this query takes about 98 seconds on average (Query took 97.4096 seconds.) on Windows 10, 64-bit, with 16 GB RAM.

Here are the table details for your reference.

    CREATE TABLE tran_sales (
    tran_date date NOT NULL,
    location_id int(11) NOT NULL,
    dept_id int(11) NOT NULL,
    item_id varchar(25) NOT NULL,
    tran_id int(11) NOT NULL,
    sales float DEFAULT NULL,
    qty int(11) DEFAULT NULL,
    update_datetime datetime NOT NULL,
    PRIMARY KEY (tran_date,location_id,dept_id,item_id,tran_id),
    KEY tran_date (tran_date)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Number of records in tran_sales: 13.5 million.

Note: I have tried both with and without the index KEY tran_date (tran_date). The average time is 98 seconds either way.

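If it helps, please suggest how to speed up the result, either by changing the query or by changing some of the default settings in my.ini. Thanks.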

Update: the minimum date in the table is 2016-07-01 and the maximum date is 2017-07-25.

Answer 1

None of the suggestions so far will help much, because...

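Covering index: it is only slightly smaller than the table, so it will be only slightly faster.
KEY(tran_date): a waste; it is better to use the PK, which already starts with tran_date.
PARTITIONing: no. That would probably be slower.
Removing tran_date from the PK (or otherwise rearranging the PK): this will hurt. The filtering (WHERE) is on tran_date, so it is usually best to have that column first.
So why is COUNT(*) fast? Well, start by looking at the EXPLAIN. It will show that it uses KEY(tran_date) instead of scanning the table. There is less data to scan, hence it is faster.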
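
To check this on your own server, the two plans can be compared side by side. This is only a sketch: the COUNT(*) statement is an assumed form of the faster query, since the thread does not show it.

```sql
-- Assumed form of the fast COUNT(*) query (not shown in the question).
EXPLAIN
SELECT COUNT(*)
FROM   tran_sales
WHERE  tran_date <= '2016-12-24';

-- The original aggregate query, for comparison.
EXPLAIN
SELECT location_id, dept_id,
       ROUND(SUM(sales), 0), SUM(qty),
       COUNT(DISTINCT tran_id)
FROM   tran_sales
WHERE  tran_date <= '2016-12-24'
GROUP BY location_id, dept_id;
```

In the output, the key column names the index that was chosen, and "Using index" in the Extra column means the statement was answered from the index alone, without reading the table rows.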

The real problem is that you need to scan millions of rows, and it takes time to touch millions of rows.

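How to speed it up? Create and maintain a summary table. Then query that table (with thousands of rows) instead of the original table (with millions of rows). The total count is SUM(counts); the total sum is SUM(sums); the average is SUM(sums)/SUM(counts); and so on.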
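
A minimal sketch of what such a summary table might look like for this query, with one row per day, location and department. The table name tran_sales_daily and its column names are hypothetical. One assumption matters here: a COUNT(DISTINCT tran_id) cannot in general be rebuilt by adding up daily distinct counts, so the last query below matches the original only if a given tran_id never appears on more than one tran_date.

```sql
-- Hypothetical daily summary table; all names are illustrative.
CREATE TABLE tran_sales_daily (
    tran_date   DATE   NOT NULL,
    location_id INT    NOT NULL,
    dept_id     INT    NOT NULL,
    sum_sales   DOUBLE DEFAULT NULL,   -- SUM(sales) for the day
    sum_qty     BIGINT DEFAULT NULL,   -- SUM(qty) for the day
    tran_count  INT    NOT NULL,       -- COUNT(DISTINCT tran_id) for the day
    PRIMARY KEY (tran_date, location_id, dept_id)
) ENGINE=InnoDB;

-- Initial load: one pass over the big table.
INSERT INTO tran_sales_daily
        (tran_date, location_id, dept_id, sum_sales, sum_qty, tran_count)
SELECT  tran_date, location_id, dept_id,
        SUM(sales), SUM(qty), COUNT(DISTINCT tran_id)
FROM    tran_sales
GROUP BY tran_date, location_id, dept_id;

-- The report now reads the small table instead of the 13.5 million rows.
SELECT  location_id, dept_id,
        ROUND(SUM(sum_sales), 0), SUM(sum_qty),
        SUM(tran_count),   -- valid only if no tran_id spans several dates
        NOW()
FROM    tran_sales_daily
WHERE   tran_date <= '2016-12-24'
GROUP BY location_id, dept_id;
```

The summary table has to be kept current, for example by a nightly job that aggregates the rows for the day that just ended. Because sales is a FLOAT, totals built from daily subtotals may differ from the original result in the last digits before rounding.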

Answer 2

For this query:

select location_id, dept_id,
       round(sum(sales), 0), sum(qty), count(distinct tran_id),
       now()
from tran_sales
where tran_date <= '2016-12-24'
group by location_id, dept_id;

There is not much you can do. One thing to try would be a covering index: (tran_date, location_id, dept_id, sales, qty), but I don't think it will help much.
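
If you want to try it anyway, the index could be added roughly as follows. The index name idx_sales_cover is hypothetical. tran_id does not need to be listed, because InnoDB secondary indexes also carry the primary key columns and tran_id is part of the PK.

```sql
-- Hypothetical index name. Building this on 13.5 million rows will take a while.
ALTER TABLE tran_sales
    ADD INDEX idx_sales_cover (tran_date, location_id, dept_id, sales, qty);

-- Check whether the optimizer picks it up.
EXPLAIN
SELECT location_id, dept_id,
       ROUND(SUM(sales), 0), SUM(qty), COUNT(DISTINCT tran_id),
       NOW()
FROM   tran_sales
WHERE  tran_date <= '2016-12-24'
GROUP BY location_id, dept_id;
```

If the Extra column of the EXPLAIN output shows "Using index", the query is being served from the index alone. The query still has to read every index entry up to the cutoff date, which is why the gain is expected to be small.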

MySQL optimization advice for a large table
