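How to optimize the following query:
I have two tables, 'calendar_table' and 'consumption', and I use the query below to calculate the monthly consumption for each year.
The calendar table contains the day, month and year for every date from 2005 through 2009, and the consumption table holds consumption data recorded per monthly billing cycle. The query counts the number of days of each bill that fall in each month and uses that count to estimate the consumption for each month.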
SELECT id,
       date_from as bill_start_date,
       theYear as Year,
       MONTHNAME(STR_TO_DATE(theMonth, '%m')) as month,
       sum(DaysOnBill),
       TotalDaysInTheMonth,
       sum(perDayConsumption * DaysOnBill) as EstimatedConsumption
FROM
(
    SELECT
        id,
        date_from,
        theYear,
        theMonth, # use theMonth for displaying the month as a number
        COUNT(*) AS DaysOnBill,
        TotalDaysInTheMonth,
        perDayConsumption
    FROM
    (
        SELECT
            c.id,
            c.date_from as date_from,
            ct.dt,
            y AS theYear,
            month AS theMonth,
            DAY(LAST_DAY(ct.dt)) as TotalDaysInTheMonth,
            perDayConsumption
        FROM
            consumption AS c
        INNER JOIN
            calendar_table AS ct
                ON ct.dt >= c.date_from
                AND ct.dt <= c.date_to
    ) AS allDates
    GROUP BY
        id,
        date_from,
        theYear,
        theMonth
) AS estimates
GROUP BY
    id,
    theYear,
    theMonth;
It takes about 1000 seconds to complete for roughly 1 million records. Is there anything that can be done to make it faster?
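To make the intended calculation concrete, here is a minimal Python sketch of what the query above computes for a single bill: it walks the bill day by day, counts the days that fall in each month, and multiplies that count by the per-day consumption. The function name `split_bill_by_month` and the sample bill are hypothetical and only serve as an illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def split_bill_by_month(date_from, date_to, per_day_consumption):
    """Spread one bill over calendar months, one day at a time."""
    days_on_bill = defaultdict(int)
    day = date_from
    while day <= date_to:  # both ends inclusive, like the join condition
        days_on_bill[(day.year, day.month)] += 1
        day += timedelta(days=1)
    # Map (year, month) -> (DaysOnBill, EstimatedConsumption).
    return {
        month: (days, days * per_day_consumption)
        for month, days in days_on_bill.items()
    }


# Hypothetical bill: 20 Jan 2005 to 10 Feb 2005 at 2.0 units per day.
estimate = split_bill_by_month(date(2005, 1, 20), date(2005, 2, 10), 2.0)
for (year, month), (days, consumption) in sorted(estimate.items()):
    print(year, month, days, consumption)
```

For this sample bill, 12 days fall in January and 10 in February, so the estimates are 24.0 and 20.0 units.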
Answer 0 (score: 3)
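The query is a bit dubious: it pretends to group once and then to group again by something else, which is not actually the case.
First each bill is joined with all of its days. Then the rows are grouped by bill plus month and year, which gives a monthly view of the data. This could be done in one pass, but the query joins first and then uses the result as a derived table that gets aggregated. Finally the results are taken again and "another" group is built, which is actually the same as before (bill plus month and year), and some pseudo-aggregations are performed (e.g. sum(perDayConsumption * DaysOnBill), which is the same as perDayConsumption * DaysOnBill, because SUM adds up only a single record here).
This can be written as: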
SELECT
    c.id,
    c.date_from as bill_start_date,
    ct.y AS Year,
    MONTHNAME(STR_TO_DATE(ct.month, '%m')) as month,
    COUNT(*) AS DaysOnBill, # one joined row per day of the bill
    DAY(LAST_DAY(ct.dt)) as TotalDaysInTheMonth,
    # one row per day, so this equals perDayConsumption * DaysOnBill
    SUM(c.perDayConsumption) as EstimatedConsumption
FROM consumption AS c
INNER JOIN calendar_table AS ct ON ct.dt BETWEEN c.date_from AND c.date_to
GROUP BY
    c.id,
    ct.y,
    ct.month;
I can't tell whether this will be faster, or whether MySQL's optimizer already sees through your query on its own and reduces it to this anyway.
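The claim that both forms produce the same result can be checked on a small sample. The sketch below is only an illustration under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, stores dates as ISO text, treats id as unique per bill, drops the MySQL-only display columns (MONTHNAME, LAST_DAY), and uses two made-up bills. It says nothing about speed on MySQL.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar_table (dt TEXT PRIMARY KEY, y INTEGER, month INTEGER);
    CREATE TABLE consumption (
        id INTEGER PRIMARY KEY, date_from TEXT, date_to TEXT,
        perDayConsumption REAL
    );
""")

# One calendar row per day of 2005 (assumed layout of calendar_table).
day = date(2005, 1, 1)
while day.year == 2005:
    conn.execute("INSERT INTO calendar_table VALUES (?, ?, ?)",
                 (day.isoformat(), day.year, day.month))
    day += timedelta(days=1)

# Two made-up bills that each span a month boundary.
conn.executemany("INSERT INTO consumption VALUES (?, ?, ?, ?)", [
    (1, "2005-01-20", "2005-02-10", 2.0),
    (2, "2005-02-11", "2005-03-15", 1.5),
])

# Shape of the question's query: join, group, then group again.
NESTED = """
    SELECT id, theYear, theMonth,
           SUM(DaysOnBill) AS DaysOnBill,
           SUM(perDayConsumption * DaysOnBill) AS EstimatedConsumption
    FROM (
        SELECT id, date_from, theYear, theMonth,
               COUNT(*) AS DaysOnBill, perDayConsumption
        FROM (
            SELECT c.id, c.date_from, ct.y AS theYear,
                   ct.month AS theMonth, c.perDayConsumption
            FROM consumption AS c
            INNER JOIN calendar_table AS ct
                ON ct.dt >= c.date_from AND ct.dt <= c.date_to
        ) AS allDates
        GROUP BY id, date_from, theYear, theMonth
    ) AS estimates
    GROUP BY id, theYear, theMonth
    ORDER BY id, theYear, theMonth
"""

# Shape of the answer's query: join and group in a single pass.
SINGLE_PASS = """
    SELECT c.id, ct.y, ct.month,
           COUNT(*) AS DaysOnBill,
           SUM(c.perDayConsumption) AS EstimatedConsumption
    FROM consumption AS c
    INNER JOIN calendar_table AS ct
        ON ct.dt BETWEEN c.date_from AND c.date_to
    GROUP BY c.id, ct.y, ct.month
    ORDER BY c.id, ct.y, ct.month
"""

nested_rows = conn.execute(NESTED).fetchall()
single_rows = conn.execute(SINGLE_PASS).fetchall()
for row in single_rows:
    print(row)
print("same result:", nested_rows == single_rows)
```

This only compares results on a toy data set. Whether the single-pass form actually runs faster has to be measured on the real MySQL tables, for example by comparing the EXPLAIN output and run times of both queries.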