I have the following two SQL queries. They are 95% identical, yet their performance differs dramatically.
SQL query 1 (< 0.1 s):
SELECT CONCAT(a.`report_year`, '-', a.`report_month`) as `yearmonth`,
AVG(a.cost_leasing/b.rate*IF(`report_year`=2016,0.73235,
IF(`report_year`=2017,0.83430,1))) as average,
'current' as `type`
FROM `vehicles` as a, `exchange_rates` as b
WHERE cid='3' AND
STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'),
'%Y-%m-%d') >= '2016-01-01' AND
LAST_DAY(STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`,
'-01'), '%Y-%m-%d')) <= '2017-06-30' AND
`country` IN ('XX','UK') AND
a.currency = b.currency AND
b.`year` = `report_year` AND
fxid=2
GROUP BY `yearmonth`
ORDER BY `yearmonth`;
EXPLAIN for query 1:
1 SIMPLE a ref new_selectors,... new_cost_leasing 4 const 10812 Using where; Using index; Using temporary; Using f...
1 SIMPLE b ref PRIMARY,date,fxid fxid 19 const,c1682fleet.a.report_year,c1682fleet.a.curren... 196 Using where; Using index
SQL query 2 (> 3 s):
SELECT CONCAT(c.`report_year`, '-', c.`report_month`) as `yearmonth`,
AVG(c.cost_leasing/d.rate*IF(`report_year`=2016,0.73235,
IF(`report_year`=2017,0.83430,1))),
'baseline'
FROM `kpis` as c, `exchange_rates` as d
WHERE cid='3' AND
STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'),
'%Y-%m-%d') >= '2016-01-01' AND
LAST_DAY(STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`,
'-01'), '%Y-%m-%d')) <= '2017-06-30' AND
`country` IN ('XX','UK') AND
c.kid=1 AND
c.currency = d.currency AND
d.`year` = `report_year` AND
fxid=2
GROUP BY `yearmonth`
ORDER BY `yearmonth`;
EXPLAIN for query 2:
1 SIMPLE c ref oem_group,... cost_leasing 8 const,const 30038 Using where; Using index; Using temporary; Using f...
1 SIMPLE d ref PRIMARY,date,fxid fxid 19 const,c1682fleet.c.report_year,c1682fleet.c.curren... 196 Using where; Using index
SHOW INDEX FROM vehicles:
vehicles 0 PRIMARY 1 vid A 146068 BTREE
vehicles 1 new_cost_leasing 1 cid A 12 BTREE
vehicles 1 new_cost_leasing 2 cost_leasing A 4564 BTREE
vehicles 1 new_cost_leasing 3 currency A 5216 BTREE
vehicles 1 new_cost_leasing 4 report_month A 24344 BTREE
vehicles 1 new_cost_leasing 5 report_year A 29213 BTREE
vehicles 1 new_cost_leasing 6 country A 36517 BTREE
vehicles 1 new_cost_leasing 7 supplier A 29213 BTREE
vehicles 1 new_cost_leasing 8 jato_segment A 24344 BTREE
vehicles 1 new_cost_leasing 9 business_unit A 36517 BTREE
vehicles 1 new_cost_leasing 10 entity A 73034 BTREE
SHOW INDEX FROM exchange_rates:
exchange_rates 0 PRIMARY 1 fxid A 2 BTREE
exchange_rates 0 PRIMARY 2 currency A 160 BTREE
exchange_rates 0 PRIMARY 3 date A 569250 BTREE
exchange_rates 1 date 1 fxid A 2 BTREE
exchange_rates 1 date 2 date A 28462 BTREE
exchange_rates 1 date 3 currency A 569250 BTREE
exchange_rates 1 date 4 rate A 569250 BTREE
exchange_rates 1 fxid 1 fxid A 2 BTREE
exchange_rates 1 fxid 2 year A 114 BTREE
exchange_rates 1 fxid 3 currency A 2904 BTREE
exchange_rates 1 fxid 4 rate A 569250 BTREE
SHOW INDEX FROM kpis:
kpis 0 PRIMARY 1 vid A 60308 BTREE
kpis 1 cost_leasing 1 cid A 2 BTREE
kpis 1 cost_leasing 2 kid A 2 BTREE
kpis 1 cost_leasing 3 cost_leasing A 78 BTREE
kpis 1 cost_leasing 4 currency A 78 BTREE
kpis 1 cost_leasing 5 report_month A 1096 BTREE
kpis 1 cost_leasing 6 report_year A 3350 BTREE
kpis 1 cost_leasing 7 country A 1884 BTREE
kpis 1 cost_leasing 8 supplier A 4020 BTREE
kpis 1 cost_leasing 9 jato_segment A 3015 BTREE
kpis 1 cost_leasing 10 business_unit A 4307 BTREE
kpis 1 cost_leasing 11 entity A 6030 BTREE
kpis 1 avg_cost 1 cid A 2 BTREE
kpis 1 avg_cost 2 kid A 2 BTREE
kpis 1 avg_cost 3 country A 48 BTREE
kpis 1 avg_cost 4 report_year A 96 BTREE
kpis 1 avg_cost 5 currency A 96 BTREE
kpis 1 avg_cost 6 cost_leasing A 172 BTREE
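Question: Why is there such a significant performance difference (a factor of 30), even though query 2 has only one additional criterion (kid), which is even part of the index?
Does anyone know how to optimize query 2?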
Answer 0 (score: 0)
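I found the problem: the column year of exchange_rates is not unique. The selection from vehicles is only half the size of the one from kpis, and because each value of the non-unique column year matches many rows, the join of exchange_rates and kpis creates a temporary set of more than 2 million entries, which is very large for the averaging operation.
Solution: instead of year, I used the unique column date (part of the PRIMARY key together with fxid and currency) and changed the condition to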
`date` = MAKEDATE(`report_year`, 1)
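As a rough sketch of where that condition goes, query 2 with the join on year swapped for the join on date might look like the following. It assumes, as the fix implies, that exchange_rates holds a row dated 1 January of each report year for every fxid/currency pair. The table qualifiers on the columns are inferred from the SHOW INDEX output and are not part of the original answer.

```sql
-- Query 2 with the exchange-rate join moved from `year` to `date`.
-- Assumption: exchange_rates has a row dated January 1 of each report year.
SELECT CONCAT(c.`report_year`, '-', c.`report_month`) AS `yearmonth`,
       AVG(c.cost_leasing / d.rate *
           IF(c.`report_year` = 2016, 0.73235,
              IF(c.`report_year` = 2017, 0.83430, 1))) AS average,
       'baseline' AS `type`
FROM `kpis` AS c, `exchange_rates` AS d
WHERE c.cid = '3' AND
      STR_TO_DATE(CONCAT(c.`report_year`, '-', c.`report_month`, '-01'),
                  '%Y-%m-%d') >= '2016-01-01' AND
      LAST_DAY(STR_TO_DATE(CONCAT(c.`report_year`, '-', c.`report_month`, '-01'),
                           '%Y-%m-%d')) <= '2017-06-30' AND
      c.`country` IN ('XX','UK') AND
      c.kid = 1 AND
      c.currency = d.currency AND
      d.`date` = MAKEDATE(c.`report_year`, 1) AND  -- was: d.`year` = `report_year`
      d.fxid = 2
GROUP BY `yearmonth`
ORDER BY `yearmonth`;
```

With fxid, currency and date all pinned down, each kpis row should match at most one exchange_rates row, which is what keeps the temporary set small.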
Answer 1 (score: 0)
These predicates are not sargable:
STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'), '%Y-%m-%d') >= '2016-01-01'
LAST_DAY(STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'), '%Y-%m-%d')) <= '2017-06-30'
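For every row, five functions are executed to concatenate and convert to a date, yet only two date constants are being compared. If you can turn this around and convert the two date constants into something that fits the unaltered data, you will save a lot of effort. Not only do you save the work of evaluating the functions, you also gain the ability to use an index on report_year and report_month.
I have not had time to test this, and I am guessing that the columns involved are integers, but I think a more robust set of predicates for handling the date range would help both queries, e.g.: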
MySQL 5.6 schema setup:
CREATE TABLE Table1
(`Report_Year` int, `Report_Month` int)
;
INSERT INTO Table1
(`Report_Year`, `Report_Month`)
VALUES
(2015, 1), (2015, 2), (2015, 3),
(2015, 4), (2015, 5), (2015, 6),
(2015, 7), (2015, 8), (2015, 9),
(2015, 10), (2015, 11), (2015, 12),
(2016, 1), (2016, 2), (2016, 3),
(2016, 4), (2016, 5), (2016, 6),
(2016, 7), (2016, 8), (2016, 9),
(2016, 10), (2016, 11), (2016, 12),
(2017, 1), (2017, 2), (2017, 3),
(2017, 4), (2017, 5), (2017, 6),
(2017, 7), (2017, 8), (2017, 9),
(2017, 10), (2017, 11), (2017, 12)
;
**Query**:
set @start := '2016-04-04';
set @end := '2017-01-30';
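select *, @start, @end
from Table1
where (
-- start year of a multi-year range: months from the start month onward
((year(@start) < year(@end)) AND report_year = year(@start) and report_month >= month(@start))
OR
-- full years strictly between the start year and the end year
((year(@start) < year(@end)) AND report_year > year(@start) and report_year < year(@end))
OR
-- end year of a multi-year range: months up to the end month
((year(@start) < year(@end)) AND report_year = year(@end) and report_month <= month(@end))
OR
-- range within a single year: honor both the start month and the end month
((year(@start) = year(@end)) AND report_year = year(@start) and report_month between month(@start) and month(@end))
);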
**[Results]**:
| Report_Year | Report_Month | @start | @end |
|-------------|--------------|------------|------------|
| 2016 | 4 | 2016-04-04 | 2017-01-30 |
| 2016 | 5 | 2016-04-04 | 2017-01-30 |
| 2016 | 6 | 2016-04-04 | 2017-01-30 |
| 2016 | 7 | 2016-04-04 | 2017-01-30 |
| 2016 | 8 | 2016-04-04 | 2017-01-30 |
| 2016 | 9 | 2016-04-04 | 2017-01-30 |
| 2016 | 10 | 2016-04-04 | 2017-01-30 |
| 2016 | 11 | 2016-04-04 | 2017-01-30 |
| 2016 | 12 | 2016-04-04 | 2017-01-30 |
| 2017 | 1 | 2016-04-04 | 2017-01-30 |
set @start := '2016-01-01';
set @end := '2016-06-30';
**[Results]**:
| Report_Year | Report_Month | @start | @end |
|-------------|--------------|------------|------------|
| 2016 | 1 | 2016-01-01 | 2016-06-30 |
| 2016 | 2 | 2016-01-01 | 2016-06-30 |
| 2016 | 3 | 2016-01-01 | 2016-06-30 |
| 2016 | 4 | 2016-01-01 | 2016-06-30 |
| 2016 | 5 | 2016-01-01 | 2016-06-30 |
| 2016 | 6 | 2016-01-01 | 2016-06-30 |
set @start := '2016-01-01';
set @end := '2017-06-30';
**[Results]**:
| Report_Year | Report_Month | @start | @end |
|-------------|--------------|------------|------------|
| 2016 | 1 | 2016-01-01 | 2017-06-30 |
| 2016 | 2 | 2016-01-01 | 2017-06-30 |
| 2016 | 3 | 2016-01-01 | 2017-06-30 |
| 2016 | 4 | 2016-01-01 | 2017-06-30 |
| 2016 | 5 | 2016-01-01 | 2017-06-30 |
| 2016 | 6 | 2016-01-01 | 2017-06-30 |
| 2016 | 7 | 2016-01-01 | 2017-06-30 |
| 2016 | 8 | 2016-01-01 | 2017-06-30 |
| 2016 | 9 | 2016-01-01 | 2017-06-30 |
| 2016 | 10 | 2016-01-01 | 2017-06-30 |
| 2016 | 11 | 2016-01-01 | 2017-06-30 |
| 2016 | 12 | 2016-01-01 | 2017-06-30 |
| 2017 | 1 | 2016-01-01 | 2017-06-30 |
| 2017 | 2 | 2016-01-01 | 2017-06-30 |
| 2017 | 3 | 2016-01-01 | 2017-06-30 |
| 2017 | 4 | 2016-01-01 | 2017-06-30 |
| 2017 | 5 | 2016-01-01 | 2017-06-30 |
| 2017 | 6 | 2016-01-01 | 2017-06-30 |
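To connect this back to the original queries: for the asker's fixed range (January 2016 through June 2017, with both bounds on month boundaries), the two function-wrapped predicates reduce to plain comparisons on the stored columns. A minimal sketch, assuming report_year and report_month are integer columns of vehicles, as guessed above:

```sql
-- Rows per month for cid 3 between 2016-01 and 2017-06,
-- filtered without applying any function to the columns.
SELECT a.report_year, a.report_month, COUNT(*) AS row_count
FROM vehicles AS a
WHERE a.cid = '3'
  AND (   a.report_year = 2016
       OR (a.report_year = 2017 AND a.report_month <= 6))
GROUP BY a.report_year, a.report_month
ORDER BY a.report_year, a.report_month;
```

The same parenthesized condition could stand in for the STR_TO_DATE / LAST_DAY pair in both queries. Note that the month-based form treats a start or end date anywhere inside a month as covering that whole month, whereas the original predicates only accept months that lie completely inside the range.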
Answer 2 (score: 0)
Sargability is one strong point. Better handling of dates is another. Here are some more points.
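An 11-column index is almost guaranteed to be wasteful. Even a 6-column index is unlikely to be fully used. Only the leftmost columns of an index are used. Usually the optimizer reaches a point where the next column is not useful, and it stops there.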
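It is usually not a good idea to split a date into year, month, and day columns. Since you only seem to need year and month, I suggest a CHAR(7) CHARSET ascii column holding values such as '2017-06'. Or do you really have reports that stop in the middle of a month?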
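A sketch of that suggestion, assuming a new column named report_ym (a hypothetical name) is added to vehicles and filled from the existing integer columns:

```sql
-- report_ym is a hypothetical column name used only for illustration.
ALTER TABLE vehicles
  ADD COLUMN report_ym CHAR(7) CHARACTER SET ascii NULL;

-- Zero-pad the month so that string order matches chronological order.
UPDATE vehicles
SET report_ym = CONCAT(report_year, '-', LPAD(report_month, 2, '0'));

-- The date range then becomes a single sargable predicate.
SELECT COUNT(*)
FROM vehicles
WHERE cid = '3'
  AND report_ym BETWEEN '2016-01' AND '2017-06';
```

The zero-padding matters: without it, '2016-10' would sort before '2016-2' and the BETWEEN range would return the wrong months.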
Please qualify every column name with the table it belongs to. It is important for us to know which table a column such as fxid is in.
Please use the JOIN .. ON syntax:
FROM vehicles AS a
JOIN exchange_rates AS b ON a.currency = b.currency
(I would prefer AS v and AS er as mnemonic aliases.)
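Putting the last two points together, query 1 with every column qualified and an explicit join might read as follows. Which table each unqualified column belongs to is inferred here from the SHOW INDEX output (cid, country and the report columns in vehicles; fxid and year in exchange_rates), so treat that mapping as an assumption. The date predicates are left as they were, to isolate the syntax change.

```sql
-- Query 1 rewritten with JOIN .. ON, mnemonic aliases and qualified columns.
SELECT CONCAT(v.report_year, '-', v.report_month) AS yearmonth,
       AVG(v.cost_leasing / er.rate *
           IF(v.report_year = 2016, 0.73235,
              IF(v.report_year = 2017, 0.83430, 1))) AS average,
       'current' AS `type`
FROM vehicles AS v
JOIN exchange_rates AS er
  ON  er.currency = v.currency
  AND er.`year`   = v.report_year
WHERE v.cid = '3'
  AND v.country IN ('XX','UK')
  AND er.fxid = 2
  AND STR_TO_DATE(CONCAT(v.report_year, '-', v.report_month, '-01'),
                  '%Y-%m-%d') >= '2016-01-01'
  AND LAST_DAY(STR_TO_DATE(CONCAT(v.report_year, '-', v.report_month, '-01'),
                           '%Y-%m-%d')) <= '2017-06-30'
GROUP BY yearmonth
ORDER BY yearmonth;
```

Keeping the join conditions in ON and the filters in WHERE makes it easier to see at a glance which predicates relate the two tables and which merely restrict rows.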
Indexes needed (with the current year/month columns):
b,d: INDEX(fxid, currency, year)
a: INDEX(cid, currency, report_year)
c: INDEX(kid, cid, currency, report_year)
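These could be created with statements along the following lines. The index names are made up for illustration, and the column order follows the list above. It is worth re-running EXPLAIN afterwards to see whether the optimizer actually picks them.

```sql
-- Index names (idx_...) are hypothetical; column order follows the list above.
ALTER TABLE exchange_rates
  ADD INDEX idx_fxid_currency_year (fxid, currency, `year`);

ALTER TABLE vehicles
  ADD INDEX idx_cid_currency_year (cid, currency, report_year);

ALTER TABLE kpis
  ADD INDEX idx_kid_cid_currency_year (kid, cid, currency, report_year);
```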
More on creating indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql