Edit: releases.date is now a DATE type.
I'm running the following query, and it takes about 5 seconds to run. Since this is for a web app, that is too slow to be usable.
SELECT releases.* ,COUNT(charts_extended.release_id) as num
FROM releases_all releases force index (date)
JOIN recommendations
ON releases.id=recommendations.release_id
JOIN charts_extended
ON charts_extended.release_id=releases.id
LEFT JOIN charts_extended ce
ON ce.release_id=releases.id
AND ce.artist='Si Quick'
LEFT JOIN dislike
ON dislike.release_id=releases.id
AND dislike.user='Si Quick'
WHERE dislike.release_id IS NULL
AND ce.release_id IS NULL
AND recommendations.user='Si Quick'
AND datediff(now(),releases.date) >=0
GROUP BY releases.id
ORDER BY releases.date DESC
LIMIT 0,41
EXPLAIN returns the following:
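id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
-: | :---------- | :---- | :--- | :------------ | :-- | :------ | :-- | ---: | :----
1 | SIMPLE | releases | ALL | NULL | NULL | NULL | NULL | 77226 | Using where; Using temporary; Using filesort
1 | SIMPLE | ce | ref | release_id,artist | release_id | 4 | soundshe.releases.id | 4 | Using where; Not exists
1 | SIMPLE | recommendations | ref | user,release_id | release_id | 4 | soundshe.releases.id | 39 | Using where
1 | SIMPLE | dislike | ref | release_id,user | user | 203 | const | 105 | Using where
1 | SIMPLE | charts_extended | ref | release_id | release_id | 4 | soundshe.releases.id | 4 | Using index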
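Using temporary and Using filesort are what slow the query down so much. If I remove the ORDER BY releases.date DESC clause, the query runs in ~1 second.
The data in the releases.date field is in YYYY-MM-DD format, and the column is of type VARCHAR.
How can I speed up the ORDER BY? I do have an index on that field.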
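Answer 0 (score: 4)
Why is it slow?
The performance problem is what I call "inflate-deflate". First, the JOINs "inflate" the number of rows to look at; then the GROUP BY "deflates" them back to (in this case) the number of rows in the original table, or fewer.
So the GROUP BY serves two purposes: aggregation (COUNT) and "deflate". Let's see whether we can avoid the deflate.
Because the query cannot focus on just the 41 desired rows, it gathers the entire releases_all table, sorts it, and only then delivers 41 rows. If there are TEXT or BLOB columns, the bulkiness of this task could be bad.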
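One way to see how much "inflate" there is on real data is to compare the row count after the JOINs with the number of groups that survive the GROUP BY. The following is only a rough sketch; it uses the tables and columns from the question and leaves out the two anti-joins (ce and dislike) for brevity.

```sql
-- rows_after_joins:    rows the JOINs produce (inflated)
-- rows_after_group_by: rows left once GROUP BY releases.id collapses them
-- A large ratio between the two means the GROUP BY is doing a lot of "deflating".
SELECT COUNT(*)                    AS rows_after_joins,
       COUNT(DISTINCT releases.id) AS rows_after_group_by
FROM releases_all AS releases
JOIN recommendations
  ON recommendations.release_id = releases.id
JOIN charts_extended
  ON charts_extended.release_id = releases.id
WHERE recommendations.user = 'Si Quick';
```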
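FORCE is not good
But first, let me explain why force index (date) is useless or even hurts performance. The GROUP BY must be done before the ORDER BY (unless they can be done at the same time, which they can't here). PRIMARY KEY(id) may be used to help the GROUP BY, but nothing after that (namely the ORDER BY) can use an index. So, again, getting rid of the GROUP BY may help.
Outline of a better query
Here is the goal: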
SELECT r2.*, x.num
FROM ( SELECT r1.id, r1.date,
              COUNT(charts_extended.release_id) AS num
       FROM releases_all AS r1
       JOIN ...
       WHERE ...
       GROUP BY r1.id
       ORDER BY r1.date DESC
       LIMIT 41
       OFFSET 0
     ) AS x
JOIN releases_all AS r2 ON r2.id = x.id
ORDER BY r2.date DESC;
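Notes:
- The GROUP BY is gone, at least from the outer query.
- The ORDER BY appears in both the subquery and the outer query (because the optimizer is not obliged to preserve the order of a subquery).
- The OFFSET appears only in the subquery. (I assume you are "paginating". Here is a discussion of why "pagination via OFFSET" is bad, along with a possible improvement: http://mysql.rjweb.org/doc.php/pagination)
- INDEX(date) is needed.
Also, the subquery is the original query, but selecting only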
SELECT r1.id, r1.date
However, before the query can be finished, you need to decide where x.user='Si Quick' lives. Whether it goes in ON or in WHERE makes a difference.
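To illustrate why the placement matters for a LEFT JOIN, here is a minimal sketch. The demo_releases and demo_dislike tables are hypothetical miniatures made up for this example, not the real schema.

```sql
-- Hypothetical miniature tables, for illustration only.
CREATE TEMPORARY TABLE demo_releases (id INT PRIMARY KEY);
CREATE TEMPORARY TABLE demo_dislike  (release_id INT, `user` VARCHAR(20));
INSERT INTO demo_releases VALUES (1), (2), (3);
INSERT INTO demo_dislike  VALUES (1, 'Si Quick'), (2, 'someone else');

-- Filter in ON: the LEFT JOIN keeps every release; releases without a
-- dislike by 'Si Quick' get NULLs, so IS NULL works as an anti-join.
SELECT r.id
FROM demo_releases AS r
LEFT JOIN demo_dislike AS d
       ON d.release_id = r.id
      AND d.`user` = 'Si Quick'
WHERE d.release_id IS NULL
ORDER BY r.id;
-- returns ids 2 and 3

-- Filter in WHERE: the NULL-extended rows fail d.user = 'Si Quick',
-- so the LEFT JOIN behaves like an INNER JOIN.
SELECT r.id
FROM demo_releases AS r
LEFT JOIN demo_dislike AS d
       ON d.release_id = r.id
WHERE d.`user` = 'Si Quick'
ORDER BY r.id;
-- returns id 1
```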
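Possible optimizations
- It may need to touch only 41 rows in the JOINed tables.
- It may still compute all the COUNTs, but it touches only 41 rows when doing SELECT releases.* (see the comment about BLOB).
- INDEX(date, id) may help; add it.
Potential issues
Which 41 rows do you want? Because of all the JOINs, they are not necessarily the "last" 41 (ORDER BY date DESC); you may skip over some of them.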
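I can't finish the task because I don't understand the relationships. Is releases_all:recommendations 1:1? 1:many? Sometimes 1:0? Please tell us, along with the other relationships involved in the JOINs and the LEFT JOINs.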
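The relationships can be checked directly against the data. The queries below are a sketch using only the table and column names from the question; the aliases (recs, n, per_release and so on) are made up for the example.

```sql
-- Does this user ever recommend the same release more than once?
-- Any row returned means releases_all:recommendations is 1:many for the user.
SELECT release_id, COUNT(*) AS recs
FROM recommendations
WHERE `user` = 'Si Quick'
GROUP BY release_id
HAVING COUNT(*) > 1
LIMIT 10;

-- How many charts_extended rows does a release have, at least and at most?
SELECT MIN(n) AS min_rows, MAX(n) AS max_rows
FROM ( SELECT COUNT(*) AS n
       FROM charts_extended
       GROUP BY release_id ) AS per_release;

-- How many releases have no charts_extended row at all (the 1:0 case)?
SELECT COUNT(*) AS releases_without_charts
FROM releases_all AS r
LEFT JOIN charts_extended AS c
       ON c.release_id = r.id
WHERE c.release_id IS NULL;
```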
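I also don't know (yet) whether the subquery can efficiently deliver the "last" 41 rows, or whether it must compute all the COUNTs before picking out just 41.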
Finishing the subquery
Strip out everything except COUNT(charts_extended.release_id) and see how simple the SELECT that gets the COUNT becomes. You may not need to touch releases_all at all.
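As a rough sketch of what such a stripped-down query might look like: it reads only recommendations and charts_extended. It assumes every recommendations.release_id has a matching row in releases_all, and it leaves out the date condition and the two anti-joins, so it is a starting point rather than a drop-in replacement.

```sql
-- Per-release chart counts for the releases recommended to 'Si Quick',
-- computed without reading releases_all.
-- Assumption: every recommendations.release_id exists in releases_all.
SELECT ce.release_id,
       COUNT(*) AS num
FROM recommendations AS rec
JOIN charts_extended AS ce
  ON ce.release_id = rec.release_id
WHERE rec.`user` = 'Si Quick'
GROUP BY ce.release_id;
```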
Present inefficiency
Also, don't "hide an indexed column inside a function". That is, instead of
AND datediff(now(),releases.date) >=0
use
AND releases.date <= CURDATE()
(This also fixes the DATE vs DATETIME problem.)
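The difference can be inspected with EXPLAIN. This is a sketch only: the index name date_id is an arbitrary choice, and the plans you see will depend on your MySQL version and data, so no expected output is shown.

```sql
-- The composite index suggested above; the name date_id is an assumption.
ALTER TABLE releases_all ADD INDEX date_id (`date`, id);

-- Column wrapped in a function: the optimizer cannot do a range scan
-- on the `date` index for this condition.
EXPLAIN
SELECT id
FROM releases_all
WHERE DATEDIFF(NOW(), `date`) >= 0
ORDER BY `date` DESC
LIMIT 41;

-- Bare column compared with a constant: the same index can serve both
-- the range condition and the ORDER BY.
EXPLAIN
SELECT id
FROM releases_all
WHERE `date` <= CURDATE()
ORDER BY `date` DESC
LIMIT 41;
```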
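Answer 1 (score: 1)
From easiest to hardest and most frustrating:
Select only the fields you need. * can actually add considerable overhead. Just try removing that bit and see how much of an improvement you get.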
SELECT COUNT(charts_extended.release_id) as num
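Consider using an integer field for the index, because dates can repeat. If the date really is a date rather than a datetime, the index is even worse. Judging by your EXPLAIN output, I don't think the index is actually doing anything.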
FROM releases_all releases force index (date)
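Make sure everything in the WHERE clause has an index.
Pass a datetime in manually here instead of using now(), so that the result can be cached (queries that call now() are never served from the query cache); with now(), the result set is cooked from scratch every time. You are looking for things in the past, so you can do this with an explicit date of tomorrow, because music albums/records are usually released on, say, Tuesday, September 28, not on Tuesday, September 28 at 09:00 AM. That way the work is done once rather than on every call. This of course assumes you use parameters in the SQL statement.
In the meantime, you can try inserting the date manually and running the query twice to see what I mean: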
AND datediff("2017-06-28 00:00:00",releases.date) >=0
GROUP BY releases.id
ORDER BY releases.date DESC
LIMIT 0,41
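One way to pass the date as a parameter from plain SQL is a server-side prepared statement. This is a minimal sketch: stmt and @cutoff are illustrative names, the query is cut down to the date condition, and it compares the column directly instead of going through datediff, which is a substitution made for this example.

```sql
-- Pass the cutoff date as a parameter instead of calling NOW().
-- stmt and @cutoff are illustrative names.
PREPARE stmt FROM
  'SELECT releases.id, releases.`date`
   FROM releases_all AS releases
   WHERE releases.`date` <= ?
   ORDER BY releases.`date` DESC
   LIMIT 41';
SET @cutoff = '2017-06-28';
EXECUTE stmt USING @cutoff;
DEALLOCATE PREPARE stmt;
```

In an application, the same idea is normally expressed through the client library's placeholder syntax rather than PREPARE/EXECUTE.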
Here is your new, improved query with the date passed in explicitly; run it twice to see whether there is any improvement:
SELECT COUNT(charts_extended.release_id) as num
FROM releases_all releases
JOIN recommendations
ON releases.id=recommendations.release_id
JOIN charts_extended
ON charts_extended.release_id=releases.id
LEFT JOIN charts_extended ce
ON ce.release_id=releases.id
AND ce.artist='Si Quick'
LEFT JOIN dislike
ON dislike.release_id=releases.id
AND dislike.user='Si Quick'
WHERE dislike.release_id IS NULL
AND ce.release_id IS NULL
AND recommendations.user='Si Quick'
AND datediff("2017-06-28 00:00:00",releases.date) >0
GROUP BY releases.id
ORDER BY releases.date DESC
LIMIT 0,41
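Answer 2 (score: 0)
Without access to your tables and their contents, I had to try to set up a relevant test case. For each table I set up the secondary index I thought would suit your query. I would also suggest using NOT EXISTS instead of a LEFT OUTER JOIN for dislike.
I used a series of queries, each paired with its explain plan, so that you can track the effect of each additional part of the query.
Whether this is any faster, I cannot know unless you try it.
Because my test data is so small, the whole table may be getting used anyway.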
create table if not exists releases_all (
    id mediumint not null auto_increment,
    `date` date not null,
    name varchar(80),
    blurb varchar(200),
    PRIMARY KEY (id)
);
CREATE INDEX idx_release_date ON releases_all (`date`);
insert into releases_all (`date`, name, blurb) values
    (STR_TO_DATE('2014-01-15', '%Y-%m-%d'), 'blah', 'ub1'),
    (STR_TO_DATE('2014-05-05', '%Y-%m-%d'), 'blah', 'ub2'),
    (STR_TO_DATE('2015-02-25', '%Y-%m-%d'), 'blah', 'ub1'),
    (STR_TO_DATE('2015-05-02', '%Y-%m-%d'), 'blah', 'ub2'),
    (STR_TO_DATE('2016-04-08', '%Y-%m-%d'), 'blah', 'ub1'),
    (STR_TO_DATE('2016-07-15', '%Y-%m-%d'), 'blah', 'ub2'),
    (STR_TO_DATE('2016-03-01', '%Y-%m-%d'), 'blah', 'ub3'),
    (STR_TO_DATE('2017-02-28', '%Y-%m-%d'), 'blah', 'ub2'),
    (STR_TO_DATE('2017-06-19', '%Y-%m-%d'), 'blah', 'ub3');
create table if not exists recommendations (
    id mediumint not null auto_increment,
    release_id mediumint not null,
    `user` varchar(20),
    PRIMARY KEY (id)
);
CREATE INDEX idx_user_release ON recommendations (`user`, release_id);
insert into recommendations (release_id, `user`) values
    (1, 'user1'), (2, 'user1'), (3, 'user1'),
    (4, 'user1'), (5, 'user1'), (6, 'user1'),
    (7, 'user1'), (8, 'user1'), (9, 'user1');
create table if not exists charts_extended (
    id mediumint not null auto_increment,
    release_id mediumint not null,
    artist varchar(20),
    PRIMARY KEY (id)
);
CREATE INDEX idx_artist_release ON charts_extended (artist, release_id);
insert into charts_extended (release_id, artist) values
    (1, 'Si Quick'), (2, 'Si Quick'), (3, 'Si Quick'),
    (4, 'Si Quick'), (5, 'Si Quick'), (6, 'Si Quick'),
    (7, 'Si Quick'), (8, 'Si Quick'), (9, 'Si Quick');
create table if not exists dislike (
    id mediumint not null auto_increment,
    release_id mediumint not null,
    `user` varchar(20),
    PRIMARY KEY (id)
);
CREATE INDEX idx_user_release ON dislike (`user`, release_id);
insert into dislike (release_id, `user`) values
    (1, 'Si Quick'), (2, 'Si Quick'), (3, 'Si Quick'), (4, 'Si Quick');
SELECT releases.*
FROM releases_all releases force index (idx_release_date)
where NOT EXISTS (SELECT NULL
                  FROM dislike
                  WHERE dislike.release_id=releases.id
                    AND dislike.user='Si Quick')
ORDER BY releases.`date`;

id | date | name | blurb
-: | :--------- | :--- | :----
7 | 2016-03-01 | blah | ub3
5 | 2016-04-08 | blah | ub1
6 | 2016-07-15 | blah | ub2
8 | 2017-02-28 | blah | ub2
9 | 2017-06-19 | blah | ub3

explain extended
SELECT releases.*
FROM releases_all releases force index (idx_release_date)
where NOT EXISTS (SELECT NULL
                  FROM dislike
                  WHERE dislike.release_id=releases.id
                    AND dislike.user='Si Quick')
ORDER BY releases.`date`;

id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :----------- | :------- | :---- | :--------------- | :--------------- | :------ | :---- | ---: | -------: | :-----------------------
1 | PRIMARY | releases | index | null | idx_release_date | 3 | null | 9 | 100.00 | Using where
2 | MATERIALIZED | dislike | ref | idx_user_release | idx_user_release | 23 | const | 4 | 100.00 | Using where; Using index
SELECT releases.*
FROM releases_all releases force index (idx_release_date)
LEFT JOIN charts_extended ce
       ON ce.release_id=releases.id
      AND ce.artist='Si Quick'
where NOT EXISTS (SELECT NULL
                  FROM dislike
                  WHERE dislike.release_id=releases.id
                    AND dislike.user='Si Quick')
ORDER BY releases.`date`;

id | date | name | blurb
-: | :--------- | :--- | :----
7 | 2016-03-01 | blah | ub3
5 | 2016-04-08 | blah | ub1
6 | 2016-07-15 | blah | ub2
8 | 2017-02-28 | blah | ub2
9 | 2017-06-19 | blah | ub3

explain extended
SELECT releases.*
FROM releases_all releases force index (idx_release_date)
LEFT JOIN charts_extended ce
       ON ce.release_id=releases.id
      AND ce.artist='Si Quick'
where NOT EXISTS (SELECT NULL
                  FROM dislike
                  WHERE dislike.release_id=releases.id
                    AND dislike.user='Si Quick')
ORDER BY releases.`date`;

id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :----------- | :------- | :--- | :----------------- | :----------------- | :------ | :-------------------------------------------- | ---: | -------: | :--------------------------
1 | PRIMARY | releases | ALL | null | null | null | null | 9 | 100.00 | Using where; Using filesort
1 | PRIMARY | ce | ref | idx_artist_release | idx_artist_release | 26 | const,fiddle_NNENRNYWCZCRQJZAHBRY.releases.id | 1 | 100.00 | Using where; Using index
2 | MATERIALIZED | dislike | ref | idx_user_release | idx_user_release | 23 | const | 4 | 100.00 | Using where; Using index
SELECT releases.*
FROM releases_all releases force index (idx_release_date)
INNER JOIN recommendations rec
        ON releases.id = rec.release_id
LEFT JOIN charts_extended ce
       ON releases.id = ce.release_id
      AND ce.artist = 'Si Quick'
where NOT EXISTS (SELECT NULL
                  FROM dislike
                  WHERE releases.id = dislike.release_id
                    AND dislike.user = 'Si Quick')
  and rec.user = 'user1'
ORDER BY releases.`date`;

id | date | name | blurb
-: | :--------- | :--- | :----
7 | 2016-03-01 | blah | ub3
5 | 2016-04-08 | blah | ub1
6 | 2016-07-15 | blah | ub2
8 | 2017-02-28 | blah | ub2
9 | 2017-06-19 | blah | ub3

explain extended
SELECT releases.*
FROM releases_all releases force index (idx_release_date)
INNER JOIN recommendations rec
        ON releases.id = rec.release_id
LEFT JOIN charts_extended ce
       ON releases.id = ce.release_id
      AND ce.artist = 'Si Quick'
where NOT EXISTS (SELECT NULL
                  FROM dislike
                  WHERE releases.id = dislike.release_id
                    AND dislike.user = 'Si Quick')
  and rec.user = 'user1'
ORDER BY releases.`date`;

id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :----------- | :------- | :--- | :----------------- | :----------------- | :------ | :-------------------------------------------- | ---: | -------: | :--------------------------
1 | PRIMARY | releases | ALL | null | null | null | null | 9 | 100.00 | Using where; Using filesort
1 | PRIMARY | rec | ref | idx_user_release | idx_user_release | 26 | const,fiddle_NNENRNYWCZCRQJZAHBRY.releases.id | 1 | 100.00 | Using where; Using index
1 | PRIMARY | ce | ref | idx_artist_release | idx_artist_release | 26 | const,fiddle_NNENRNYWCZCRQJZAHBRY.releases.id | 1 | 100.00 | Using where; Using index
2 | MATERIALIZED | dislike | ref | idx_user_release | idx_user_release | 23 | const | 4 | 100.00 | Using where; Using index
dbfiddle here
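Carrying the NOT EXISTS idea through to the remaining conditions might look like the sketch below. It is an assumption about where this series of queries was heading, not something tested against the real data. It runs against the test tables above (hence rec.user = 'user1'), replaces the JOIN plus GROUP BY with a correlated COUNT, and therefore counts each charts_extended row once even if recommendations contains duplicates, which differs from the original query.

```sql
SELECT releases.*,
       ( SELECT COUNT(*)
         FROM charts_extended AS c
         WHERE c.release_id = releases.id ) AS num
FROM releases_all AS releases
WHERE releases.`date` <= CURDATE()
      -- recommended to the user (was: JOIN recommendations)
  AND EXISTS     ( SELECT NULL FROM recommendations AS rec
                   WHERE rec.release_id = releases.id
                     AND rec.`user` = 'user1' )
      -- has at least one chart row (was: JOIN charts_extended)
  AND EXISTS     ( SELECT NULL FROM charts_extended AS c2
                   WHERE c2.release_id = releases.id )
      -- not already charted by this artist (was: LEFT JOIN ce ... IS NULL)
  AND NOT EXISTS ( SELECT NULL FROM charts_extended AS ce
                   WHERE ce.release_id = releases.id
                     AND ce.artist = 'Si Quick' )
      -- not disliked by this user (was: LEFT JOIN dislike ... IS NULL)
  AND NOT EXISTS ( SELECT NULL FROM dislike
                   WHERE dislike.release_id = releases.id
                     AND dislike.`user` = 'Si Quick' )
ORDER BY releases.`date` DESC
LIMIT 41;
```

With the test data above, every release has a 'Si Quick' row in charts_extended, so this returns no rows; the point is the shape of the query. Without a GROUP BY, the optimizer may be able to read the date index in order and stop once 41 rows qualify, but that needs to be confirmed with EXPLAIN on the real tables.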
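Answer 3 (score: -1)
Please try this query. I have moved the 'AND' conditions from the main 'WHERE' clause into their respective join conditions. This helps the LEFT joins filter rows, so that only the required rows are processed, and so the 'ORDER BY' only has to process the required rows. I hope that helps.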
SELECT releases.* ,COUNT(charts_extended.release_id) as num
FROM releases_all releases force index (date)
JOIN recommendations ON releases.id=recommendations.release_id AND recommendations.user='Si Quick'
JOIN charts_extended ON charts_extended.release_id=releases.id
LEFT JOIN charts_extended ce ON ce.release_id=releases.id AND ce.artist='Si Quick' AND ce.release_id IS NULL
LEFT JOIN dislike ON dislike.release_id=releases.id AND dislike.user='Si Quick' AND dislike.release_id IS NULL
WHERE datediff(now(),releases.date) >=0
GROUP BY releases.id
ORDER BY releases.date DESC
LIMIT 0,41
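Before relying on this rewrite, it may be worth checking that it returns the same rows as the original, because an IS NULL test inside the ON clause of a LEFT JOIN does not behave like the same test in WHERE. The sketch below uses two hypothetical miniature tables (demo_r and demo_d) to show the difference.

```sql
-- Hypothetical miniature tables, for illustration only.
CREATE TEMPORARY TABLE demo_r (id INT PRIMARY KEY);
CREATE TEMPORARY TABLE demo_d (release_id INT, `user` VARCHAR(20));
INSERT INTO demo_r VALUES (1), (2);
INSERT INTO demo_d VALUES (1, 'Si Quick');

-- IS NULL in WHERE: release 1 is disliked, so only release 2 comes back.
SELECT r.id
FROM demo_r AS r
LEFT JOIN demo_d AS d
       ON d.release_id = r.id
      AND d.`user` = 'Si Quick'
WHERE d.release_id IS NULL
ORDER BY r.id;
-- returns id 2

-- IS NULL in ON: "d.release_id = r.id AND d.release_id IS NULL" can never
-- be true, so nothing matches and the LEFT JOIN keeps every release.
SELECT r.id
FROM demo_r AS r
LEFT JOIN demo_d AS d
       ON d.release_id = r.id
      AND d.`user` = 'Si Quick'
      AND d.release_id IS NULL
ORDER BY r.id;
-- returns ids 1 and 2
```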