MySQL - LEFT JOINs taking too long, how to optimize the query?

Date: 2018-04-25 21:51:29

Tags: mysql sql select left-join innodb

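A leader may have many followers. The notification_followers table gets a single notification when a leader adds a post, with the entry leader_id 1 and notifiable_id 0 (ids 1 and 2 in the table). The same table gets a single notification when the current user 14 is followed by someone, with the entry leader_id 0 and notifiable_id 14 (id 3 in the table).

notification_followers (id is PRIMARY, every field except data has its own index)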

| id | uuid               | leader_id | notifiable_id | data   | created_at
------------------------------------------------------------------------------------
| 1  | 001w2cwfoqzp8F3... | 1         | 0             | Post A | 2018-04-19 00:00:00
| 2  | lvbuX4d5qCHJUIN... | 1         | 0             | Post B | 2018-04-20 00:00:00
| 3  | eEq5r5g5jApkKgd... | 0         | 14            | Follow | 2018-04-21 00:00:00

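All follower-related notifications are now in one place, which is perfect.

We now need to check whether user 14 is a follower of leader_id 1, to know whether to show them notifications 1 and 2. To do that, we scan the user_follows table to see whether the logged-in user exists as a follower_id for that leader_id, so that they learn about the notification, but only if they followed the leader before the notification was posted (new followers should not receive notifications for older posts when they follow a user, only for new ones).

user_follows (id is PRIMARY, every field has its own index)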

| id | leader_id | follower_id | created_at
----------------------------------------------------
| 1  | 1         | 14         |  2018-04-18 00:00:00 // followed before, has notifs
| 2  | 1         | 15         |  2018-04-22 00:00:00 // followed after, no notifs

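A final thing to note is that the user should know whether a notification has been read; this is where the notification_followers_read table comes in. It stores the follower_id together with the notification_uuid for every read notification, along with a read_at timestamp.

notification_followers_read (composite index on notification_uuid, follower_id)

| notification_uuid | follower_id | read_at
--------------------------------------------------------
| qIXE97AP49muZf... | 17          | 2018-04-21 00:00:00 // not for 14, we ignore it

We now want to return the latest 10 notifications for user 14, ordered by the auto-increment nf.id descending. They should see all 3 notifications from notification_followers, since none of them has been read by this user yet: the first two because they followed the leader before the leader made the posts, and the third because they were followed and its notifiable_id is 14.

Here is the query, which works but takes too long (~9 seconds):

SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf
LEFT JOIN user_follows uf ON uf.leader_id = nf.leader_id AND uf.follower_id = 14
LEFT JOIN notification_followers_read nfr ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
WHERE (nf.created_at > uf.created_at OR notifiable_id = 14)
ORDER BY nf.id DESC
LIMIT 10

notification_followers has about 100K records and we are using InnoDB. Here is the EXPLAIN for the query:

[Image: EXPLAIN output for the query]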

How can we optimize the query so that it runs in a few milliseconds?

UPDATE WITH UNION

Here is the UNION query; I have also included the EXPLAIN for each subquery separately:

(SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf
LEFT JOIN user_follows uf ON uf.leader_id = nf.leader_id AND uf.follower_id = 14 AND nf.created_at > uf.created_at
LEFT JOIN notification_followers_read nfr ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
ORDER BY nf.id DESC
LIMIT 10)
UNION DISTINCT
(SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf
LEFT JOIN notification_followers_read nfr ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
WHERE nf.notifiable_id = 14
ORDER BY nf.id DESC
LIMIT 10)
ORDER BY id desc
LIMIT 10

[Image: EXPLAIN output for the UNION query and for each subquery]

UPDATE WITH SQL DUMP

SQL DUMP TO REPRODUCE LOCALLY: just create the database locally and import the file to see the slow-query problem with all the table data (~100K rows).

3 Answers:

Answer 0 (score: 0)

Your query is:

SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf LEFT JOIN
     user_follows uf
     ON uf.leader_id = nf.leader_id AND uf.follower_id = 14 LEFT JOIN
     notification_followers_read nfr
     ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
WHERE nf.created_at > uf.created_at OR nf.notifiable_id = 14
ORDER BY nf.id DESC
LIMIT 10;

This is a bit hard. The OR clause is a real killer. But based on your logic, I think you want AND rather than OR:

SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf LEFT JOIN
     user_follows uf
     ON uf.leader_id = nf.leader_id AND nf.created_at > uf.created_at AND 
        uf.follower_id = 14 LEFT JOIN
     notification_followers_read nfr
     ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
WHERE nf.notifiable_id = 14
ORDER BY nf.id DESC
LIMIT 10;

(Note that the created_at comparison has moved into the ON clause.)

The obvious indexes are: notification_followers(notifiable_id, leader_id, created_at), user_follows(leader_id, follower_id, created_at), and notification_followers_read(notification_uuid, follower_id).
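As DDL, the first two suggestions could look like the sketch below. The index names are hypothetical, and the statements assume the table and column names shown in the question. According to the question, notification_followers_read already has a composite index on (notification_uuid, follower_id), so it is left out here.

```sql
-- Sketch only: index names are made up for illustration.
-- Column lists follow the composite indexes suggested in this answer.
ALTER TABLE notification_followers
  ADD INDEX idx_nf_notifiable_leader_created (notifiable_id, leader_id, created_at);

ALTER TABLE user_follows
  ADD INDEX idx_uf_leader_follower_created (leader_id, follower_id, created_at);
```

Each existing single-column index stays in place; whether the optimizer prefers the composite ones is something EXPLAIN would have to confirm on the real data.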

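Answer 1 (score: 0)

OR often causes performance problems because it makes it hard to use indexes. Split the query into two separate cases and combine them with UNION.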

(SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf
LEFT JOIN user_follows uf ON uf.leader_id = nf.leader_id AND uf.follower_id = 14 AND nf.created_at > uf.created_at
LEFT JOIN notification_followers_read nfr ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
ORDER BY nf.id DESC
LIMIT 10)

UNION ALL

(SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf
LEFT JOIN notification_followers_read nfr ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
WHERE nf.notifiable_id = 14
ORDER BY nf.id DESC
LIMIT 10)

ORDER BY id desc
LIMIT 10
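One caveat with the UNION approach, offered as a sketch rather than a tested fix: in the first branch, a LEFT JOIN to user_follows with no filter on uf keeps every row of notification_followers, so that branch returns the 10 newest notifications whether or not user 14 follows the leader. In the original query, the WHERE comparison nf.created_at > uf.created_at is never true when uf is NULL, so an inner join appears to be the closer match. The variant below assumes that reading; UNION DISTINCT is used so that a row matching both branches is returned once.

```sql
-- Sketch only: same tables and columns as in the question, user id 14 hard-coded.
(SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
 FROM notification_followers nf
 -- INNER JOIN: keep only notifications from leaders that user 14
 -- followed before the notification was created.
 JOIN user_follows uf
   ON uf.leader_id = nf.leader_id
  AND uf.follower_id = 14
  AND nf.created_at > uf.created_at
 LEFT JOIN notification_followers_read nfr
   ON nfr.notification_uuid = nf.uuid AND nfr.follower_id = 14
 ORDER BY nf.id DESC
 LIMIT 10)
UNION DISTINCT
-- Second branch: notifications addressed directly to user 14.
(SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
 FROM notification_followers nf
 LEFT JOIN notification_followers_read nfr
   ON nfr.notification_uuid = nf.uuid AND nfr.follower_id = 14
 WHERE nf.notifiable_id = 14
 ORDER BY nf.id DESC
 LIMIT 10)
ORDER BY id DESC
LIMIT 10;
```

Each branch is limited to 10 rows before the merge, so the outer ORDER BY and LIMIT sort at most 20 rows.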

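Answer 2 (score: 0)

I reproduced the environment on my PC using the dump file you provided. Without any schema changes, the original query initially took 0.8 seconds to execute. Maybe the difference in time is because my database runs on an SSD?

In any case, after adding the following indexes, the execution time dropped to 50 ms.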

ALTER TABLE `notification_followers` ADD INDEX `notification_followe_idx_id_uuid_at_id_data` (`leader_id`,`uuid`,`created_at`,`id`,`data`(255));
ALTER TABLE `notification_followers_read` ADD INDEX `notification_followe_idx_id_uuid_at` (`follower_id`,`notification_uuid`,`read_at`);
ALTER TABLE `user_follows` ADD INDEX `user_follows_idx_id_id_at` (`follower_id`,`leader_id`,`created_at`);
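To check whether MySQL actually picks up the new indexes, one option is to list them and re-run EXPLAIN on the question's query. A minimal sketch using only the tables and columns from the question:

```sql
-- List the indexes now defined on each table.
SHOW INDEX FROM notification_followers;
SHOW INDEX FROM notification_followers_read;
SHOW INDEX FROM user_follows;

-- Re-run EXPLAIN on the original query (user id 14 as in the question).
EXPLAIN
SELECT nf.id, nf.uuid, nf.leader_id, nf.data, nf.created_at, nfr.read_at
FROM notification_followers nf
LEFT JOIN user_follows uf
  ON uf.leader_id = nf.leader_id AND uf.follower_id = 14
LEFT JOIN notification_followers_read nfr
  ON nf.uuid = nfr.notification_uuid AND nfr.follower_id = 14
WHERE (nf.created_at > uf.created_at OR nf.notifiable_id = 14)
ORDER BY nf.id DESC
LIMIT 10;
```

If the key column of the EXPLAIN output names one of the new indexes for a table, the optimizer is using it for that table.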