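The purpose of this query is to list the distinct users a given person is connected to (i.e., the users that user ID 256 follows or is followed by), excluding anyone who is blocked by, or is blocking, the user making the request (user ID 2).
The relationships table is very simple. The status column can hold one of two values, 'following' or 'blocked':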
mysql> describe relationships;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| follower_id | int(11) | YES | MUL | NULL | |
| followee_id | int(11) | YES | MUL | NULL | |
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| status | varchar(191) | YES | MUL | NULL | |
+-------------+--------------+------+-----+---------+----------------+
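This query currently takes about 58 seconds to complete! User 256 has only 1,500 connections. For context, there are about 10,000 user rows and 5,500 relationship rows.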
SELECT DISTINCT `users`.*,
-- "followed" is just a flag indicating if user #2 is currently following a given user
(
SELECT COUNT(*) FROM `relationships`
WHERE `relationships`.`followee_id` = `users`.`id`
AND `relationships`.`follower_id` = 2
) AS 'followed'
FROM `users`
INNER JOIN `relationships`
ON (
(`users`.`id` = `relationships`.`follower_id`
AND `relationships`.`followee_id` = 256
)
OR (`users`.`id` = `relationships`.`followee_id`
AND `relationships`.`follower_id` = 256
)
)
WHERE `relationships`.`status` = 'following'
AND (
-- Ensure we don't return users who are blocked by user #2
`users`.`id` NOT IN (
SELECT `relationships`.`followee_id`
FROM `relationships`
WHERE `relationships`.`follower_id` = 2
AND `relationships`.`status` = 'blocked'
)
)
AND (
-- Ensure we don't return users who are blocking user #2
`users`.`id` NOT IN (
SELECT `relationships`.`follower_id`
FROM `relationships`
WHERE `relationships`.`followee_id` = 2
AND `relationships`.`status` = 'blocked'
)
)
ORDER BY `users`.`id` ASC
LIMIT 10
These are the current indexes on relationships:
mysql> show index from relationships;
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| relationships | 0 | PRIMARY | 1 | id | A | 3002 | NULL | NULL | | BTREE | | |
| relationships | 0 | index_relationships_on_status_and_follower_id_and_followee_id | 1 | status | A | 2 | NULL | NULL | YES | BTREE | | |
| relationships | 0 | index_relationships_on_status_and_follower_id_and_followee_id | 2 | follower_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 0 | index_relationships_on_status_and_follower_id_and_followee_id | 3 | followee_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_followee_id | 1 | followee_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_follower_id | 1 | follower_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_status_and_followee_id_and_follower_id | 1 | status | A | 2 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_status_and_followee_id_and_follower_id | 2 | followee_id | A | 3002 | NULL | NULL | YES | BTREE | | |
| relationships | 1 | index_relationships_on_status_and_followee_id_and_follower_id | 3 | follower_id | A | 3002 | NULL | NULL | YES | BTREE | | |
+---------------+------------+---------------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
EXPLAIN output:
mysql> EXPLAIN SELECT DISTINCT `users`.*, (SELECT COUNT(*) FROM `relationships` WHERE `relationships`.`followee_id` = `users`.`id` AND `relationships`.`follower_id` = 2) AS 'followed' FROM `users` INNER JOIN `relationships` ON(`users`.`id` = `relationships`.`follower_id` AND `relationships`.`followee_id` = 256) OR (`users`.`id` = `relationships`.`followee_id` AND `relationships`.`follower_id` = 256) WHERE `relationships`.`status` = 'following' AND (`users`.`id` NOT IN (SELECT `relationships`.`followee_id` FROM `relationships` WHERE `relationships`.`follower_id` = 2 AND `relationships`.`status` = 'blocked')) AND (`users`.`id` NOT IN (SELECT `relationships`.`follower_id` FROM `relationships` WHERE `relationships`.`followee_id` = 2 AND `relationships`.`status` = 'blocked')) ORDER BY `users`.`id` ASC LIMIT 10;
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
| 1 | PRIMARY | relationships | index_merge | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_followee_id,index_relationships_on_follower_id | 5,5 | NULL | 2 | Using union(index_relationships_on_followee_id,index_relationships_on_follower_id); Using where; Using temporary; Using filesort |
| 1 | PRIMARY | users | ALL | PRIMARY | NULL | NULL | NULL | 1534 | Range checked for each record (index map: 0x1) |
| 4 | SUBQUERY | relationships | ref | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_status_and_follower_id_and_followee_id | 767 | const | 1 | Using where; Using index |
| 3 | SUBQUERY | relationships | ref | index_relationships_on_status_and_follower_id_and_followee_id,index_relationships_on_followee_id,index_relationships_on_follower_id,index_relationships_on_status_and_followee_id_and_follower_id | index_relationships_on_status_and_follower_id_and_followee_id | 772 | const,const | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | relationships | ref | index_relationships_on_followee_id,index_relationships_on_follower_id | index_relationships_on_followee_id | 5 | development.users.id | 1 | Using where |
+----+--------------------+---------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+---------+-------------------------------+------+----------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.01 sec)
Answer 0 (score: 1)
It is hard to give a specific answer without testing this, but I think this part of the query is the problem:
SELECT DISTINCT `users`.*, (
SELECT COUNT(*) FROM `relationships`
WHERE `relationships`.`followee_id` = `users`.`id`
AND `relationships`.`follower_id` = 2
) AS 'followed'
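You are also using ORDER BY. Remove the DISTINCT and the ORDER BY and see whether that speeds things up. I know it changes the query, but I suspect the DISTINCT is essentially building a bunch of temporary tables and throwing them away for every row it has to check. Take a look here: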
http://dev.mysql.com/doc/refman/5.7/en/distinct-optimization.html
COUNT can be slow. Make sure the count runs against the fastest column. See this:
https://www.percona.com/blog/2007/04/10/count-vs-countcol/
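Since followed is only used as a flag, one alternative (a swapped-in technique, not something the linked article covers) is to ask whether a matching row exists instead of counting rows. A minimal sketch, assuming the schema from the question; the aliases u and r are introduced here for illustration. In MySQL, EXISTS in the select list evaluates to 1 or 0:

```sql
-- Sketch only: "followed" is 1 if user #2 has a relationship row pointing
-- at the user, 0 otherwise. Like the original COUNT(*), this does not
-- filter on status, so a 'blocked' row also sets the flag.
SELECT u.*,
       EXISTS (
         SELECT 1
         FROM relationships AS r
         WHERE r.followee_id = u.id
           AND r.follower_id = 2
       ) AS followed
FROM users AS u
ORDER BY u.id
LIMIT 10;
```

Whether this is actually faster than COUNT(*) on this data set would need to be measured; the point is only that the flag never needs more than one matching row.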
A good way to think about SQL is in SETS. Fortunately, MySQL supports subqueries.
https://dev.mysql.com/doc/refman/5.7/en/from-clause-subqueries.html
Some pseudo-SQL follows:
select user_id
from relationships as follower, relationships as followee
where ...
In the above we have two sets that we can operate on. With subqueries, this gets very interesting:
select user_id
from (select user_id as f1 from relationships where ...) as follower,
(select user_id as f2 from relationships where ...) as followee
where ...
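I have always found something like the above a simple way to think about self-referencing tables.
Answer 1 (score: 1)
It is hard to say exactly how to optimize the query and the schema. First, some general tips:
In any case, I have prepared a sample fiddle with proposed optimizations; I renamed the columns to reduce confusion.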
The final query can be written as follows (it uses the renamed columns from the fiddle; user_id appears to stand in for followee_id):
select *
from users a
where
(
id in (select follower_id as id from relationships USE INDEX (user_id) where user_id = 256 and status = 'following')
or id in (select user_id from relationships USE INDEX (follower_id) where follower_id = 256 and status = 'following')
)
and id not in (select follower_id from relationships USE INDEX (user_id) where user_id = 2 and status = 'blocked')
and id not in (select user_id from relationships USE INDEX (follower_id) where follower_id = 2 and status = 'blocked')
Benchmark it regardless of what the execution plan says; real performance on real data may differ.
Answer 2 (score: 0)
Don't use IN ( SELECT ... ); it is poorly optimized. Instead, use either a JOIN or EXISTS ( SELECT ... ).
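As a rough illustration of the EXISTS form, the two block filters from the question could be expressed like this. It is a sketch against the schema shown in the question, not a tested replacement; the aliases u and b are introduced here for illustration:

```sql
-- Sketch only: users who have neither been blocked by user #2
-- nor are blocking user #2.
SELECT u.*
FROM users AS u
WHERE NOT EXISTS (          -- user #2 has not blocked u
        SELECT 1
        FROM relationships AS b
        WHERE b.follower_id = 2
          AND b.followee_id = u.id
          AND b.status = 'blocked'
      )
  AND NOT EXISTS (          -- u has not blocked user #2
        SELECT 1
        FROM relationships AS b
        WHERE b.followee_id = 2
          AND b.follower_id = u.id
          AND b.status = 'blocked'
      );
```

A side benefit: follower_id and followee_id are nullable in this table, and NOT IN returns no rows at all if its subquery yields a NULL, whereas NOT EXISTS is unaffected by NULLs.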
Turning the OR into a UNION is a good trick, but not if it is still inside an IN(...).
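One way to apply the UNION outside of IN(...) is to build the set of connected user IDs in a derived table and join users to it. A minimal sketch, assuming the question's schema; the alias c and the column name user_id are introduced here for illustration:

```sql
-- Sketch only: everyone who follows user #256 or is followed by user #256.
SELECT u.*
FROM (
    SELECT follower_id AS user_id
    FROM relationships
    WHERE followee_id = 256
      AND status = 'following'
    UNION   -- UNION (not UNION ALL) removes duplicates, e.g. mutual follows
    SELECT followee_id
    FROM relationships
    WHERE follower_id = 256
      AND status = 'following'
) AS c
JOIN users AS u ON u.id = c.user_id
ORDER BY u.id
LIMIT 10;
```

Because UNION already de-duplicates the ID list, the outer query should not need DISTINCT, and each branch can use an index that leads with status and the filtered column.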
(For readability, omit the table name when there is only one table. Also, rename followee_id and/or follower_id; their spellings are too close to each other.)