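activities_log has 3.3 million rows and posts has 20K rows.
Some queries that join them take more than 15 seconds! (╥﹏╥)
What am I doing wrong? What can I do to optimize this?
It is running on this server: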
# QUERY:
select `posts`.`page_id` from `activities_log` left join `posts` on `posts`.`id` = `activities_log`.`post_id`;
3345753 rows in set (17.40 sec)
# EXPLAIN:
+----+-------------+----------------+------------+--------+---------------+------------------------------+---------+------------------------------------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------------+------------+--------+---------------+------------------------------+---------+------------------------------------+---------+----------+-------------+
| 1 | SIMPLE | activities_log | NULL | index | NULL | activities_log_post_id_index | 145 | NULL | 3203032 | 100.00 | Using index |
| 1 | SIMPLE | posts | NULL | eq_ref | PRIMARY | PRIMARY | 144 | prod_api_v1.activities_log.post_id | 1 | 100.00 | NULL |
+----+-------------+----------------+------------+--------+---------------+------------------------------+---------+------------------------------------+---------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
select count(*) from `activities_log`;
+----------+
| count(*) |
+----------+
| 3345770 |
+----------+
1 row in set (1.04 sec)
show index from activities_log;
+----------------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| activities_log | 0 | PRIMARY | 1 | id | A | 2984883 | NULL | NULL | | BTREE | | |
| activities_log | 1 | activities_log_page_id_index | 1 | page_id | A | 343 | NULL | NULL | YES | BTREE | | |
| activities_log | 1 | activities_log_activity_id_index | 1 | activity_id | A | 15 | NULL | NULL | | BTREE | | |
| activities_log | 1 | activities_log_post_id_index | 1 | post_id | A | 43894 | NULL | NULL | YES | BTREE | | |
| activities_log | 1 | activities_log_session_token_index | 1 | session_token | A | 4431 | NULL | NULL | YES | BTREE | | |
| activities_log | 1 | activities_log_user_id_index | 1 | user_id | A | 17908 | NULL | NULL | YES | BTREE | | |
+----------------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.00 sec)
select count(*) from `posts`;
+----------+
| count(*) |
+----------+
| 19999 |
+----------+
1 row in set (0.00 sec)
show index from posts;
+-------+------------+--------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+--------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| posts | 0 | PRIMARY | 1 | id | A | 16647 | NULL | NULL | | BTREE | | |
| posts | 1 | posts_page_id_index | 1 | page_id | A | 324 | NULL | NULL | | BTREE | | |
| posts | 1 | posts_kind_post_id_index | 1 | kind_post_id | A | 8 | NULL | NULL | | BTREE | | |
| posts | 1 | posts_posted_by_index | 1 | posted_by | A | 31 | NULL | NULL | YES | BTREE | | |
+-------+------------+--------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
Edit:
This query is only an example. I don't actually want the page IDs. A real example might be: select * from activities_log left join posts on posts.id = activities_log.post_id where activities_log.page_id = X or posts.page_id = X
Answer 0: (score: 1)
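You should use a composite (redundant) index rather than a separate index on each column.
Table activities_log, columns (post_id, page_id):
post_id goes in the leftmost position because it is the column used in the join clause, followed by page_id. Including page_id lets the query get everything it needs from the index without accessing the table data.
Keep in mind that a query can generally use only a single index for each table involved.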
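The index described above could be created roughly as follows. This is only a sketch: the index name activities_log_post_id_page_id_index is hypothetical (chosen to match the existing naming pattern), the aliases and the EXPLAIN query are illustrative, and whether the optimizer actually picks the index has to be confirmed on the real data.

```sql
-- Composite index: post_id first (the join column), then page_id
-- (so its value can be read from the index). The index name is hypothetical.
ALTER TABLE activities_log
  ADD INDEX activities_log_post_id_page_id_index (post_id, page_id);

-- Re-check the plan. "Using index" in the Extra column for activities_log
-- means that table was read from the index alone.
EXPLAIN
SELECT al.post_id, al.page_id, p.page_id
FROM activities_log al
LEFT JOIN posts p ON p.id = al.post_id;
```

Note that select * cannot be satisfied from this index alone, since it needs every column of activities_log.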
Answer 1: (score: 1)
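I don't know what you are trying to accomplish. You have a left join and are returning only a column from the second table, so the returned values will often be NULL.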
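To see how many of the returned values are actually NULL, a count along these lines might help. It is a sketch with illustrative aliases; it counts log rows with no matching post, which includes rows whose post_id is itself NULL (the index listing shows that column as nullable).

```sql
-- Log rows for which the left join finds no post,
-- i.e. rows where posts.page_id comes back as NULL.
SELECT COUNT(*) AS unmatched_rows
FROM activities_log al
LEFT JOIN posts p ON p.id = al.post_id
WHERE p.id IS NULL;
```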
Despite your query, my best guess is that you want the pages of all posts that have activity. In that case, you could phrase the query as:
select p.page_id
from posts p
where exists (select 1
from activities_log al
where p.id = al.post_id
);
At the very least, this returns a much smaller result set.
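The query above returns one row per post, so a page with several active posts appears several times. If each page should be listed only once (an assumption about the intent, not something stated in the question), a DISTINCT variant might look like this:

```sql
-- Each page_id once, for pages that have at least one post with logged activity.
SELECT DISTINCT p.page_id
FROM posts p
WHERE EXISTS (
    SELECT 1
    FROM activities_log al
    WHERE al.post_id = p.id  -- lookup can be served by activities_log_post_id_index
);
```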