Question

I have a simple query that takes more than 14 seconds.

select 
     e.title, e.date, v.name, v.city, v.region, v.country

from seminar e force index for join (venueid) 
     left join venues v on e.venueid = v.id 

where v.country = 'US'
     and v.city = 'New York' 
     and v.region = 'NY'
     and e.date > curdate() 
     and e.someid != 0

Note: count(e.id) was only shorthand used for debugging. In practice we fetch columns from both tables.

EXPLAIN gives this:

+----+-------------+-------+-------------+-----------------------------+-------------+---------+------+------+-------------------------------------------+
| id | select_type | table | type        | possible_keys               | key         | key_len | ref  | rows | Extra                                     |
+----+-------------+-------+-------------+-----------------------------+-------------+---------+------+------+-------------------------------------------+
|  1 | SIMPLE      | v     | index_merge | PRIMARY,city,country,region | city,region | 378,378 | NULL |    2 | Using intersect(city,region); Using where |
|  1 | SIMPLE      | e     | ref         | venueid                     | venueid     | 5       | v.id |   11 | Using where                               |
+----+-------------+-------+-------------+-----------------------------+-------------+---------+------+------+-------------------------------------------+

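I have indexes on e.id, e.date and e.someid, as well as on v.id, v.country, v.city and v.region.

I know the db setup is a mess, but that is what I have to deal with right now.

Why does the query take so long to come up with a count of approx. 150? The events table has about 1M rows and the venues table about 100K.

Both tables are MyISAM. Any ideas on how to improve this?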

After creating an index like this

create index location on venues (city, region, country)

it takes 20 seconds, and the EXPLAIN output is:

+----+-------------+-------+------+--------------------------------------+--------------+---------+-------------------+------+------------------------------------+
| id | select_type | table | type | possible_keys                        | key          | key_len | ref               | rows | Extra                              |
+----+-------------+-------+------+--------------------------------------+--------------+---------+-------------------+------+------------------------------------+
|  1 | SIMPLE      | v     | ref  | PRIMARY,city,country,region,location | location     | 765     | const,const,const |  410 | Using index condition; Using where |
|  1 | SIMPLE      | e     | ref  | EventVenueID                         | venueid      | 5       | v.id              |   11 | Using where                        |
+----+-------------+-------+------+--------------------------------------+--------------+---------+-------------------+------+------------------------------------+

Answer 1

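You have a left join on venues, but your where clause has conditions on the joined venues row, so only rows that actually joined will be returned. That is a side issue, though. Read on for why you don't need a join at all.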
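To illustrate the first point, here is a minimal sketch using the table and column names from the question. Because the where clause filters on v.country, v.city and v.region, a seminar row without a matching venue (where the v columns would be NULL) is discarded anyway, so the left join can be written as an inner join without changing the result. Leaving out the force index hint is an assumption made for readability, not something the question confirms.

```sql
-- Equivalent to the LEFT JOIN version in the question:
-- the WHERE conditions on v.* reject NULL-extended rows,
-- so only seminars with a matching venue can ever be returned.
select
     e.title, e.date, v.name, v.city, v.region, v.country
from seminar e
     inner join venues v on e.venueid = v.id
where v.country = 'US'
     and v.city = 'New York'
     and v.region = 'NY'
     and e.date > curdate()
     and e.someid != 0;
```

Writing it as an inner join also leaves the optimizer free to pick either table as the driving table.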

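Next, if the city is vancouver, there is no need to also test the country or region.

Finally, if you are trying to find "how many future events are there in Vancouver", you don't need a join, because the venue id is a constant!

Try this: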

select count(*) as event_count
from events
where venueid = (select id from venues where city = 'vancouver')
and startdate > curdate() 
and te_id != 0

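MySQL will use the index on venueid without needing a hint. If it doesn't, run:

analyze table events

which updates the statistics about the distribution of data in the indexed columns. Note that if a large share of your events are in Vancouver, it is more efficient not to use the index (because most rows have to be read anyway).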
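One caveat about the query above: a scalar subquery compared with `=` may return at most one row, so it only works when exactly one venue matches the city. If several venues match, MySQL raises error 1242 ("Subquery returns more than 1 row"). A hedged sketch for that case, reusing the table and column names from the query above and assuming more than one venue can share a city:

```sql
-- IN accepts any number of matching venue ids, unlike "=".
select count(*) as event_count
from events
where venueid in (select id from venues where city = 'vancouver')
and startdate > curdate()
and te_id != 0;

-- Prefix the statement with EXPLAIN to check which index is chosen
-- after refreshing the statistics.
explain
select count(*) as event_count
from events
where venueid in (select id from venues where city = 'vancouver')
and startdate > curdate()
and te_id != 0;
```

On MySQL versions before 5.6, IN subqueries were often executed as dependent subqueries, so an inner join on venues may be the safer form there.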

Answer 2

This will make the first part of the query faster:

INDEX(city, region, country)
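Spelled out as a full statement, this could look like the sketch below. The index name location is borrowed from the question; any other name would do. With equality conditions on all three columns, the optimizer can do a single lookup in the composite index instead of intersecting the separate city and region indexes.

```sql
-- Composite index covering all three equality conditions on venues.
alter table venues add index location (city, region, country);

-- Check that the venues lookup now uses the new index.
explain
select id
from venues
where city = 'New York'
     and region = 'NY'
     and country = 'US';
```

If the index is picked up, the key column of the EXPLAIN output should show location with ref const,const,const, as in the second plan posted in the question.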

Answer 3

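I went a different way, since MySQL does not seem to handle the join efficiently:

Created a new big table containing all the columns I need
So seminars and events are now in one table
Added indexes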
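A rough sketch of what those steps could look like, using the table and column names from the question. The table name seminar_flat, the index name and the choice of indexed columns are hypothetical; the post does not say which ones were actually used.

```sql
-- Step 1: copy the joined data into one wide table (hypothetical name).
create table seminar_flat engine=MyISAM as
select e.id, e.title, e.date, e.someid, e.venueid,
       v.name, v.city, v.region, v.country
from seminar e
     inner join venues v on e.venueid = v.id;

-- Step 2: index the filter columns (assumed choice: equality columns
-- first, the range column last).
alter table seminar_flat
     add index location_date (city, region, country, date);

-- Step 3: the original query, now without a join.
select title, date, name, city, region, country
from seminar_flat
where country = 'US'
     and city = 'New York'
     and region = 'NY'
     and date > curdate()
     and someid != 0;
```

The trade-off is that the copy has to be kept in sync with seminar and venues whenever either of them changes.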

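Now the query is fast. Not sure why...

From 25 seconds we are down to 0.08 seconds.

That is what I wanted.

If anyone still knows the reason, you are welcome to post an answer.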

How can this MySQL query with a join be improved?
