Question

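I have a query that selects from the result of a subquery. The two queries below look almost identical, yet the second one (in this example) runs much slower: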

SELECT
   user.id
  ,user.first_name
  -- user.*
  FROM user
  WHERE
    user.id IN (SELECT ref_id 
                  FROM education 
                 WHERE ref_type='user' 
                   AND education.institute_id='58' 
                   AND education.institute_type='1'
                );

This query takes 1.2 s. EXPLAIN for this query gives:

id  select_type         table      type            possible_keys                                           key         key_len  ref   rows    Extra
1   PRIMARY             user       index                                                                   first_name  152            141192  Using where; Using index
2   DEPENDENT SUBQUERY  education  index_subquery  ref_type,ref_id,institute_id,institute_type,ref_type_2  ref_id      4        func  1       Using where

The second query:

SELECT
  -- user.id
  -- user.first_name
  user.*
  FROM user
  WHERE
    user.id IN (SELECT ref_id 
                  FROM education 
                 WHERE ref_type='user' 
                   AND education.institute_id='58' 
                   AND education.institute_type='1'
                );

This one runs for 45 s, and its EXPLAIN output is:

id  select_type         table      type            possible_keys                                           key         key_len  ref   rows    Extra
1   PRIMARY             user       ALL                                                                                                141192  Using where
2   DEPENDENT SUBQUERY  education  index_subquery  ref_type,ref_id,institute_id,institute_type,ref_type_2  ref_id      4        func  1       Using where

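Why is it slower when I am filtering only on indexed fields? Why do both queries scan the full length of the user table? Any ideas on how to improve this?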

Thanks.

Answer 1

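I'm not sure exactly why it chooses to use an index when you select just two columns but not when you select all columns, but it's best to select only the columns you need anyway. Also, try a JOIN instead of a subquery: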

SELECT
    user.id,
    user.first_name
FROM user
JOIN education
ON user.id = education.ref_id
AND education.ref_type='user'
AND education.institute_id='58'
AND education.institute_type='1';
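
One caveat with swapping IN for JOIN: if a user can have more than one matching education row, the JOIN returns that user once per match, while the IN version returns each user once. A minimal sketch, assuming that can happen in your data, adds DISTINCT and prefixes the statement with EXPLAIN so you can check which table the optimizer starts from before running it for real:

```sql
-- Sketch only: DISTINCT is needed only if one user can match several education rows.
-- EXPLAIN shows the plan without executing the query; remove it to get the rows.
EXPLAIN
SELECT DISTINCT
    user.id,
    user.first_name
FROM user
JOIN education
  ON user.id = education.ref_id
 AND education.ref_type = 'user'
 AND education.institute_id = '58'
 AND education.institute_type = '1';
```

If the plan lists education first and then looks up user by key, the rewrite is doing what it was meant to do.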

Answer 2

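Several times I have put the results of a subquery into a temporary table and used an inner join against it to replace "WHERE foo IN (subquery)", and it improved performance dramatically. (Like, a 6.5-minute query became a sub-second query.)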

Er, which is what Mark Byers just said.
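
Applied to the query in the question, the temporary-table approach could look roughly like this. It is a sketch, not something I have run against your schema; the table name tmp_edu_users is made up for illustration:

```sql
-- Step 1: materialize the matching ids once.
-- tmp_edu_users is a hypothetical name; DISTINCT keeps one row per user id.
CREATE TEMPORARY TABLE tmp_edu_users AS
SELECT DISTINCT ref_id
  FROM education
 WHERE ref_type = 'user'
   AND institute_id = '58'
   AND institute_type = '1';

-- Step 2: join against the small temporary table instead of using IN (subquery).
SELECT user.*
  FROM user
 INNER JOIN tmp_edu_users
    ON user.id = tmp_edu_users.ref_id;

-- Step 3: clean up (the table would also disappear when the session ends).
DROP TEMPORARY TABLE tmp_edu_users;
```

The point is that the subquery is evaluated once up front, rather than once per row of user as the DEPENDENT SUBQUERY line in the EXPLAIN output suggests.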

Answer 3

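Here is what I think is happening:

The query planner turns the query into an inner join, which leaves the database free to start from either table when filtering out the result.

When you select only a few fields from the user table, the results from both tables are small, so the database can choose which table to filter by, based on which one it can filter most efficiently with the available indexes.

When you fetch all the data from the user table, you force it to use the education table to filter the user table, because going the other way around would make the intermediate result too large. There is no index suitable for that match, so you get a table scan, which slows the query down.

(Note that some of the terminology may be colored by SQL Server, which is what I use most.)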
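
If the join order really is the problem, one way to test that is to take the choice away from the planner. The following is a sketch under two assumptions: that user.id is the primary key (or at least indexed), and that the alias matched is just a name I picked. STRAIGHT_JOIN is MySQL-specific and makes the tables join in the order they are listed in the FROM clause:

```sql
-- Sketch: start from the filtered education rows, then look up each user by id.
-- Assumes user.id is indexed; "matched" is a hypothetical alias.
SELECT STRAIGHT_JOIN user.*
  FROM (SELECT DISTINCT ref_id
          FROM education
         WHERE ref_type = 'user'
           AND institute_id = '58'
           AND institute_type = '1') AS matched
  JOIN user
    ON user.id = matched.ref_id;
```

If this version is fast, that supports the idea that the slow plan comes from scanning user first and probing education for every row.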

MySQL subquery strangely slow
