Question

I don't understand MySQL's EXPLAIN output for the following two queries.

In the first query, MySQL has to start by selecting 1238264 records:

explain select
    count(distinct utc.id)
from
    user_to_company utc
inner join
    users u
        on utc.user_id=u.id
where
    u.is_removed=false
order by
    utc.user_id asc limit 20;

+----+-------------+--------+------+----------------------------+---------+---------+---------------------------------+---------+-------------+
| id | select_type | table  | type | possible_keys              | key     | key_len | ref                             | rows    | Extra       |
+----+-------------+--------+------+----------------------------+---------+---------+---------------------------------+---------+-------------+
|  1 | SIMPLE      | u      | ALL  | PRIMARY                    | NULL    | NULL    | NULL                            | 1238264 | Using where |
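|  1 | SIMPLE      | utc    | ref  | user_id,FKF513E0271C2D1677 | user_id | 8       | u.id                            |       1 | Using index |
+----+-------------+--------+------+----------------------------+---------+---------+---------------------------------+---------+-------------+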

In the second query, a GROUP BY is added, and now MySQL only has to select 20 records:

explain select
    count(distinct utc.id)
from
    user_to_company utc
inner join
    users u
        on utc.user_id=u.id
where
    u.is_removed=false
group by
    utc.user_id
order by
    utc.user_id asc limit 20;

+----+-------------+--------+--------+----------------------------+--------------------+---------+-------------------------+------+-------------+
| id | select_type | table  | type   | possible_keys              | key                | key_len | ref                     | rows | Extra       |
+----+-------------+--------+--------+----------------------------+--------------------+---------+-------------------------+------+-------------+
|  1 | SIMPLE      | utc    | index  | user_id,FKF513E0271C2D1677 | FKF513E0271C2D1677 | 8       | NULL                    |   20 | Using index |
|  1 | SIMPLE      | u      | eq_ref | PRIMARY                    | PRIMARY            | 8       | utc.user_id             |    1 | Using where |
+----+-------------+--------+--------+----------------------------+--------------------+---------+-------------------------+------+-------------+

For reference, the users table has 1333194 records and the user_to_company table has 1327768 records.

How does adding GROUP BY let MySQL select only 20 records on the first pass?

Answer 1

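The first query has to read all the data to find every value of utc.id. It returns only one row, which is a summary of the whole table. So it has to generate all the data.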
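To make the "one summary row" point concrete, here is a small sketch that runs the first query against a toy data set. It uses SQLite through Python's standard library as a stand-in, not MySQL, so it only illustrates the shape of the result, not MySQL's execution plan. The table sizes (100 users, two rows each), the index name idx_utc_user_id, and the use of 0 in place of false are assumptions made for the demo.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumed layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, is_removed INTEGER NOT NULL);
    CREATE TABLE user_to_company (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id)
    );
    CREATE INDEX idx_utc_user_id ON user_to_company (user_id);
""")

# Toy data: 100 active users, each linked to two companies (200 rows in total).
conn.executemany(
    "INSERT INTO users (id, is_removed) VALUES (?, 0)",
    [(i,) for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO user_to_company (user_id) VALUES (?)",
    [(i,) for i in range(1, 101) for _ in range(2)],
)

BASE_QUERY = """
    SELECT count(DISTINCT utc.id)
    FROM user_to_company utc
    INNER JOIN users u ON utc.user_id = u.id
    WHERE u.is_removed = 0
    ORDER BY utc.user_id ASC
"""

# The first query from the question: an aggregate with no GROUP BY.
ungrouped_rows = conn.execute(BASE_QUERY + " LIMIT 20").fetchall()

# The same query without LIMIT, for comparison.
rows_without_limit = conn.execute(BASE_QUERY).fetchall()

print(len(ungrouped_rows), ungrouped_rows == rows_without_limit)
```

The LIMIT is applied to the single aggregated row, after every joined row has already been counted, so it cannot shorten the scan.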

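The second query produces a separate total for each utc.user_id. You have a LIMIT clause and an index on utc.user_id. Evidently, MySQL is smart enough to recognize that it can go to the index to get the first 20 values of utc.user_id. It uses those to generate the counts.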
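The idea of walking the index in order and stopping after 20 groups can be sketched in plain Python. This is a conceptual model only, not MySQL's actual implementation: the function name count_first_groups, the (user_id, utc_id) entry layout, and the toy data are all hypothetical.

```python
from itertools import groupby


def count_first_groups(index_entries, active_user_ids, limit):
    """Scan (user_id, utc_id) pairs in user_id order and stop after `limit` groups.

    Returns the per-user distinct counts and the number of entries read.
    """
    rows_read = 0

    def counting(entries):
        # Wrap the scan so we can see how many index entries were touched.
        nonlocal rows_read
        for entry in entries:
            rows_read += 1
            yield entry

    results = []
    for user_id, group in groupby(counting(index_entries), key=lambda e: e[0]):
        # Mirrors the lookup into users with the is_removed filter.
        if user_id not in active_user_ids:
            continue
        distinct_ids = {utc_id for _, utc_id in group}
        results.append((user_id, len(distinct_ids)))
        if len(results) == limit:
            break  # LIMIT satisfied: the rest of the index is never read
    return results, rows_read


# Toy index: 100 users with two entries each, already sorted by user_id.
entries = [(user_id, user_id * 10 + k) for user_id in range(1, 101) for k in range(2)]
active = set(range(1, 101))

results, rows_read = count_first_groups(entries, active, limit=20)
print(len(results), rows_read)
```

Because the groups arrive already sorted by user_id, the ORDER BY costs nothing extra, and the scan can end as soon as the 20th group is complete. Without GROUP BY there is no group boundary to stop at, which is why the first query cannot take this shortcut.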

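I am surprised that MySQL is smart enough to do this (although the logic is well documented). Still, it makes perfect sense that the second query can be optimized this way and the first cannot.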
