MySQL: difference between ordering an aggregate in a query and in a subquery

Time: 2013-11-02 00:49:52

Tags: mysql aggregate

I have two queries that order data:

Query 1:

SELECT  * FROM    (
    SELECT      idprovince, COUNT(*) total
    FROM        cities
    JOIN        persons USE INDEX (index_5) USING (idcity)
    WHERE       is_tutor = 'Y'
    GROUP BY    idprovince
) A
ORDER BY total DESC

Query 2:

SELECT      idprovince, COUNT(*) total
FROM        cities
JOIN        persons USE INDEX (index_5) USING (idcity)
WHERE       is_tutor = 'Y'
GROUP BY    idprovince
ORDER BY    total DESC

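Query 1 returns data much faster than Query 2. My question is: what is the big difference between ordering in the query itself and ordering outside a subquery?

Note: my DB version is mysql-5.0.96-x64. The persons table has about 400,000 rows and the cities table about 500.

Update: output of the MySQL EXPLAIN command:

Query 1: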

mysql> EXPLAIN
    -> SELECT  *
    -> FROM    (
    ->     SELECT      idprovince, COUNT(*) total
    ->     FROM        cities
    ->     JOIN        persons USE INDEX (index_5) USING (idcity)
    ->     WHERE       is_tutor = 'Y'
    ->     GROUP BY    idprovince
    -> ) A
    -> ORDER BY total DESC
    -> ;
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                                | rows   | Extra                                        |
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL                               |     34 | Using filesort                               |
|  2 | DERIVED     | persons    | ref    | index_5       | index_5 | 2       |                                    | 163316 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | cities     | eq_ref | PRIMARY       | PRIMARY | 4       | _myproject_lesaja_2.persons.idcity |      1 |                                              |
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
3 rows in set (1.22 sec)

Query 2:

mysql> EXPLAIN
    ->     SELECT      idprovince, COUNT(*) total
    ->     FROM        cities
    ->     JOIN        persons USE INDEX (index_5) USING (idcity)
    ->     WHERE       is_tutor = 'Y'
    ->     GROUP BY    idprovince
    ->     ORDER BY    total DESC;
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
| id | select_type | table   | type  | possible_keys | key         | key_len | ref   | rows   | Extra                                        |
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
|  1 | SIMPLE      | cities  | index | PRIMARY       | FK_cities_1 | 4       | NULL  |      4 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | persons | ref   | index_5       | index_5     | 2       | const | 163316 | Using where                                  |
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
2 rows in set (0.00 sec)

Result of Query 1:

mysql> SELECT  *
    -> FROM    (
    ->     SELECT      idprovince, COUNT(*) total
    ->     FROM        cities
    ->     JOIN        persons USE INDEX (index_5) USING (idcity)
    ->     WHERE       is_tutor = 'Y'
    ->     GROUP BY    idprovince
    -> ) A
    -> ORDER BY total DESC
    -> ;
+------------+-------+
| idprovince | total |
+------------+-------+
|         35 | 15797 |
......................
......................
......................

|         76 |  2091 |
|         65 |  2018 |
+------------+-------+
34 rows in set (0.78 sec)

Result of Query 2:

mysql> SELECT      idprovince, COUNT(*) total
    -> FROM        cities
    -> JOIN        persons USE INDEX (index_5) USING (idcity)
    -> WHERE       is_tutor = 'Y'
    -> GROUP BY    idprovince
    -> ORDER BY    total DESC;
+------------+-------+
| idprovince | total |
+------------+-------+
|         35 | 15797 |
|         33 | 14413 |
|         12 | 13683 |
......................
......................
......................
|         34 |  2135 |
|         76 |  2091 |
|         65 |  2018 |
+------------+-------+
34 rows in set (8 min 25.80 sec)

SHOW PROFILE output:

QUERY 1:

+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000240 |
| Opening tables       | 0.000043 |
| System lock          | 0.000004 |
| Table lock           | 0.000392 |
| optimizing           | 0.000084 |
| statistics           | 0.004455 |
| preparing            | 0.000026 |
| Creating tmp table   | 0.000221 |
| executing            | 0.000002 |
| Copying to tmp table | 0.913722 |
| Sorting result       | 0.000065 |
| Sending data         | 0.000020 |
| removing tmp table   | 0.000145 |
| Sending data         | 0.000008 |
| init                 | 0.000017 |
| optimizing           | 0.000002 |
| statistics           | 0.000038 |
| preparing            | 0.000007 |
| executing            | 0.000001 |
| Sorting result       | 0.000012 |
| Sending data         | 0.000337 |
| end                  | 0.000002 |
| end                  | 0.000002 |
| query end            | 0.000002 |
| freeing items        | 0.000020 |
| closing tables       | 0.000001 |
| removing tmp table   | 0.000074 |
| closing tables       | 0.000003 |
| logging slow query   | 0.000001 |
| cleaning up          | 0.000003 |
+----------------------+----------+

QUERY 2:

+----------------------+------------+
| Status               |   Duration |
+----------------------+------------+
| starting             |   0.000195 |
| Opening tables       |   0.000029 |
| System lock          |   0.000004 |
| Table lock           |   0.000011 |
| init                 |   0.000078 |
| optimizing           |   0.000021 |
| statistics           |   0.003399 |
| preparing            |   0.000025 |
| Creating tmp table   |   0.000259 |
| Sorting for group    |   0.000007 |
| executing            |   0.000001 |
| Copying to tmp table | 506.711308 |
| Sorting result       |   0.000049 |
| Sending data         |   0.000298 |
| end                  |   0.000004 |
| removing tmp table   |   0.000150 |
| end                  |   0.000002 |
| end                  |   0.000002 |
| query end            |   0.000002 |
| freeing items        |   0.000013 |
| closing tables       |   0.000003 |
| logging slow query   |   0.000001 |
| logging slow query   |   0.000042 |
| cleaning up          |   0.000003 |
+----------------------+------------+

CREATE statements:

CREATE TABLE persons (
    idperson INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    is_tutor ENUM('Y','N') NULL DEFAULT 'N',
    name VARCHAR(64) NOT NULL,
    ...
    idcity INT(10) UNSIGNED NOT NULL,
    ...
    PRIMARY KEY (idperson),
    UNIQUE INDEX index_3 (name) USING BTREE,
    UNIQUE INDEX index_4 (email) USING BTREE,
    INDEX index_5 (is_tutor),
    ...
    CONSTRAINT FK_persons_1 FOREIGN KEY (idcity) REFERENCES cities (idcity)
)
ENGINE=InnoDB
AUTO_INCREMENT=414738;

CREATE TABLE cities (
    idcity INT(10) UNSIGNED NOT NULL,
    idprovince INT(10) UNSIGNED NOT NULL,
    city VARCHAR(64) NOT NULL,
    PRIMARY KEY (idcity),
    UNIQUE INDEX index_3 (city),
    INDEX FK_cities_1 (idprovince),
    CONSTRAINT FK_cities_1 FOREIGN KEY (idprovince) REFERENCES provinces (idprovince)
)
ENGINE=InnoDB;

1 answer:

Answer 0 (score: 0)

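I'm really not an expert on this, but looking at the MySQL documentation on ORDER BY optimization, you have not one but two unoptimized uses of ORDER BY in your Query 2:

SELECT      idprovince, COUNT(*) total
FROM        cities
JOIN        persons USE INDEX (index_5) USING (idcity)
WHERE       is_tutor = 'Y'
GROUP BY    idprovince
ORDER BY    total DESC

First:

The key used to fetch the rows (WHERE is_tutor = 'Y') is not the same as the one used in the ORDER BY (ORDER BY total DESC).

Second:

You have different ORDER BY and GROUP BY expressions (GROUP BY idprovince ORDER BY total DESC).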

In both of the cases above, MySQL will not use an index to resolve the ORDER BY, although it can still use an index to find the rows that match the WHERE clause.
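To look at the second case in isolation, here is a minimal sketch that uses only the cities table from the question. Grouping by idprovince can normally be read in index order from FK_cities_1, while ordering by the aggregate alias total cannot be resolved from any index, because the counts exist only after grouping. The comments describe what one would typically expect in the Extra column; they are not captured output, and the actual plan depends on the server version and table statistics.

```sql
-- Grouping only: the optimizer can typically walk the FK_cities_1
-- (idprovince) index, so no extra sort step should be needed.
EXPLAIN
SELECT   idprovince, COUNT(*) AS total
FROM     cities
GROUP BY idprovince;

-- Same grouping, sorted by the aggregate: the sort has to run on the
-- grouped result, so "Using temporary; Using filesort" is the usual outcome.
EXPLAIN
SELECT   idprovince, COUNT(*) AS total
FROM     cities
GROUP BY idprovince
ORDER BY total DESC;
```

With only 34 provinces in the result, that final sort is over a very small row set.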

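On the other hand, your Query 1 follows the optimized form of ORDER BY, even though the ORDER BY is used outside the subquery.

So that may be the reason why Query 2 is so much slower than Query 1.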

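In addition, in both cases the index on idcity is of almost no use in resolving the ORDER BY, because the index is on idcity while the ORDER BY clause uses total, which is an aggregate result.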
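The EXPLAIN output in the question also shows the two plans reading the tables in a different order. The derived table in Query 1 starts from persons and looks up cities by primary key (eq_ref), whereas Query 2 starts from cities and reaches persons through index_5 with ref = const. Both profiles spend nearly all of their time in "Copying to tmp table" and almost none in "Sorting result". The sketch below is one way to test whether join order matters, assuming SELECT STRAIGHT_JOIN behaves on this server as documented (tables are joined in the order listed in FROM). The index name idx_tutor_city is hypothetical and not part of the original schema.

```sql
-- Force persons to be read first, then cities, mirroring the plan of the
-- derived table in Query 1 while keeping ORDER BY in the same SELECT.
SELECT STRAIGHT_JOIN
         idprovince, COUNT(*) AS total
FROM     persons USE INDEX (index_5)
JOIN     cities USING (idcity)
WHERE    is_tutor = 'Y'
GROUP BY idprovince
ORDER BY total DESC;

-- Optional, hypothetical index: lets the read of persons filter on
-- is_tutor and take idcity from the index itself.
ALTER TABLE persons ADD INDEX idx_tutor_city (is_tutor, idcity);
```

If the STRAIGHT_JOIN version runs about as fast as Query 1, join order rather than the position of the ORDER BY is the likelier cause. With the hypothetical index in place, the USE INDEX hint would have to name it, or be dropped, before the optimizer could consider it.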

See the discussion here.