If I have a table with duplicate IDs, using GROUP BY id gives me the same result as SELECT DISTINCT(id), right?
So when should I choose one over the other?
Answer 0 (score: 4)
Use GROUP BY if you need aggregate functions such as SUM, MAX, and so on.
If you only need to group on a column, the two are the same (and use the same plan).
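As a rough illustration of the point above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table t, its amount column, and the sample rows are made up for this example, and SQLite merely stands in for whichever engine you use, so it says nothing about query plans on your system.

```python
import sqlite3

# Hypothetical sample data: a table with duplicate ids (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 5), (2, 5), (3, 7)],
)

# Without aggregates, DISTINCT and GROUP BY return the same set of ids.
distinct_ids = sorted(conn.execute("SELECT DISTINCT id FROM t").fetchall())
grouped_ids = sorted(conn.execute("SELECT id FROM t GROUP BY id").fetchall())

# With an aggregate, GROUP BY is the construct that fits.
totals = sorted(
    conn.execute("SELECT id, SUM(amount) FROM t GROUP BY id").fetchall()
)

print(distinct_ids == grouped_ids)  # True
print(totals)                       # [(1, 30), (2, 10), (3, 7)]
```

The results are sorted in Python because neither DISTINCT nor GROUP BY guarantees an output order without ORDER BY.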
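Note that DISTINCT is not a function, so the clause
SELECT DISTINCT(id), othercol
is the same (apart from column order) as
SELECT DISTINCT othercol, (id)
or just
SELECT DISTINCT othercol, id
and may still return duplicate id values if there are records with the same id but different othercol.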
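To make this concrete, the sketch below again uses Python's sqlite3 module with an in-memory database. The table t and its rows are invented for illustration. It shows that the parentheses around id change nothing: DISTINCT applies to the whole selected row, so id 1 still appears twice.

```python
import sqlite3

# Hypothetical sample data: id 1 occurs with two different othercol values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, othercol TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "b"), (2, "c")],
)

# (id) is just a parenthesized expression; DISTINCT covers both columns.
with_parens = sorted(
    conn.execute("SELECT DISTINCT(id), othercol FROM t").fetchall()
)
without_parens = sorted(
    conn.execute("SELECT DISTINCT id, othercol FROM t").fetchall()
)

print(with_parens)                    # [(1, 'a'), (1, 'b'), (2, 'c')]
print(with_parens == without_parens)  # True
```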
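Answer 1 (score: 2)
DISTINCT and GROUP BY usually generate the same query plan, so the two constructs should perform the same. Use GROUP BY when you need to apply aggregate operators to each group. If all you need is to remove duplicates, use DISTINCT. If you are using subqueries, the execution plans may differ, so in that case check the execution plan before deciding which is faster.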
Example of DISTINCT:
SELECT DISTINCT Employee, Rank
FROM Employees
Example of GROUP BY:
SELECT Employee, Rank
FROM Employees
GROUP BY Employee, Rank
Example of GROUP BY with aggregate function:
SELECT Employee, Rank, COUNT(*) EmployeeCount
FROM Employees
GROUP BY Employee, Rank
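Reference: Pinal Dave (http://blog.SQLAuthority.com)
Answer 2 (score: 0)
Just some extra information:
If you are querying an indexed field and using LIMIT, it can be better to use GROUP BY rather than DISTINCT, because it will use the index instead of a temporary table.
See the following link: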
http://dev.mysql.com/doc/refman/5.1/en/internal-temporary-tables.html
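"If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created."
Example: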
MariaDB [my_db]> EXPLAIN SELECT DISTINCT p.data_prefix FROM my_table p;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
MariaDB [my_db]> EXPLAIN SELECT DISTINCT p.data_prefix FROM my_table p limit 0,40;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+-------------------------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                                     |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+-------------------------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by; Using temporary |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+-------------------------------------------+
1 row in set (0.00 sec)
MariaDB [my_db]> EXPLAIN SELECT p.data_prefix FROM my_table p group by p.data_prefix;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
MariaDB [my_db]> EXPLAIN SELECT p.data_prefix FROM my_table p group by p.data_prefix limit 0,40;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
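Answer 3 (score: 0)
An example where you would prefer group by over distinct: consider a scenario in which a window function (not necessarily row_number()) needs to be applied to a distinct result set. Because of the order of operations, with distinct you have to use a subquery:
select id, row_number() over (order by id) as rn
from (select distinct id from my_table) t;
With group by, no subquery is needed:
select id, row_number() over (order by id) as rn
from my_table
group by id;
This works because window functions are applied after group by but before distinct.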
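The ordering described above can be checked with a small sketch. It uses Python's sqlite3 module, which is an assumption of this example rather than the engine used in the thread, and it needs an SQLite build with window function support (3.25 or later). The sample rows in my_table are made up.

```python
import sqlite3

# Assumes the bundled SQLite is 3.25+ (window functions); rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)", [(10,), (10,), (20,), (30,), (30,)]
)

# DISTINCT runs after the window function, so every (id, rn) pair is
# already unique and nothing is removed.
naive = conn.execute(
    "SELECT DISTINCT id, row_number() OVER (ORDER BY id) AS rn "
    "FROM my_table ORDER BY rn"
).fetchall()

# Deduplicating in a subquery first gives one numbered row per id.
via_subquery = conn.execute(
    "SELECT id, row_number() OVER (ORDER BY id) AS rn "
    "FROM (SELECT DISTINCT id FROM my_table) t ORDER BY rn"
).fetchall()

# GROUP BY runs before the window function, so no subquery is needed.
via_group_by = conn.execute(
    "SELECT id, row_number() OVER (ORDER BY id) AS rn "
    "FROM my_table GROUP BY id ORDER BY rn"
).fetchall()

print(len(naive))     # 5
print(via_subquery)   # [(10, 1), (20, 2), (30, 3)]
print(via_group_by)   # [(10, 1), (20, 2), (30, 3)]
```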