GROUP BY x vs. DISTINCT(x)

Time: 2014-06-20 21:28:14

Tags: mysql group-by distinct

If I have a table with duplicate IDs, then using GROUP BY id gives me the same result as using SELECT DISTINCT(id), right?

So when should I choose one over the other?

4 answers:

Answer 0: (score: 4)

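If you need aggregate functions such as SUM, MAX, etc., you should use GROUP BY.

If you only need to group on the columns, the two are the same (and use the same plan).

Note that DISTINCT is not a function, so this clause: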

SELECT DISTINCT(id), othercol

is the same (except for column order) as
SELECT DISTINCT othercol, (id)

or simply

SELECT DISTINCT othercol, id
and may still return duplicate id values if there are records with the same id but a different othercol.
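The point about DISTINCT not being a function can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the table t and its rows are hypothetical.

```python
import sqlite3

# Hypothetical table t(id, othercol); SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, othercol TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "b"), (2, "a")],
)

# (id) is just a parenthesized expression, so both queries deduplicate
# on the whole (id, othercol) row, not on id alone.
with_parens = sorted(conn.execute("SELECT DISTINCT(id), othercol FROM t"))
without_parens = sorted(conn.execute("SELECT DISTINCT id, othercol FROM t"))

# Deduplicating on id alone requires selecting only id.
ids_only = sorted(conn.execute("SELECT DISTINCT id FROM t"))

print(with_parens)     # id 1 appears twice, once per distinct othercol
print(without_parens)  # same rows as above
print(ids_only)        # one row per id
```

With this data, id 1 still shows up twice in the first two results because the rows (1, a) and (1, b) differ in othercol.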

Answer 1: (score: 2)

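DISTINCT and GROUP BY usually generate the same query plan, so the two query constructs should perform the same. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates, use DISTINCT. If you are using subqueries, the execution plan for that query can differ, so in that case you need to check the execution plan before deciding which is faster.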

Example of DISTINCT:
 SELECT DISTINCT Employee, Rank
 FROM Employees

Example of GROUP BY:
 SELECT Employee, Rank
 FROM Employees
 GROUP BY Employee, Rank

Example of GROUP BY with aggregate function:
 SELECT Employee, Rank, COUNT(*) EmployeeCount
 FROM Employees
 GROUP BY Employee, Rank 

Reference: Pinal Dave (http://blog.SQLAuthority.com)
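As a quick sanity check of the three query shapes in this answer, here is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption standing in for MySQL, and the sample rows are made up. It only compares result sets; it says nothing about the plans a MySQL server would choose.

```python
import sqlite3

# Hypothetical sample data for the Employees table used in the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Employee TEXT, Rank INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("Ann", 1), ("Ann", 1), ("Bob", 2), ("Bob", 3)],
)

# Duplicate removal with DISTINCT.
distinct_rows = sorted(
    conn.execute("SELECT DISTINCT Employee, Rank FROM Employees")
)

# The same duplicate removal expressed with GROUP BY.
grouped_rows = sorted(
    conn.execute("SELECT Employee, Rank FROM Employees GROUP BY Employee, Rank")
)

# GROUP BY additionally allows an aggregate per group.
counted_rows = sorted(
    conn.execute(
        "SELECT Employee, Rank, COUNT(*) AS EmployeeCount "
        "FROM Employees GROUP BY Employee, Rank"
    )
)

print(distinct_rows)
print(grouped_rows)
print(counted_rows)
```

The first two queries return the same rows; only the GROUP BY form can carry the per-group count.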

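Answer 2: (score: 0)

Just some extra information:

If you are querying an indexed field and using LIMIT, it is better to use GROUP BY than DISTINCT, because it will use the index rather than a temporary table.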

Example:

MariaDB [my_db]> EXPLAIN SELECT DISTINCT p.data_prefix FROM my_table p;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

MariaDB [my_db]> EXPLAIN SELECT DISTINCT p.data_prefix FROM my_table p limit 0,40;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+-------------------------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                                     |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+-------------------------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by; Using temporary |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+-------------------------------------------+
1 row in set (0.00 sec)

MariaDB [my_db]> EXPLAIN SELECT p.data_prefix FROM my_table p group by p.data_prefix;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

MariaDB [my_db]> EXPLAIN SELECT p.data_prefix FROM my_table p group by p.data_prefix limit 0,40;
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | p     | range | NULL          | data_prefix | 33      | NULL |   18 | Using index for group-by |
+------+-------------+-------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

MariaDB [my_db]>

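Answer 3: (score: 0)

Here is an example where you would prefer group by over distinct. Consider a scenario where a window function (not necessarily row_number()) needs to be applied to a distinct result set. To respect the order of operations, you have to use distinct in a subquery: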

select id, row_number() over (order by id) as rn
from (select distinct id from my_table) t;

The same can be achieved with group by, without the need for a subquery:

select id, row_number() over (order by id) as rn 
from my_table
group by id;

This works because window functions are applied after group by but before distinct.
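The evaluation order described in this answer can be observed with a small sketch. It assumes Python's sqlite3 module linked against SQLite 3.25 or later (the first version with window functions) as a stand-in for MySQL 8 or MariaDB, and the rows in my_table are hypothetical.

```python
import sqlite3

# Hypothetical data: id 1 is duplicated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?)", [(1,), (1,), (2,)])

# DISTINCT in the same SELECT runs after row_number(), so every row already
# carries a unique rn and nothing is collapsed.
naive = sorted(conn.execute(
    "SELECT DISTINCT id, row_number() OVER (ORDER BY id) AS rn FROM my_table"
))

# DISTINCT in a subquery deduplicates first, then numbers the rows.
subquery = sorted(conn.execute(
    "SELECT id, row_number() OVER (ORDER BY id) AS rn "
    "FROM (SELECT DISTINCT id FROM my_table) t"
))

# GROUP BY runs before the window function, so no subquery is needed.
grouped = sorted(conn.execute(
    "SELECT id, row_number() OVER (ORDER BY id) AS rn "
    "FROM my_table GROUP BY id"
))

print(naive)
print(subquery)
print(grouped)
```

The first query keeps all three rows, while the subquery and group by forms both number the two distinct ids.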