Question

Here is what I tried:

mysql> select a.id from iask a
    ->                                  join ianswer b on a.id=b.iaskid
    ->                                  join users c on c.id=a.uid
    ->                          where (c.last_check is null or a.created>c.last_check) and c.id=1
    ->                          group by a.id;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.01 sec)

mysql> select distinct a.id from iask a
    ->                                  join ianswer b on a.id=b.iaskid
    ->                                  join users c on c.id=a.uid
    ->                          where (c.last_check is null or a.created>c.last_check) and c.id=1;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql> explain extended select distinct a.id from iask a
    ->                          join ianswer b on a.id=b.iaskid
    ->                          join users c on c.id=a.uid
    ->                  where (c.last_check is null or a.created>c.last_check) and c.id=1;
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
| id | select_type | table | type  | possible_keys             | key              | key_len | ref      | rows | filtered | Extra                 |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
|  1 | SIMPLE      | c     | const | PRIMARY,i_users_lastcheck | PRIMARY          | 4       | const    |    1 |   100.00 | Using temporary       |
|  1 | SIMPLE      | a     | ref   | PRIMARY,i_iask_uid        | i_iask_uid       | 4       | const    |    1 |   100.00 | Using where           |
|  1 | SIMPLE      | b     | ref   | i_ianswer_iaskid          | i_ianswer_iaskid | 4       | bbs.a.id |    7 |   100.00 | Using index; Distinct |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)

mysql> explain extended select a.id from iask a
    ->                          join ianswer b on a.id=b.iaskid
    ->                          join users c on c.id=a.uid
    ->                  where (c.last_check is null or a.created>c.last_check) and c.id=1
    ->                  group by a.id;
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
| id | select_type | table | type  | possible_keys             | key              | key_len | ref      | rows | filtered | Extra                           |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
|  1 | SIMPLE      | c     | const | PRIMARY,i_users_lastcheck | PRIMARY          | 4       | const    |    1 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | a     | ref   | PRIMARY,i_iask_uid        | i_iask_uid       | 4       | const    |    1 |   100.00 | Using where                     |
|  1 | SIMPLE      | b     | ref   | i_ianswer_iaskid          | i_ianswer_iaskid | 4       | bbs.a.id |    7 |   100.00 | Using index                     |
+----+-------------+-------+-------+---------------------------+------------------+---------+----------+------+----------+---------------------------------+
3 rows in set, 1 warning (0.00 sec)

Answer 1

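They are two closely related but not identical concepts. For most queries, including the ones you posted, they return the same result, and I would expect MySQL to apply the same optimizations to both (you can run EXPLAIN EXTENDED if you want to see whether there is any difference; see also the MySQL documentation on optimizing DISTINCT and GROUP BY).

So in summary: performance is most likely the same. If the two ever produce different result sets, use the one that gives you the correct result set, at which point the performance question no longer matters.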
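To make "closely related but not identical" concrete, here is a minimal sketch. It assumes a guessed schema (the column types and the sample rows are not from the question) and uses Python's built-in sqlite3 as a stand-in for MySQL, so it only illustrates result sets, not MySQL's optimizer behaviour.

```python
import sqlite3

# Hypothetical schema and data: only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY, last_check TEXT);
    CREATE TABLE iask    (id INTEGER PRIMARY KEY, uid INTEGER, created TEXT);
    CREATE TABLE ianswer (id INTEGER PRIMARY KEY, iaskid INTEGER);
    INSERT INTO users   VALUES (1, NULL);
    INSERT INTO iask    VALUES (1, 1, '2010-01-01'), (2, 1, '2010-01-02');
    INSERT INTO ianswer VALUES (1, 1), (2, 1), (3, 1);  -- question 2 has no answers
""")

# Shared FROM/WHERE part of the queries in the question.
JOIN = """
    FROM iask a
    JOIN ianswer b ON a.id = b.iaskid
    JOIN users c ON c.id = a.uid
    WHERE (c.last_check IS NULL OR a.created > c.last_check) AND c.id = 1
"""

distinct_ids = sorted(conn.execute("SELECT DISTINCT a.id" + JOIN).fetchall())
grouped_ids = sorted(conn.execute("SELECT a.id" + JOIN + " GROUP BY a.id").fetchall())

# GROUP BY can additionally aggregate per group, which DISTINCT cannot express.
answer_counts = conn.execute(
    "SELECT a.id, COUNT(*) AS answers" + JOIN + " GROUP BY a.id"
).fetchall()

print(distinct_ids, grouped_ids, answer_counts)
```

When only the deduplicated ids are needed, both forms return the same rows; the difference shows up once you want something per group, such as the number of answers.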

Answer 2

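As Alex mentioned, it is hard to comment without knowing your table structure.

However, DISTINCT and GROUP BY serve very different purposes. (A good SQL reference should explain the difference between the two.) Trying to determine which one is more efficient is meaningless.

If I am reading your query correctly, you are trying to return the list of new "question" IDs created since user 1 last did whatever it is that updates the value of last_check.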

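If I had created the tables, a.id would be the primary key of the iask table, so every value of a.id would be unique. In that case neither DISTINCT nor GROUP BY has any effect on your query, because

there is only one row for each value of a.id (no need for DISTINCT); and
there is nothing for GROUP BY to group, since no two rows can have the same a.id value (no need for GROUP BY).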

In that case the only query that makes sense is simply

select a.id
from   iask a
       join ianswer b on a.id = b.iaskid
       join users c on c.id = a.uid
where  (c.last_check is null or a.created > c.last_check)
       and c.id = 1;
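One thing worth checking before dropping the deduplication: a.id is unique within iask, but the join to ianswer can emit one row per answer (the EXPLAIN in the question estimates 7 rows for b). A minimal sketch, assuming a guessed schema and sample rows, and using Python's sqlite3 as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical schema and data: only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY, last_check TEXT);
    CREATE TABLE iask    (id INTEGER PRIMARY KEY, uid INTEGER, created TEXT);
    CREATE TABLE ianswer (id INTEGER PRIMARY KEY, iaskid INTEGER);
    INSERT INTO users   VALUES (1, NULL);
    INSERT INTO iask    VALUES (1, 1, '2010-01-01');
    INSERT INTO ianswer VALUES (1, 1), (2, 1), (3, 1);  -- three answers to question 1
""")

JOIN = """
    FROM iask a
    JOIN ianswer b ON a.id = b.iaskid
    JOIN users c ON c.id = a.uid
    WHERE (c.last_check IS NULL OR a.created > c.last_check) AND c.id = 1
"""

# Without deduplication the id is repeated once per matching answer row.
plain_ids = conn.execute("SELECT a.id" + JOIN).fetchall()
distinct_ids = conn.execute("SELECT DISTINCT a.id" + JOIN).fetchall()

print(plain_ids)     # one row per answer
print(distinct_ids)  # one row per question
```

So the plain query is equivalent only when each question has at most one row in ianswer; otherwise DISTINCT or GROUP BY still changes the result.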

Answer 3

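You posted the EXPLAIN for SELECT DISTINCT, but not the one for SELECT ... GROUP BY (and without seeing exactly how your tables and indexes are set up, we can hardly run it for you ;-). I would expect the two plans to be identical, meaning there is no difference in performance; if they do differ (and DB optimizers are quirky, MySQL's especially, so it is best to check rather than guess ;-) they should show clearly which approach gives better performance.

PS: you should run EXPLAIN against tables (possibly filled with fake data) that reasonably represent real life. The optimizer may decide on a full table scan rather than an index because it sees too little variety in the values of a toy index, whereas it would use that index on real data with more variety.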
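A rough sketch of what "fake but realistic" data could look like. The schema, row counts, and value distributions are all assumptions, the index names are copied from the EXPLAIN output in the question, and Python's sqlite3 stands in for MySQL. The timings it prints say nothing about MySQL itself; the point is only the workflow of populating the tables before comparing the two queries. On MySQL you would load similar data and then run EXPLAIN.

```python
import random
import sqlite3
import time

random.seed(0)  # reproducible fake data
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY, last_check INTEGER);
    CREATE TABLE iask    (id INTEGER PRIMARY KEY, uid INTEGER, created INTEGER);
    CREATE TABLE ianswer (id INTEGER PRIMARY KEY, iaskid INTEGER);
    CREATE INDEX i_users_lastcheck ON users (last_check);
    CREATE INDEX i_iask_uid        ON iask (uid);
    CREATE INDEX i_ianswer_iaskid  ON ianswer (iaskid);
""")

# Assumed sizes; scale these up to resemble production volumes.
N_USERS, N_QUESTIONS, N_ANSWERS = 200, 5_000, 30_000
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(u, random.choice([None, random.randint(0, 1000)])) for u in range(1, N_USERS + 1)],
)
conn.executemany(
    "INSERT INTO iask VALUES (?, ?, ?)",
    [(q, random.randint(1, N_USERS), random.randint(0, 1000)) for q in range(1, N_QUESTIONS + 1)],
)
conn.executemany(
    "INSERT INTO ianswer VALUES (?, ?)",
    [(a, random.randint(1, N_QUESTIONS)) for a in range(1, N_ANSWERS + 1)],
)

JOIN = """
    FROM iask a
    JOIN ianswer b ON a.id = b.iaskid
    JOIN users c ON c.id = a.uid
    WHERE (c.last_check IS NULL OR a.created > c.last_check) AND c.id = 1
"""

def timed(sql):
    """Run a query and return its sorted rows plus the elapsed wall-clock time."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return sorted(rows), time.perf_counter() - start

distinct_rows, distinct_secs = timed("SELECT DISTINCT a.id" + JOIN)
grouped_rows, grouped_secs = timed("SELECT a.id" + JOIN + " GROUP BY a.id")

print(f"DISTINCT: {len(distinct_rows)} rows in {distinct_secs:.4f}s")
print(f"GROUP BY: {len(grouped_rows)} rows in {grouped_secs:.4f}s")
```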

DISTINCT or GROUP BY: which is more efficient in MySQL?
