Question

Q1: Why is count(*) so much slower than count(distinct col)?

Q2: Should I always use count(distinct col)?

mysql> select count(id) from source;
+-----------+
| count(id) |
+-----------+
|     22713 |
+-----------+
1 row in set (0.73 sec)

mysql> select count(distinct id) from source;
+--------------------+
| count(distinct id) |
+--------------------+
|              22836 |
+--------------------+
1 row in set (0.08 sec)

Answer 1

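If the column is indexed, COUNT(DISTINCT id) only needs to return the number of entries in the column's index. COUNT(id) has to add up the number of rows each index entry points to, or scan all the rows.

On your second question, see "count(*) and count(column_name), what's the diff?". Most of the time COUNT(*) is the most appropriate choice. In some situations, such as counting rows joined with an outer join, you need COUNT(columnname) instead, because you don't want to count the null rows.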
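To make the outer join case concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, on the assumption that COUNT(*) and COUNT(column) treat NULLs the same way in both. The authors and books tables are made up for illustration and are not from the question.

```python
import sqlite3

# In-memory database; the tables and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO books VALUES (10, 1, 'First'), (11, 1, 'Second');
""")

# Bob has no books, so the LEFT JOIN yields one row for him with b.id NULL.
# COUNT(*) counts that row; COUNT(b.id) skips it because b.id is NULL.
rows = conn.execute("""
    SELECT a.name, COUNT(*) AS star_count, COUNT(b.id) AS book_count
    FROM authors AS a
    LEFT JOIN books AS b ON b.author_id = a.id
    GROUP BY a.id, a.name
    ORDER BY a.id
""").fetchall()

for name, star_count, book_count in rows:
    print(name, star_count, book_count)
```

For the author with no books, COUNT(*) still reports one row, while COUNT(b.id) reports zero, which is usually the number you want in a "books per author" report.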

Answer 2

It may also be faster if the query is cached by MySQL.

Here is my test, with about 1.5 million rows and id as an auto_increment PK

Answer 3

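1) Make sure the query result is not cached

2) It looks like the id column allows NULL and has an index. In that case count(id) gives the number of rows whose id is NOT NULL. If the id column does not allow NULL, use COUNT(*). It gives you the row count without checking each row for id IS NOT NULL.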
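A quick way to see the difference between the three forms is a tiny table with one NULL and one duplicate. This is only a sketch that uses Python's built-in sqlite3 module in place of MySQL, assuming the NULL handling of COUNT is the same in both. The sample rows are invented, and it says nothing about timing or index use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: id is nullable and not a primary key.
conn.execute("CREATE TABLE source (id INTEGER)")
conn.executemany(
    "INSERT INTO source (id) VALUES (?)",
    [(1,), (2,), (2,), (None,)],  # one duplicate value and one NULL
)

# COUNT(*) counts rows, COUNT(id) counts non-NULL ids,
# COUNT(DISTINCT id) counts different non-NULL ids.
total_rows, non_null_ids, distinct_ids = conn.execute(
    "SELECT COUNT(*), COUNT(id), COUNT(DISTINCT id) FROM source"
).fetchone()

print(total_rows, non_null_ids, distinct_ids)
```

On the same unchanged data, COUNT(DISTINCT id) is never larger than COUNT(id), and COUNT(id) is never larger than COUNT(*).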

MySQL COUNT(*) vs COUNT(DISTINCT col)

3 answers: