Question

I'm running a query to get the total number of notes each user entered within a date range. This is the query I'm running:

SELECT SQL_NO_CACHE 
    COUNT(notes.user_id) AS "Number of Notes"

FROM csu_users

JOIN notes      ON notes.user_id    = csu_users.user_id

WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31"
AND notes.system = 0

GROUP BY csu_users.user_id

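Some notes about my setup:

The query takes 30 to 35 seconds to run, which is too long for our system
This is an InnoDB table
The notes table is about 1GB, with about 3,000,000 rows
I'm deliberately using SQL_NO_CACHE to ensure accurate benchmarks

The output of EXPLAIN SELECT is as follows (I've done my best to format it):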

id  select_type table       type    possible_keys             key       key_len ref                           rows  Extra
1   SIMPLE      csu_users   index   user_id                   user_id   5       NULL                          1     Using index
1   SIMPLE      notes       ref     user_id,timestamp,system  user_id   4       REFSYS_DEV.csu_users.user_id  152   Using where

I have the following indexes applied:

notes

Primary key - id
item_id
user_id
timestamp (note: this is actually a DATETIME. The name is just misleading, sorry!)
system

csu_users

Primary key - id
user_id

Does anyone have any ideas on how I can speed this up? Thanks!

Answer 1

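If I'm not mistaken, by converting the timestamp to its string representation you lose all the benefit of the index on that column. Try using timestamp values in the comparison.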
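A minimal sketch of what that could look like, assuming the suggestion means comparing against explicit DATETIME values rather than bare date strings. The half-open range is my own addition: because the column is a DATETIME, the bound "2013-01-31" is read as 2013-01-31 00:00:00, so the original BETWEEN leaves out notes entered later that day. Whether this changes the plan at all is something to confirm with EXPLAIN.

```sql
-- Sketch only: same query as in the question, with explicit DATETIME bounds.
-- The upper bound is exclusive so that all of 2013-01-31 is counted.
SELECT SQL_NO_CACHE
    COUNT(notes.user_id) AS "Number of Notes"
FROM csu_users
JOIN notes ON notes.user_id = csu_users.user_id
WHERE notes.timestamp >= CAST('2013-01-01 00:00:00' AS DATETIME)
  AND notes.timestamp <  CAST('2013-02-01 00:00:00' AS DATETIME)
  AND notes.system = 0
GROUP BY csu_users.user_id;
```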

Answer 2

Is the csu_users table necessary? If the relationship is 1-1 and the user ID is always present, you can run this query instead:

SELECT COUNT(notes.user_id) AS "Number of Notes"
FROM notes 
WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31" AND notes.system = 0
GROUP BY notes.user_id

Even if that is not the case, you can do the join after aggregating and filtering, since all of the conditions are on notes:

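-- Backticks on the alias: in MySQL's default SQL mode, a double-quoted
-- name in the outer select list is read as a string literal, not a column.
SELECT n.`Number of Notes`
FROM (
    SELECT notes.user_id, COUNT(notes.user_id) AS `Number of Notes`
    FROM notes
    WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31"
      AND notes.system = 0
    GROUP BY notes.user_id
) n
JOIN csu_users cu ON n.user_id = cu.user_id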
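To check whether aggregating on notes first actually helps, you could run the inner query through EXPLAIN and compare it with the plan in the question. The sketch below also adds a composite index. That index is an assumption of mine and not something from the thread: the name idx_system_timestamp_user and the alias note_count are made up for illustration, and the column order (equality column, then range column, then grouped column) is a common rule of thumb rather than a measured result on this data.

```sql
-- Hypothetical composite index; name and column order are assumptions.
-- `system` and `timestamp` are quoted because they are keywords in some MySQL versions.
ALTER TABLE notes
    ADD INDEX idx_system_timestamp_user (`system`, `timestamp`, user_id);

-- Inspect the plan of the aggregate-only query from Answer 2.
EXPLAIN
SELECT notes.user_id, COUNT(notes.user_id) AS note_count
FROM notes
WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31"
  AND notes.system = 0
GROUP BY notes.user_id;
```

If the key column of the EXPLAIN output shows the new index and Extra includes "Using index", the query is being answered from the index alone. If not, the index is probably not worth keeping.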

Optimizing a MySQL query that takes about 30 seconds to run
