I'm running a query to get the total number of notes each user entered within a date range. This is the query I'm running:
SELECT SQL_NO_CACHE
COUNT(notes.user_id) AS "Number of Notes"
FROM csu_users
JOIN notes ON notes.user_id = csu_users.user_id
WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31"
AND notes.system = 0
GROUP BY csu_users.user_id
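Some notes about my setup:
- The notes table is about 1GB, with roughly 3,000,000 rows.
- I'm using SQL_NO_CACHE to ensure accurate benchmarks.
The output of EXPLAIN SELECT is as follows (I've done my best to format it):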
id  select_type  table      type   possible_keys             key      key_len  ref                           rows  Extra
1   SIMPLE       csu_users  index  user_id                   user_id  5        NULL                          1     Using index
1   SIMPLE       notes      ref    user_id,timestamp,system  user_id  4        REFSYS_DEV.csu_users.user_id  152   Using where
I have the following indexes applied:
notes
- id
- item_id
- user_id
- timestamp (note: this is actually a DATETIME; the name is just misleading, sorry!)
- system
csu_users
- id
- user_id
Any ideas on how I can speed this up? Thanks!
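Answer 0 (score: 1)
If I'm not mistaken, by comparing the timestamp column against string representations you lose the benefit of the index on that column. Try using timestamp values in the comparison instead.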
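A minimal sketch of what this suggestion could look like, assuming MySQL and the table and column names from the question. The `TIMESTAMP '...'` literals make the comparison values explicitly temporal. The half-open upper bound is an assumption about the intent (counting all of 31 January): for a DATETIME column, `BETWEEN "2013-01-01" AND "2013-01-31"` stops at 2013-01-31 00:00:00.

```sql
-- Sketch only: the question's query with explicit temporal literals
-- instead of bare strings.
SELECT SQL_NO_CACHE
    COUNT(notes.user_id) AS `Number of Notes`
FROM csu_users
JOIN notes ON notes.user_id = csu_users.user_id
WHERE notes.timestamp >= TIMESTAMP '2013-01-01 00:00:00'
  AND notes.timestamp <  TIMESTAMP '2013-02-01 00:00:00'  -- assumption: include all of Jan 31
  AND notes.system = 0
GROUP BY csu_users.user_id;
```

Comparing the EXPLAIN output before and after the change would show whether the optimizer's index choice actually differs.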
Answer 1 (score: 1)
Is the csu_users table necessary? If the relationship is 1-1 and the user ID is always present, you can run this query instead:
SELECT COUNT(notes.user_id) AS "Number of Notes"
FROM notes
WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31" AND notes.system = 0
GROUP BY notes.user_id
Even if that is not the case, you can do the join after aggregating and filtering, because all of the conditions are on notes:
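-- Backticks are used for the alias: in MySQL's default SQL mode, a
-- double-quoted name in the outer SELECT would be read as a string literal.
SELECT n.`Number of Notes`
FROM (
    -- Filter and aggregate notes first, so the join sees one row per user
    SELECT notes.user_id, COUNT(notes.user_id) AS `Number of Notes`
    FROM notes
    WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31" AND notes.system = 0
    GROUP BY notes.user_id
) n
JOIN csu_users cu
    ON n.user_id = cu.user_id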
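Since every condition is on notes, one further thing that may be worth trying is a composite index covering the columns the query filters and groups on. The sketch below is an assumption rather than something taken from the question: the index name `idx_notes_system_ts_user` is hypothetical, and the column order (equality column first, then the range column, then the grouped column) is a common rule of thumb, not a measured result for this table.

```sql
-- Hypothetical index name; the columns are the ones the query filters and groups on.
ALTER TABLE notes
    ADD INDEX idx_notes_system_ts_user (`system`, `timestamp`, `user_id`);

-- Re-check the plan afterwards to see whether the new index is chosen.
EXPLAIN
SELECT notes.user_id, COUNT(notes.user_id) AS `Number of Notes`
FROM notes
WHERE notes.timestamp BETWEEN "2013-01-01" AND "2013-01-31"
  AND notes.system = 0
GROUP BY notes.user_id;
```

If the `key` column in the EXPLAIN output still shows `user_id`, the optimizer has judged the existing index cheaper, and the new index can be dropped again.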