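I have a dataset made up of per-minute records. My goal is to return the minute-by-minute records and add a calculated column holding the sum of a field over the preceding 24 hours, computed relative to each record.
My query is as follows: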
SELECT main.recorded_at AS x, (SELECT SUM(precipitation) FROM data AS sub WHERE sub.host = main.host AND sub.recorded_at BETWEEN SUBTIME(main.recorded_at, '24:00:00') AND main.recorded_at) AS y FROM data AS main WHERE host = 'xxxx' ORDER BY x ASC;
Is there a more efficient way to write this query? So far I have tried LEFT JOINs and various GROUP BYs.
When I run EXPLAIN on the query, I get the following:
1 PRIMARY main ref host host 767 const 4038 100.00 Using where; Using filesort
2 DEPENDENT SUBQUERY sub ref host,recorded_at host 767 const 4038 100.00 Using where
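In total, the query takes about 200 seconds to run over 8,000 records, and it keeps getting slower. My goal is to get the 24-hour precipitation total for every result in under two seconds.
Maybe I am going about this the wrong way? I am open to suggestions for other approaches that yield the same result. :)
Thanks! ~Mike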
Answer 0 (score: 0)
Assuming I understand your question correctly, it looks like you can use SUM with CASE to achieve the same result without a correlated subquery.
SELECT recorded_at AS x,
SUM(CASE WHEN recorded_at BETWEEN SUBTIME(recorded_at, '24:00:00') AND recorded_at
THEN precipitation END) As y
FROM data
WHERE host = 'xxxx'
GROUP BY recorded_at
ORDER BY x ASC;
Though I'm not sure it will perform any better, I think an OUTER JOIN with GROUP BY would also solve your problem:
SELECT main.recorded_at AS x,
SUM(sub.precipitation) As y
FROM data main LEFT JOIN data sub ON
main.host = sub.host AND
sub.recorded_at BETWEEN SUBTIME(main.recorded_at, '24:00:00') AND main.recorded_at
WHERE main.host = 'xxxx'
GROUP BY main.recorded_at
ORDER BY x ASC;
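Since the question is open to other approaches, here is a minimal sketch of computing the rolling total in application code rather than in SQL. It assumes the rows for one host have already been fetched in ascending recorded_at order (for example with SELECT recorded_at, precipitation FROM data WHERE host = 'xxxx' ORDER BY recorded_at), that each recorded_at is unique for that host, and that precipitation is never NULL. The function name rolling_24h_sums is hypothetical, not part of any library. Each row enters and leaves the window once, so the work grows linearly with the number of rows instead of rescanning up to 24 hours of data per row.

```python
from collections import deque
from datetime import datetime, timedelta


def rolling_24h_sums(rows, window=timedelta(hours=24)):
    """Return (recorded_at, total) pairs for a trailing time window.

    rows: iterable of (recorded_at, precipitation) tuples, sorted ascending
    by recorded_at (assumed unique). The window is inclusive at both ends,
    mirroring BETWEEN recorded_at - 24h AND recorded_at in the queries above.
    """
    in_window = deque()  # rows currently inside the trailing window
    running = 0          # sum of precipitation over in_window
    result = []
    for recorded_at, precipitation in rows:
        in_window.append((recorded_at, precipitation))
        running += precipitation
        # Evict rows strictly older than the lower bound of the window.
        while in_window[0][0] < recorded_at - window:
            _, expired = in_window.popleft()
            running -= expired
        result.append((recorded_at, running))
    return result
```

The running total is adjusted incrementally, so with floating-point precipitation values small rounding differences against SUM() are possible over long series; using DECIMAL values from the database driver would avoid that.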