I have a simple table, data, for time series data; it basically holds just a timestamp and a value:
Field, Type, Null, Key, Default, Extra
'id','int(11)','NO','PRI',NULL,'auto_increment'
'channel_id','int(11)','YES','MUL',NULL,''
'timestamp','bigint(20)','NO','',NULL,''
'value','double','NO','',NULL,''
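I developed a slightly complex query to calculate a weighted average per period; it basically computes sum(val x delta time) divided by the time span of the period. The @prev_timestamp variable basically emulates the LAG() function. Example: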
SELECT
    MAX(agg.timestamp) AS timestamp,
    COALESCE(
        SUM(agg.val_by_time) / (MAX(agg.timestamp) - MIN(agg.prev_timestamp)),
        AVG(agg.value)
    ) AS value
FROM (
    SELECT
        timestamp,
        value,
        value * (timestamp - @prev_timestamp) AS val_by_time,
        COALESCE(@prev_timestamp, 0) AS prev_timestamp,
        @prev_timestamp := timestamp
    FROM data
    CROSS JOIN (
        SELECT @prev_timestamp := NULL
    ) AS vars
    WHERE channel_id=56 AND timestamp >= 1546297161097 AND timestamp <= 1552950000000
    ORDER BY timestamp ASC
) AS agg
GROUP BY (timestamp - 1546288811393) >> 23
ORDER BY timestamp ASC
;
Recently, I converted this query from the hacky MySQL variable approach to window functions:
SELECT
    MAX(agg.timestamp) AS timestamp,
    COALESCE(
        SUM(agg.val_by_time) / (MAX(agg.timestamp) - MIN(agg.prev_timestamp)),
        AVG(agg.value)
    ) AS value
FROM (
    SELECT
        timestamp,
        value,
        LAG(timestamp) OVER(ORDER BY channel_id, timestamp ASC) AS prev_timestamp,
        value * (timestamp - LAG(timestamp) OVER(ORDER BY channel_id,timestamp ASC)) AS val_by_time
    FROM data
    WHERE channel_id=56
        AND timestamp >= 1546297161097 AND timestamp <= 1552950000000
    ORDER BY timestamp ASC
) AS agg
GROUP BY (timestamp - 1546288811393) >> 23
ORDER BY timestamp ASC
;
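For reference, here is a minimal Python sketch of what the query is meant to compute. It may be handy for checking that a rewritten query returns the same numbers on a small sample. The function name weighted_averages, its origin and shift parameters, and the sample rows are made up for illustration. It follows the LAG() semantics, where the first row has no previous timestamp and aggregates skip NULLs. The variable version substitutes 0 there via COALESCE, so its first bucket can differ.

```python
from collections import defaultdict


def weighted_averages(rows, origin, shift=23):
    """Time-weighted average per bucket for one channel.

    rows   -- iterable of (timestamp, value) pairs (hypothetical sample data)
    origin -- timestamp offset subtracted before bucketing
    shift  -- bucket width is 2**shift time units, as in `>> 23`
    Returns a list of (max_timestamp, value) ordered by bucket.
    """
    buckets = defaultdict(list)
    prev_ts = None  # like LAG(timestamp): the first row has no predecessor
    for ts, value in sorted(rows):
        # value * (timestamp - prev_timestamp), NULL/None for the first row
        val_by_time = None if prev_ts is None else value * (ts - prev_ts)
        buckets[(ts - origin) >> shift].append((ts, value, prev_ts, val_by_time))
        prev_ts = ts

    result = []
    for key in sorted(buckets):
        group = buckets[key]
        timestamps = [g[0] for g in group]
        values = [g[1] for g in group]
        # SQL aggregates ignore NULLs, so drop the None entries here
        prevs = [g[2] for g in group if g[2] is not None]
        weighted = [g[3] for g in group if g[3] is not None]
        span = max(timestamps) - min(prevs) if prevs else None
        if weighted and span:
            avg = sum(weighted) / span
        else:
            # COALESCE fallback: plain AVG(value) when the weighted form is NULL
            avg = sum(values) / len(values)
        result.append((max(timestamps), avg))
    return result


if __name__ == "__main__":
    sample = [(0, 10.0), (2, 20.0), (4, 5.0), (8, 7.0)]
    print(weighted_averages(sample, origin=0, shift=2))
    # prints [(2, 20.0), (4, 5.0), (8, 7.0)]
```

With shift=2 the sample falls into three buckets of four time units each; each reading is weighted by the time elapsed since the previous reading.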
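The downside is that the window function version, although it looks cleaner, runs about 25% slower (MariaDB 10.3, Raspi 3).
Is there a better (more performant) way to write this query by making better use of window functions, perhaps by getting rid of the derived table?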