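MySQL time series: weighted average using window functions

Time: 2019-03-18 12:19:37

Tags: mysql mariadb query-performance window-functions

I have a simple data table for time-series data, basically just a timestamp and a value: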

Field       Type        Null  Key  Default  Extra
id          int(11)     NO    PRI  NULL     auto_increment
channel_id  int(11)     YES   MUL  NULL
timestamp   bigint(20)  NO         NULL
value       double      NO         NULL

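I have developed a slightly complicated query that calculates a weighted average per period: basically sum(value x delta time) divided by the time span the period covers. The @prev_timestamp variable basically emulates the LAG() function. Example: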

SELECT 
    MAX(agg.timestamp) AS timestamp, 
    COALESCE( 
        SUM(agg.val_by_time) / (MAX(agg.timestamp) - MIN(agg.prev_timestamp)), 
        AVG(agg.value)
    ) AS value
FROM ( 
    SELECT 
        timestamp, 
        value, 
        value * (timestamp - @prev_timestamp) AS val_by_time, 
        COALESCE(@prev_timestamp, 0) AS prev_timestamp, 
        @prev_timestamp := timestamp 
    FROM data 
    CROSS JOIN (
        SELECT @prev_timestamp := NULL
    ) AS vars 
    WHERE channel_id=56  AND timestamp >= 1546297161097 AND timestamp <= 1552950000000 
    ORDER BY timestamp ASC
) AS agg 
GROUP BY (timestamp - 1546288811393) >> 23 
ORDER BY timestamp ASC
;
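To make the arithmetic concrete, here is a minimal pure-Python sketch of what the query is meant to compute. It is only an illustration, not part of the actual setup: the function name weighted_average_by_bucket and its parameters origin and shift are hypothetical stand-ins for the constants 1546288811393 and 23 in the GROUP BY expression. It assumes SQL-style NULL handling, where the first row has no previous timestamp and is therefore skipped by the sum.

```python
from itertools import groupby


def weighted_average_by_bucket(rows, origin, shift):
    """Time-weighted average per bucket.

    rows   -- iterable of (timestamp, value) pairs (hypothetical input shape)
    origin -- timestamp offset subtracted before bucketing (assumption)
    shift  -- right-shift that defines the bucket width, 2**shift (assumption)
    """
    # Emulate LAG(timestamp): pair every row with the previous timestamp.
    enriched = []
    prev_ts = None
    for ts, value in sorted(rows):
        val_by_time = None if prev_ts is None else value * (ts - prev_ts)
        enriched.append((ts, value, prev_ts, val_by_time))
        prev_ts = ts

    result = []
    # Rows are sorted by timestamp, so rows of one bucket are adjacent.
    for _, group in groupby(enriched, key=lambda r: (r[0] - origin) >> shift):
        group = list(group)
        last_ts = max(r[0] for r in group)
        # SUM() and MIN() ignore NULLs, so drop the None entries here.
        weighted = [r[3] for r in group if r[3] is not None]
        prevs = [r[2] for r in group if r[2] is not None]
        span = last_ts - min(prevs) if prevs else 0
        if weighted and span != 0:
            value = sum(weighted) / span
        else:
            # COALESCE fallback: plain average when no weight is available.
            value = sum(r[1] for r in group) / len(group)
        result.append((last_ts, value))
    return result
```

For instance, with rows (0, 10.0), (4, 20.0), (8, 40.0), origin 0 and shift 3, the first bucket holds the timestamps 0 and 4 and yields 20 x 4 / (4 - 0) = 20.0, while the second bucket holds only timestamp 8 and yields 40 x 4 / (8 - 4) = 40.0.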

Recently, I converted this query from the hacky MySQL variable approach to window functions:

SELECT 
    MAX(agg.timestamp) AS timestamp, 
    COALESCE(
        SUM(agg.val_by_time) / (MAX(agg.timestamp) - MIN(agg.prev_timestamp)), 
        AVG(agg.value)
    ) AS value
FROM ( 
    SELECT 
        timestamp, 
        value, 
        LAG(timestamp) OVER(ORDER BY channel_id, timestamp ASC) AS prev_timestamp,
        value * (timestamp - LAG(timestamp) OVER(ORDER BY channel_id,timestamp ASC)) AS val_by_time
    FROM data 
    WHERE channel_id=56  
    AND timestamp >= 1546297161097 AND timestamp <= 1552950000000 
    ORDER BY timestamp ASC

) AS agg 
GROUP BY (timestamp - 1546288811393) >> 23 
ORDER BY timestamp ASC
;

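The downside is that the window-function version, although it looks cleaner, runs about 25% slower (MariaDB 10.3, Raspi 3).

Is there a better (more performant) way to write this query by making better use of window functions, perhaps by getting rid of the derived table?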

0 answers:

No answers yet