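I am trying to write a Postgres query against a versioned time series table that returns the latest version of the available data.
I started by taking the latest version at each timestamp, but that runs into a problem: when versions have offset timestamps or were collected at different frequencies, rows from different versions end up interleaved in the final result. One example is a version with data every two minutes next to a version collected every minute. Another example with different intervals is shown below. I can do this on the client side if I have to, but I think it would be better done in SQL.
Here is my existing query, which has the interleaving problem. It is also not the most performant, but I do not see a way to push the CTE into a view, because the view would have no date filter, and that filter helps a lot on this table.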
-- Current SQL query, not very fast
WITH version_ranked AS (
    SELECT
          tv.timeseries_id
        , tv.value_number
        , tv.value_time
        , tv.version_time
          -- rank the versions of each (series, timestamp) pair, newest first
        , RANK() OVER (PARTITION BY tv.timeseries_id, tv.value_time ORDER BY tv.version_time DESC) AS rn
    FROM timeseries_values AS tv
    WHERE tv.timeseries_id = @id
      AND tv.version_time > @time_filter
)
SELECT
    *
FROM version_ranked AS vr
WHERE vr.rn = 1
-- Sample table, with an extra row between versions
| timeseries_id (int) | value_number (numeric) | value_time (datetime) | version_time (datetime) |
| 1 | 30 | '2019-03-27 00:03:00' | '2019-03-26 00:00:00' |
| 1 | 20 | '2019-03-27 00:02:00' | '2019-03-26 00:00:00' |
| 1 | 10 | '2019-03-27 00:01:00' | '2019-03-26 00:00:00' |
| 1 | 3 | '2019-03-27 00:01:30' | '2019-03-25 00:00:00' |
| 1 | 2 | '2019-03-27 00:01:00' | '2019-03-25 00:00:00' |
| 1 | 1 | '2019-03-27 00:00:30' | '2019-03-25 00:00:00' |
-- What I get with above code, interleaving the versions
| timeseries_id (int) | value_number (numeric) | value_time (datetime) | version_time (datetime) |
| 1 | 30 | '2019-03-27 00:03:00' | '2019-03-26 00:00:00' |
| 1 | 20 | '2019-03-27 00:02:00' | '2019-03-26 00:00:00' |
| 1 | 3 | '2019-03-27 00:01:30' | '2019-03-25 00:00:00' |
| 1 | 10 | '2019-03-27 00:01:00' | '2019-03-26 00:00:00' |
| 1 | 1 | '2019-03-27 00:00:30' | '2019-03-25 00:00:00' |
-- What I want in the end
| timeseries_id (int) | value_number (numeric) | value_time (datetime) | version_time (datetime) |
| 1 | 30 | '2019-03-27 00:03:00' | '2019-03-26 00:00:00' |
| 1 | 20 | '2019-03-27 00:02:00' | '2019-03-26 00:00:00' |
| 1 | 10 | '2019-03-27 00:01:00' | '2019-03-26 00:00:00' |
| 1 | 1 | '2019-03-27 00:00:30' | '2019-03-25 00:00:00' |
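Answer 0 (score: 0)
It looks like I will have to lean on the client side and run two queries: one to get the maximum and minimum timestamps in each version, after which the client builds a new query out of several simple SELECT statements joined with UNION and sends that out.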
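As a rough sketch only, the two round trips could look something like the following. It assumes the sample table from the question, with the literals `1` and `'2019-03-24 00:00:00'` standing in for `@id` and `@time_filter`. The aliases `min_value_time` and `max_value_time` are made up for illustration, and the timestamps in the second query are the values the first query would return for the sample data. `UNION ALL` is used in place of plain `UNION` on the assumption that the branches cannot overlap, since each one selects a different `version_time`.

```sql
-- Query 1: one row per version with the span of value_time it covers.
-- The literals 1 and '2019-03-24 00:00:00' stand in for @id and @time_filter.
SELECT
      tv.version_time
    , MIN(tv.value_time) AS min_value_time  -- hypothetical alias
    , MAX(tv.value_time) AS max_value_time  -- hypothetical alias
FROM timeseries_values AS tv
WHERE tv.timeseries_id = 1
  AND tv.version_time > '2019-03-24 00:00:00'
GROUP BY tv.version_time
ORDER BY tv.version_time DESC;

-- Query 2: assembled by the client from the rows returned by query 1.
-- Newest version: keep every row.
SELECT tv.timeseries_id, tv.value_number, tv.value_time, tv.version_time
FROM timeseries_values AS tv
WHERE tv.timeseries_id = 1
  AND tv.version_time = '2019-03-26 00:00:00'
UNION ALL
-- Older version: keep only rows outside the span covered by the newer version.
SELECT tv.timeseries_id, tv.value_number, tv.value_time, tv.version_time
FROM timeseries_values AS tv
WHERE tv.timeseries_id = 1
  AND tv.version_time = '2019-03-25 00:00:00'
  AND (tv.value_time < '2019-03-27 00:01:00'
       OR tv.value_time > '2019-03-27 00:03:00')
ORDER BY value_time DESC;
```

On the sample table this should return the four rows listed under "What I want in the end". With more than two versions, the client would add one branch per version, and each older branch would need to exclude the spans of every newer version, not only the newest.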