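In PostgreSQL, I am trying to find subjects that have a sequence of values below 60, followed by two consecutive values above 60. I am also interested in the length of time between the first recorded value below 60 and the second recorded value above 60. This event can occur multiple times for each subject.
I am struggling to figure out how to search for an unlimited number of values < 60, followed by 2 values >= 60.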
RowID  SubjectID  Value  TimeStamp
1      1          65     2142-04-29 12:00:00
2      1          58     2142-04-30 03:00:00
3      1          55     2142-04-30 04:00:00
4      1          54     2142-04-30 05:00:00
5      1          55     2142-04-30 06:15:00
6      1          56     2142-04-30 06:45:00
7      1          65     2142-04-30 07:00:00
8      1          65     2142-04-30 08:00:00
9      2          48     2142-05-04 03:30:00
10     2          48     2142-05-04 04:00:00
11     2          50     2142-05-04 05:00:00
12     2          69     2142-05-04 06:00:00
13     2          68     2142-05-04 07:00:00
14     2          69     2142-05-04 08:00:00
15     2          50     2142-05-04 09:00:00
16     2          55     2142-05-04 10:00:00
17     2          50     2142-05-04 10:30:00
18     2          67     2142-05-04 11:00:00
19     2          67     2142-05-04 12:00:00
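My current attempt uses the lag and lead functions, but I am not sure how to use them when I don't know how far ahead I need to look. Here is an example that looks one value ahead and one value behind. My problem is that I don't know how to partition by subjectID so as to look ahead "t" time points, where "t" may be different for each subject.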
select t.subjectId, t.didEventOccur,
       (next_timestamp - timestamp) as duration
from (select t.*,
             lag(t.value) over (partition by t.subjectid order by t.timestamp) as prev_value,
             lead(t.value) over (partition by t.subjectid order by t.timestamp) as next_value,
             lead(t.timestamp) over (partition by t.subjectid order by t.timestamp) as next_timestamp
      from t
     ) t
where value < 60 and next_value < 60 and
      (prev_value is null or prev_value >= 60);
I would like to get output like the following:
SubjectID  DidEventOccur  Duration
1          1              05:00:00
2          1              03:30:00
2          1              03:00:00
Answer 0 (score: 0)
A pure SQL solution, as you have been asking for:
SELECT subjectid, start_at, next_end_at - start_at AS duration
FROM  (
   SELECT *
        , lead(end_at) OVER (PARTITION BY subjectid ORDER BY start_at) AS next_end_at
   FROM  (
      SELECT subjectid, grp, big
           , min(ts) AS start_at
           , max(ts) FILTER (WHERE big AND big_rn = 2) AS end_at  -- 2nd timestamp
      FROM  (
         SELECT subjectid, ts, grp, big
              , row_number() OVER (PARTITION BY subjectid, grp, big ORDER BY ts) AS big_rn
         FROM  (
            SELECT subjectid, ts
                 , row_number() OVER (PARTITION BY subjectid ORDER BY ts)
                 - row_number() OVER (PARTITION BY subjectid, (value > 60) ORDER BY ts) AS grp
                 , (value > 60) AS big
            FROM   tbl
            ) sub1
         ) sub2
      GROUP  BY subjectid, grp, big
      ) sub3
   ) sub4
WHERE  NOT big                  -- identifies block of values <= 60 ...
AND    next_end_at IS NOT NULL  -- ...followed by at least 2 values > 60
ORDER  BY subjectid, start_at;
I omitted the useless column DidEventOccur and added start_at instead. Otherwise, this is exactly your desired result.
db<>fiddle here
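The grp column in the innermost subquery is the usual "gaps and islands" trick: the overall row number minus the row number within the same big class stays constant while consecutive rows share the same flag. As a rough illustration only, here is that arithmetic replayed in Python (not SQL) on subject 2's values from the question. The function name island_keys and the threshold parameter are made up for this sketch, and the comparison is the query's strict value > 60.

```python
from itertools import groupby

# Subject 2's values from the question, already in timestamp order.
values = [48, 48, 50, 69, 68, 69, 50, 55, 50, 67, 67]


def island_keys(values, threshold=60):
    """Return one (grp, big) pair per row, mirroring the innermost subquery.

    grp = overall row number - row number within the same `big` class.
    It stays constant while consecutive rows share the same `big` flag.
    """
    seen = {True: 0, False: 0}  # running row_number() per `big` class
    keys = []
    for overall_rn, value in enumerate(values, start=1):
        big = value > threshold  # strictly greater, as in the query
        seen[big] += 1
        keys.append((overall_rn - seen[big], big))
    return keys


keys = island_keys(values)

# Collapse consecutive identical keys into (key, run_length) pairs.
islands = [(key, len(list(run))) for key, run in groupby(keys)]
print(islands)
```

In this run the second and third islands both get grp = 3 and differ only in big, which is why the query partitions and groups by (subjectid, grp, big) rather than by grp alone.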
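Consider a procedural solution in plpgsql (or any PL) instead; it should be faster. Simpler? I would say yes, but that depends on who is judging. See (with an explanation of the technique and more links):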
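As a rough sketch of what a single-pass procedural approach could look like, here is the idea in plain Python rather than plpgsql. It is not taken from the linked answer. The function name find_events and the in-memory row list are assumptions made for illustration, and the threshold test mirrors the SQL above (value > 60 counts as high; switch to >= if a value of exactly 60 should count as high).

```python
from datetime import datetime, timedelta

T = datetime.fromisoformat

# (subject_id, value, timestamp) rows from the question, sorted by subject, timestamp.
rows = [
    (1, 65, T("2142-04-29 12:00:00")), (1, 58, T("2142-04-30 03:00:00")),
    (1, 55, T("2142-04-30 04:00:00")), (1, 54, T("2142-04-30 05:00:00")),
    (1, 55, T("2142-04-30 06:15:00")), (1, 56, T("2142-04-30 06:45:00")),
    (1, 65, T("2142-04-30 07:00:00")), (1, 65, T("2142-04-30 08:00:00")),
    (2, 48, T("2142-05-04 03:30:00")), (2, 48, T("2142-05-04 04:00:00")),
    (2, 50, T("2142-05-04 05:00:00")), (2, 69, T("2142-05-04 06:00:00")),
    (2, 68, T("2142-05-04 07:00:00")), (2, 69, T("2142-05-04 08:00:00")),
    (2, 50, T("2142-05-04 09:00:00")), (2, 55, T("2142-05-04 10:00:00")),
    (2, 50, T("2142-05-04 10:30:00")), (2, 67, T("2142-05-04 11:00:00")),
    (2, 67, T("2142-05-04 12:00:00")),
]


def find_events(rows, threshold=60):
    """Scan sorted rows once; return (subject_id, start_at, duration) per event.

    An event is a block of low values directly followed by at least two
    high values. Duration runs from the first low to the second high.
    """
    events = []
    prev_subject = None
    start_at = None  # timestamp of the first low value in the current block
    highs = 0        # consecutive high values seen since that block ended
    for subject, value, ts in rows:
        if subject != prev_subject:  # new subject: reset the state
            start_at, highs = None, 0
            prev_subject = subject
        if value > threshold:
            if start_at is not None:
                highs += 1
                if highs == 2:  # second consecutive high: event complete
                    events.append((subject, start_at, ts - start_at))
                    start_at, highs = None, 0
        else:
            if start_at is None or highs > 0:
                start_at = ts  # a new low block begins here
            highs = 0
    return events


events = find_events(rows)
for subject, start_at, duration in events:
    print(subject, start_at, duration)
```

The loop keeps only two pieces of state per subject, so each row is read once. The same structure should carry over to a plpgsql FOR loop over a query ordered by subject and timestamp.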