SQL: Filter rows by direction

Time: 2018-11-05 19:37:18

Tags: sql postgresql where gaps-and-islands

I have a table with two columns: date (timestamp) and status (boolean). It contains many rows, for example:

| date                      | status    |
|-------------------------- |--------   |
| 2018-11-05T19:04:21.125Z  | true      |
| 2018-11-05T19:04:22.125Z  | true      |
| 2018-11-05T19:04:23.125Z  | true      |
....

I need to get a result like this:

| date_from                 | date_to                   | status    |
|-------------------------- |-------------------------- |--------   |
| 2018-11-05T19:04:21.125Z  | 2018-11-05T19:04:27.125Z  | true      |
| 2018-11-05T19:04:27.125Z  | 2018-11-05T19:04:47.125Z  | false     |
| 2018-11-05T19:04:47.125Z  | 2018-11-05T19:04:57.125Z  | true      |

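So I need to collapse all consecutive rows that share the same status and return only the periods during which the status was true or false.

I wrote this query: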

SELECT max("current_date"), current_status, previous_status
FROM (SELECT date as "current_date",
             status as current_status,
             (lag(status, 1) OVER (ORDER BY msgtime))::boolean AS previous_status
      FROM "table" as table
      ) as raw_data
group by current_status, previous_status

But the result never has more than 4 rows.

2 Answers:

Answer 0 (score: 3)

This is a gaps-and-islands problem. A typical approach is to use the difference of row numbers:

select min(date), max(date), status
from (select t.*,
             row_number() over (order by date) as seqnum,
             row_number() over (partition by status order by date) as seqnum_s
      from t
     ) t
group by status, (seqnum - seqnum_s);
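
To see why the difference of row numbers identifies each island, it may help to look at the intermediate values. The following is a minimal sketch, not part of the original answer: it runs the same query through Python's built-in `sqlite3` module instead of PostgreSQL (assuming a SQLite build with window-function support, 3.25 or later), the six sample rows are hypothetical, and `status` is stored as 0/1 because SQLite has no boolean type.

```python
import sqlite3

# Hypothetical sample data (not from the question): (date, status as 0/1).
rows = [
    ("2018-11-05T19:04:21.125Z", 1),
    ("2018-11-05T19:04:22.125Z", 1),
    ("2018-11-05T19:04:23.125Z", 0),
    ("2018-11-05T19:04:24.125Z", 0),
    ("2018-11-05T19:04:25.125Z", 0),
    ("2018-11-05T19:04:26.125Z", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, status INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Step 1: the two row numbers computed by the inner query.
numbered = conn.execute("""
    SELECT date, status,
           row_number() OVER (ORDER BY date) AS seqnum,
           row_number() OVER (PARTITION BY status ORDER BY date) AS seqnum_s
    FROM t
    ORDER BY date
""").fetchall()

# Within a run of equal statuses both counters advance together,
# so their difference stays constant for the whole run.
diffs = [seqnum - seqnum_s for _, _, seqnum, seqnum_s in numbered]
print(diffs)

# Step 2: grouping by status and that difference yields one row per island.
islands = conn.execute("""
    SELECT min(date) AS date_from, max(date) AS date_to, status
    FROM (SELECT t.*,
                 row_number() OVER (ORDER BY date) AS seqnum,
                 row_number() OVER (PARTITION BY status ORDER BY date) AS seqnum_s
          FROM t) AS s
    GROUP BY status, (seqnum - seqnum_s)
    ORDER BY date_from
""").fetchall()
for island in islands:
    print(island)
```

The difference alone is not guaranteed to be unique across different statuses, which is why the query groups by `status` as well.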

Answer 1 (score: 1)

Yes, you can use LAG, but you also need a running counter that increments each time the status changes:

WITH cte1 AS (
    SELECT date, status, CASE WHEN LAG(status) OVER (ORDER BY date) = status THEN 0 ELSE 1 END AS chg
    FROM yourdata
), cte2 AS (
    SELECT date, status, SUM(chg) OVER (ORDER BY date) AS grp
    FROM cte1
)
SELECT MIN(date) AS date_from, MAX(date) AS date_to, status
FROM cte2
GROUP BY grp, status
ORDER BY date_from
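
As a rough illustration of how the change flag and the running counter behave, the sketch below runs the query above against a small in-memory table. It is an assumption-laden example rather than the answerer's setup: it uses Python's built-in `sqlite3` module instead of PostgreSQL (a SQLite build with window functions, 3.25 or later, is assumed), the sample rows are hypothetical, and `status` is stored as 0/1.

```python
import sqlite3

# Hypothetical sample data (not from the question): (date, status as 0/1).
rows = [
    ("2018-11-05T19:04:21.125Z", 1),
    ("2018-11-05T19:04:22.125Z", 1),
    ("2018-11-05T19:04:23.125Z", 0),
    ("2018-11-05T19:04:24.125Z", 0),
    ("2018-11-05T19:04:25.125Z", 0),
    ("2018-11-05T19:04:26.125Z", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourdata (date TEXT, status INTEGER)")
conn.executemany("INSERT INTO yourdata VALUES (?, ?)", rows)

# Intermediate view: chg is 1 on the first row (LAG returns NULL there, so the
# comparison is not true) and on every row whose status differs from the
# previous one; grp is the running total of chg.
steps = conn.execute("""
    WITH cte1 AS (
        SELECT date, status,
               CASE WHEN LAG(status) OVER (ORDER BY date) = status
                    THEN 0 ELSE 1 END AS chg
        FROM yourdata
    )
    SELECT date, status, chg, SUM(chg) OVER (ORDER BY date) AS grp
    FROM cte1
    ORDER BY date
""").fetchall()
chg_values = [chg for _, _, chg, _ in steps]
grp_values = [grp for _, _, _, grp in steps]
print(chg_values, grp_values)

# Full query from the answer: one output row per grp value.
periods = conn.execute("""
    WITH cte1 AS (
        SELECT date, status,
               CASE WHEN LAG(status) OVER (ORDER BY date) = status
                    THEN 0 ELSE 1 END AS chg
        FROM yourdata
    ), cte2 AS (
        SELECT date, status, SUM(chg) OVER (ORDER BY date) AS grp
        FROM cte1
    )
    SELECT MIN(date) AS date_from, MAX(date) AS date_to, status
    FROM cte2
    GROUP BY grp, status
    ORDER BY date_from
""").fetchall()
for period in periods:
    print(period)
```

Here `date_to` is the last timestamp inside each period. If it should instead equal the start of the next period, as in the expected output of the question, one possible option is to wrap the result and take `LEAD(date_from) OVER (ORDER BY date_from)`, falling back to `date_to` for the final period.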
