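How do I get the date of a status change in a time series?

Time: 2021-05-27 13:28:26

Tags: sql-server tsql

I have the following problem. I have a table of machine lifecycle events over time: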

DROP TABLE IF EXISTS #machineStatus

CREATE TABLE #machineStatus
(
    machineID VARCHAR(255),
    machineStatus VARCHAR(255),
    statusDate DATETIME
)

INSERT INTO #machineStatus (machineId, statusDate, machineStatus)
VALUES
('01255999', '2019-11-01',  '1 - InStorage'),
('01255999', '2019-12-01',  '1 - InStorage'),
('01255999', '2020-01-01',  '1 - InStorage'),
('01255999', '2020-02-01',  '1 - InStorage'),
('01255999', '2020-03-01',  '1 - InStorage'),
('01255999', '2020-04-01',  '1 - InStorage'),
('01255999', '2020-05-01',  '1 - InStorage'),
('01255999', '2020-06-01',  '1 - InStorage'),
('01255999', '2020-07-01',  '1 - InStorage'),
('01255999', '2020-08-01',  '1 - InStorage'),
('01255999', '2020-09-01',  '1 - InStorage'),
('01255999', '2020-11-01',  '1 - InStorage'),
('01255999', '2020-12-01',  '1 - InStorage'),
('01255999', '2020-12-15',  '1 - InStorage'),
('01255999', '2021-01-01',  '2 - RentedOut'),
('01255999', '2021-03-01',  '1 - InStorage'),
('01255999', '2021-04-01',  '1 - InStorage'),
('01255999', '2021-04-02',  '2 - RentedOut'),
('01255999', '2021-04-05',  '3 - Service'),
('01255999', '2021-04-15',  '4 - Repairs'),
('01255999', '2021-04-20',  '2 - RentedOut'),
('01255999', '2021-05-27',  '5 - Sold')

I need to add a new column that shows the last date on which the status changed:

SELECT
    s.*,
    (SELECT MAX(ss.statusDate) 
     FROM #machineStatus ss 
     WHERE ss.machineId = s.machineId 
       AND ss.machineStatus <> s.machineStatus 
       AND ss.statusDate < s.statusDate) AS statusChangeDate
FROM #machineStatus s
ORDER BY s.statusDate

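I can't quite get this right: I don't know how to get the date of the machine's first (earliest) status. Every NULL in the statusChangeDate column should be 2019-11-01 instead, as shown below: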

|machineID  | machineStatus |           statusDate |     statusChangeDate |
|-----------|-------------- |----------------------|----------------------|
|  01255999 | 1 - InStorage | 2019-11-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2019-12-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-01-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-02-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-03-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-04-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-05-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-06-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-07-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-08-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-09-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-11-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-12-01T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2020-12-15T00:00:00Z | 2019-11-01T00:00:00Z |
|  01255999 | 2 - RentedOut | 2021-01-01T00:00:00Z | 2020-12-15T00:00:00Z |
|  01255999 | 1 - InStorage | 2021-03-01T00:00:00Z | 2021-01-01T00:00:00Z |
|  01255999 | 1 - InStorage | 2021-04-01T00:00:00Z | 2021-01-01T00:00:00Z |
|  01255999 | 2 - RentedOut | 2021-04-02T00:00:00Z | 2021-04-01T00:00:00Z |
|  01255999 | 3 - Service   | 2021-04-05T00:00:00Z | 2021-04-02T00:00:00Z |
|  01255999 | 4 - Repairs   | 2021-04-15T00:00:00Z | 2021-04-05T00:00:00Z |
|  01255999 | 2 - RentedOut | 2021-04-20T00:00:00Z | 2021-04-15T00:00:00Z |
|  01255999 | 5 - Sold      | 2021-05-27T00:00:00Z | 2021-04-20T00:00:00Z |

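Any help is appreciated. Thanks!! :)

3 answers:

Answer 0 (score: 0)

Do it in two steps.

First, use LAG() OVER () to check whether the status has changed, and record the date on which it changed.

Then use MAX() OVER () to carry those dates forward and fill in the NULLs (on the rows where the status did not change).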

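WITH
  check_for_changes AS
(
  SELECT
    *,
    -- NULL when the status equals the previous row's status, otherwise the date of the change
    CASE
      WHEN LAG(machineStatus) OVER (PARTITION BY machineID ORDER BY statusDate) = machineStatus THEN NULL
      ELSE statusDate
    END AS statusChangeDate
  FROM
    #machineStatus
)
SELECT
  *,
  -- the running MAX carries the most recent change date forward over the NULL rows
  MAX(statusChangeDate) OVER (PARTITION BY machineID ORDER BY statusDate) AS lastStatusChangeDate
FROM
  check_for_changes
ORDER BY
  statusDate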

http://sqlfiddle.com/#!18/195e8/1
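
Note that this two-step approach reports the date on which the current status began (2021-01-01 for the first RentedOut row), while the expected table in the question shows the last date of the previous status (2020-12-15). The following is a minimal sketch of a variant that should match the expected table. It assumes the #machineStatus temp table from the question and at most one row per machine and statusDate. The column name previousStatusEndDate is made up for illustration.

```sql
WITH check_for_changes AS
(
    SELECT
        *,
        -- On a change row, record the previous row's date.
        -- On the very first row LAG() returns NULL, so fall back to the row's own date.
        -- previousStatusEndDate is a hypothetical name, not from the original answer.
        CASE
            WHEN LAG(machineStatus) OVER (PARTITION BY machineID ORDER BY statusDate) = machineStatus
                THEN NULL
            ELSE COALESCE(LAG(statusDate) OVER (PARTITION BY machineID ORDER BY statusDate), statusDate)
        END AS previousStatusEndDate
    FROM #machineStatus
)
SELECT
    machineID,
    machineStatus,
    statusDate,
    -- MAX() ignores NULLs, so the running maximum carries the last recorded date forward.
    MAX(previousStatusEndDate) OVER (PARTITION BY machineID ORDER BY statusDate
                                     ROWS UNBOUNDED PRECEDING) AS statusChangeDate
FROM check_for_changes
ORDER BY machineID, statusDate;
```

The running MAX() works here only because the recorded dates increase along with statusDate within each machine.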

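Answer 1 (score: 0)

One approach is to compute a group number for each run of identical statuses and take the last date from the previous group.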

with g as (
  -- the running sum of the change flags numbers each run of identical statuses
  select machineId, statusDate, machineStatus,
         sum(flag) over (partition by machineId order by statusDate) grp
  from (
    -- flag = 1 where the status differs from the previous row; the first row defaults to 0
    select *,
           case lag(machineStatus, 1, machineStatus) over (partition by machineId order by statusDate)
             when machineStatus then 0 else 1
           end flag
    from #machineStatus) s
)
-- lastChange is the latest date found in any earlier group
select machineId, statusDate, machineStatus
    , (select top(1) g2.statusDate
       from g g2
       where g2.machineId = g1.machineId and g2.grp < g1.grp
       order by g2.statusDate desc) lastChange
from g g1
order by statusDate
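
For rows in a machine's first group there is no earlier group, so the subquery in this approach still returns NULL there. One possible way to fill that gap, sketched under the assumption that the machine's earliest statusDate is the wanted fallback (as in the question's expected output), is to wrap the lookup in COALESCE. The CTE name flagged is introduced here for readability and is not part of the original answer.

```sql
WITH flagged AS
(
    -- flag = 1 where the status differs from the previous row; the first row defaults to 0
    SELECT machineID, statusDate, machineStatus,
           CASE LAG(machineStatus, 1, machineStatus)
                    OVER (PARTITION BY machineID ORDER BY statusDate)
               WHEN machineStatus THEN 0 ELSE 1
           END AS flag
    FROM #machineStatus
),
g AS
(
    -- the running sum of the flags gives each run of identical statuses its own group number
    SELECT machineID, statusDate, machineStatus,
           SUM(flag) OVER (PARTITION BY machineID ORDER BY statusDate
                           ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT g1.machineID, g1.statusDate, g1.machineStatus,
       COALESCE(
           -- latest date in any earlier group (NULL for the first group)
           (SELECT MAX(g2.statusDate)
            FROM g AS g2
            WHERE g2.machineID = g1.machineID AND g2.grp < g1.grp),
           -- fallback: the machine's earliest recorded date
           MIN(g1.statusDate) OVER (PARTITION BY g1.machineID)
       ) AS lastChange
FROM g AS g1
ORDER BY g1.machineID, g1.statusDate;
```

MAX() in the subquery is equivalent to the TOP(1) ... ORDER BY ... DESC lookup and keeps the fallback logic in a single expression.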

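Answer 2 (score: 0)

Use COALESCE to get rid of the NULLs: when no earlier row has a different status, fall back to the earliest date recorded for the current status.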

SELECT  s.*,
        COALESCE((
            SELECT  MAX(ss.statusDate)
            FROM    #machineStatus ss
            WHERE ss.machineID = s.machineID
                AND ss.machineStatus <> s.machineStatus
                AND ss.statusDate < s.statusDate
        ),
        (
            SELECT  MIN(ss.statusDate)
            FROM    #machineStatus ss
            WHERE ss.machineID = s.machineID
                AND ss.machineStatus = s.machineStatus
                AND ss.statusDate <= s.statusDate
        ))  AS statusChangeDate
FROM    #machineStatus s
ORDER BY s.statusDate;