Converting daily snapshots into time series data

Time: 2019-11-01 13:23:22

Tags: sql sql-server tsql

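I have a snapshot of the data for every day. Now I want to derive time series data from it using SQL. I have tried a few approaches, but each has limitations.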

Sample data:

[image: sample data]

Expected result:

[image: expected result]

I tried the following SQL, but it gives a wrong result: it creates only one partition for value 0, when logically there should be two.

SELECT [name], [value],
[date] as [start],
DATEADD(DAY, -1, LEAD([date], 1) OVER(PARTITION BY [name] ORDER BY [date])) AS [end]
FROM (
    SELECT *,
    RANK() OVER(Partition by [name], [rnk] ORDER BY [date]) as row_num 
    FROM(
        SELECT [name], [value], [date],
        DENSE_RANK() OVER(Partition by [name] ORDER BY [value]) AS rnk
        FROM sample_data
    ) AS T
) AS TT
WHERE row_num = 1

Result of the above SQL:

[image: result of the above SQL]

Any help is much appreciated!

3 answers:

Answer 0: (score: 2)

This is a gaps-and-islands problem. You can try the following.

SELECT Name, Value, MIN([Date]) Start, MAX([Date]) [End] FROM (
    SELECT *, 
        ROW_NUMBER() OVER(PARTITION BY Name  ORDER BY [Date]) 
        - ROW_NUMBER() OVER(PARTITION BY Name, Value ORDER BY [Date]) AS GRP
    FROM sample_data
) T 
GROUP BY Name, Value, GRP
ORDER BY Name, Start
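To see why the difference of the two row numbers identifies each run, it can help to look at the intermediate columns. The following is only a sketch against the same sample_data table; the column aliases rn_all and rn_val are made up for illustration, and the expected values in the comments assume the eight sample rows for Name 'A' (2019-10-24 to 2019-10-31).

```sql
-- Expose the two row numbers and their difference for each snapshot row.
SELECT Name, Value, [Date],
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [Date])        AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY Name, Value ORDER BY [Date]) AS rn_val,
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [Date])
       - ROW_NUMBER() OVER (PARTITION BY Name, Value ORDER BY [Date]) AS GRP
FROM sample_data
ORDER BY Name, [Date];
-- Value 0, 2019-10-24..26: rn_all 1,2,3  rn_val 1,2,3  GRP 0
-- Value 1, 2019-10-27..29: rn_all 4,5,6  rn_val 1,2,3  GRP 3
-- Value 0, 2019-10-30..31: rn_all 7,8    rn_val 4,5    GRP 3
```

Note that GRP on its own is not unique (the second and third runs both get 3), which is why the outer query groups by Name, Value and GRP together.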

Answer 1: (score: 1)

SQL Fiddle

MS SQL Server 2017 Schema Setup

create table sample_data(Name varchar(max), Value int , Date date)
insert into sample_data(Name,Value,Date)values('A',0,'2019-10-24')
insert into sample_data(Name,Value,Date)values('A',0,'2019-10-25')
insert into sample_data(Name,Value,Date)values('A',0,'2019-10-26')
insert into sample_data(Name,Value,Date)values('A',1,'2019-10-27')
insert into sample_data(Name,Value,Date)values('A',1,'2019-10-28')
insert into sample_data(Name,Value,Date)values('A',1,'2019-10-29')
insert into sample_data(Name,Value,Date)values('A',0,'2019-10-30')
insert into sample_data(Name,Value,Date)values('A',0,'2019-10-31')

Query 1

WITH CTE AS (
  SELECT *, 
        ROW_NUMBER() OVER(PARTITION BY Name  ORDER BY Date ) 
        - ROW_NUMBER() OVER(PARTITION BY Name,Value ORDER BY Date  ) AS Interval
    FROM sample_data
  )


SELECT Name, Value, MIN(Date) Starting_Date, MAX(Date) Ending_Date FROM CTE    
GROUP BY Name, Value, Interval
Order BY Name,Starting_Date

Results

| Name | Value | Starting_Date | Ending_Date |
|------|-------|---------------|-------------|
|    A |     0 |    2019-10-24 |  2019-10-26 |
|    A |     1 |    2019-10-27 |  2019-10-29 |
|    A |     0 |    2019-10-30 |  2019-10-31 |
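As an alternative to the row-number difference, the same islands can be found by flagging each row where the value changes and taking a running sum of the flags. This is a different technique from the query above, shown only as a sketch; it assumes SQL Server 2012 or later (for LAG and windowed SUM with a frame), and the names flagged, numbered, is_new_run and run_id are made up for illustration.

```sql
WITH flagged AS (
    SELECT Name, Value, [Date],
           -- 1 when the value differs from the previous day's value (or there is no previous row)
           CASE WHEN LAG(Value) OVER (PARTITION BY Name ORDER BY [Date]) = Value
                THEN 0 ELSE 1 END AS is_new_run
    FROM sample_data
),
numbered AS (
    SELECT Name, Value, [Date],
           -- Running count of changes so far: constant within a run
           SUM(is_new_run) OVER (PARTITION BY Name ORDER BY [Date]
                                 ROWS UNBOUNDED PRECEDING) AS run_id
    FROM flagged
)
SELECT Name, Value, MIN([Date]) AS Starting_Date, MAX([Date]) AS Ending_Date
FROM numbered
GROUP BY Name, Value, run_id
ORDER BY Name, Starting_Date;
```

One caveat of this sketch: if Value can be NULL, the comparison with LAG is never true, so every NULL row starts its own run.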

Answer 2: (score: 1)

This is a solution based on the algorithm known as gaps and islands.

;WITH [Islands] AS 
(
    SELECT 'A' AS [Name], 0 AS [Value], CAST('2019-10-24' AS DATE) AS [Date] UNION
    SELECT 'A' AS [Name], 0 AS [Value], CAST('2019-10-25' AS DATE) AS [Date] UNION
    SELECT 'A' AS [Name], 0 AS [Value], CAST('2019-10-26' AS DATE) AS [Date] UNION

    SELECT 'A' AS [Name], 1 AS [Value], CAST('2019-10-27' AS DATE) AS [Date] UNION
    SELECT 'A' AS [Name], 1 AS [Value], CAST('2019-10-28' AS DATE) AS [Date] UNION
    SELECT 'A' AS [Name], 1 AS [Value], CAST('2019-10-29' AS DATE) AS [Date] UNION

    SELECT 'A' AS [Name], 0 AS [Value], CAST('2019-10-30' AS DATE) AS [Date] UNION
    SELECT 'A' AS [Name], 0 AS [Value], CAST('2019-10-31' AS DATE) AS [Date]
)
, [IslandGroups] AS 
(
    SELECT
        *
        ,DATEDIFF(DAY, '1900-01-01', [Date]) AS [DifferenceInDays]
        ,ROW_NUMBER() OVER (ORDER BY [Name], [Value], [Date]) AS [RowNumber]
        ,DATEDIFF(DAY, '1900-01-01', [Date]) - ROW_NUMBER() OVER (ORDER BY [Name], [Value], [Date]) AS [IslandGroup]
    FROM
        [Islands]
)
SELECT
    [Name]
    ,[Value]
    ,MIN([Date]) AS [starting_date]
    ,MAX([Date]) AS [ending_date]
FROM
    [IslandGroups]
GROUP BY
    [Name]
    ,[Value]
    ,[IslandGroup]
ORDER BY
    [Name]
    ,MIN([Date])

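Here is how it works. The algorithm subtracts a ranking function (ROW_NUMBER() in this case) from the number of days between a fixed reference date ('1900-01-01') and each row's date. If you run the query below, you will see that RowNumber increases in step with DifferenceInDays within each run of consecutive dates, so their difference, IslandGroup, stays constant within a run.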

... removed for brevity
, [IslandGroups] AS 
(
    SELECT
        *
        ,DATEDIFF(DAY, '1900-01-01', [Date]) AS [DifferenceInDays]
        ,ROW_NUMBER() OVER (ORDER BY [Name], [Value], [Date]) AS [RowNumber]
        ,DATEDIFF(DAY, '1900-01-01', [Date]) - ROW_NUMBER() OVER (ORDER BY [Name], [Value], [Date]) AS [IslandGroup]
    FROM
        [Islands]
)
SELECT
    *
FROM
    [IslandGroups]

Result (columns: Name, Value, Date, DifferenceInDays, RowNumber, IslandGroup):

A   0   2019-10-24  43760   1   43759 <- First in the series
A   0   2019-10-25  43761   2   43759
A   0   2019-10-26  43762   3   43759
A   0   2019-10-30  43766   4   43762 <- Next set
A   0   2019-10-31  43767   5   43762
A   1   2019-10-27  43763   6   43757 <- Next set
A   1   2019-10-28  43764   7   43757
A   1   2019-10-29  43765   8   43757

You can then GROUP BY the common IslandGroup value (together with Name and Value) and take the MIN() and MAX() of [Date] within each group.
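One property of the date-based variant worth knowing is that it also starts a new island when a day is missing, even if the value did not change. The sketch below illustrates this with hypothetical data (the @snapshots table variable and its three rows are made up, with the 2019-10-25 snapshot deliberately left out); the row number is partitioned by Name and Value here, which is an assumption of this sketch rather than the exact query above.

```sql
-- Hypothetical data: the snapshot for 2019-10-25 is missing.
DECLARE @snapshots TABLE ([Name] varchar(10), [Value] int, [Date] date);
INSERT INTO @snapshots ([Name], [Value], [Date]) VALUES
    ('A', 0, '2019-10-24'),
    ('A', 0, '2019-10-26'),
    ('A', 0, '2019-10-27');

SELECT [Name], [Value],
       MIN([Date]) AS [starting_date],
       MAX([Date]) AS [ending_date]
FROM (
    SELECT *,
           -- Day number minus position within the (Name, Value) series:
           -- constant only while the dates are consecutive.
           DATEDIFF(DAY, '1900-01-01', [Date])
           - ROW_NUMBER() OVER (PARTITION BY [Name], [Value] ORDER BY [Date]) AS [IslandGroup]
    FROM @snapshots
) AS g
GROUP BY [Name], [Value], [IslandGroup]
ORDER BY [Name], [starting_date];
-- Returns two rows: 2019-10-24 to 2019-10-24, and 2019-10-26 to 2019-10-27.
```

Whether a missing day should split a range depends on what the time series is meant to represent; the row-number-difference approach in the other answers would report a single range for the same data.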