Filling date gaps and interpolating values

Time: 2019-07-31 14:16:22

Tags: sql-server

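I have a very simple SQL Server table containing a date (daily granularity), an equipment name, and the cumulative engine hours as of that date. The "Original data" table shows that there are gaps between the dates. I need to fill the gaps and interpolate to provide hours values for the new rows. The "Desired result" table shows what the final product should look like.

My initial idea was to create a dates table (with a recursive function) and use a LEFT JOIN to build the complete table, but at that stage I cannot populate the hours column with interpolated data. Any ideas?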

Original data

+------------+-----------+-------+
| Date       | Equipment | Hours |
+------------+-----------+-------+
| 2019/01/01 | EQ1       | 50    |
| 2019/01/02 | EQ1       | 67    |
| 2019/01/03 | EQ1       | 87    |
| 2019/01/04 | EQ1       | 105   |
| 2019/01/07 | EQ1       | 150   |
| 2019/01/08 | EQ1       | 169   |
| 2019/01/09 | EQ1       | 187   |
| 2019/01/12 | EQ1       | 247   |
| 2019/01/13 | EQ1       | 265   |
+------------+-----------+-------+
| 2019/01/01 | EQ2       | 150   |
| 2019/01/02 | EQ2       | 168   |
| 2019/01/03 | EQ2       | 187   |
| 2019/01/04 | EQ2       | 205   |
| 2019/01/05 | EQ2       | 222   |
| 2019/01/06 | EQ2       | 239   |
| 2019/01/07 | EQ2       | 255   |
| 2019/01/10 | EQ2       | 306   |
| 2019/01/13 | EQ2       | 357   |
+------------+-----------+-------+

Desired result

+------------+-----------+-------+
| Date       | Equipment | Hours |
+------------+-----------+-------+
| 2019/01/01 | EQ1       | 50    |
| 2019/01/02 | EQ1       | 67    |
| 2019/01/03 | EQ1       | 87    |
| 2019/01/04 | EQ1       | 105   |
| 2019/01/05 | EQ1       | 120   |
| 2019/01/06 | EQ1       | 135   |
| 2019/01/07 | EQ1       | 150   |
| 2019/01/08 | EQ1       | 169   |
| 2019/01/09 | EQ1       | 187   |
| 2019/01/10 | EQ1       | 207   |
| 2019/01/11 | EQ1       | 227   |
| 2019/01/12 | EQ1       | 247   |
| 2019/01/13 | EQ1       | 265   |
+------------+-----------+-------+
| 2019/01/01 | EQ2       | 150   |
| 2019/01/02 | EQ2       | 168   |
| 2019/01/03 | EQ2       | 187   |
| 2019/01/04 | EQ2       | 205   |
| 2019/01/05 | EQ2       | 222   |
| 2019/01/06 | EQ2       | 239   |
| 2019/01/07 | EQ2       | 255   |
| 2019/01/08 | EQ2       | 272   |
| 2019/01/09 | EQ2       | 289   |
| 2019/01/10 | EQ2       | 306   |
| 2019/01/11 | EQ2       | 323   |
| 2019/01/12 | EQ2       | 340   |
| 2019/01/13 | EQ2       | 357   |
+------------+-----------+-------+

3 answers:

Answer 0: (score: 1)

You can try this query.

DECLARE @SampleTable TABLE ( [Date] Date, Equipment VARCHAR(10),  Hours INT)
INSERT INTO @SampleTable VALUES
('2019/01/01','EQ1', 50 ),
('2019/01/02','EQ1', 67 ),
('2019/01/03','EQ1', 87 ),
('2019/01/04','EQ1', 105),
('2019/01/07','EQ1', 150),
('2019/01/08','EQ1', 169),
('2019/01/09','EQ1', 187),
('2019/01/12','EQ1', 247),
('2019/01/13','EQ1', 265),

('2019/01/01','EQ2', 150),
('2019/01/02','EQ2', 168),
('2019/01/03','EQ2', 187),
('2019/01/04','EQ2', 205),
('2019/01/05','EQ2', 222),
('2019/01/06','EQ2', 239),
('2019/01/07','EQ2', 255),
('2019/01/10','EQ2', 306),
('2019/01/13','EQ2', 357)


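-- Recursive CTE: one row per equipment per day, from its first to its last recorded date
;WITH CTE AS (
    SELECT MIN([Date]) [Date], Equipment FROM @SampleTable T GROUP BY Equipment 
    UNION ALL
    SELECT DATEADD(DAY,1,CTE.[Date]),  CTE.Equipment FROM CTE 
        WHERE EXISTS( SELECT * FROM @SampleTable T WHERE T.Equipment = CTE.Equipment and DATEADD(DAY,1,CTE.[Date] ) <= T.[Date]  )
)
SELECT  CTE.[Date], CTE.Equipment, 
    -- Previous reading + days elapsed since that reading * average daily change across the gap
    X1.Hours +  
        DATEDIFF(DAY, X1.[Date],CTE.[Date]) * 
        CASE WHEN DATEDIFF(DAY, X1.[Date],X2.[Date]) > 0 
            THEN (X2.Hours - X1.Hours ) / DATEDIFF(DAY, X1.[Date], X2.[Date]) 
            ELSE 0 END [Hours]  -- the date has its own reading, so nothing is added
    FROM CTE
        -- X1: latest reading on or before the current date; X2: earliest reading on or after it
        OUTER APPLY( SELECT TOP 1 * FROM @SampleTable S1 WHERE S1.Equipment = CTE.Equipment and CTE.[Date]  >= S1.[Date] ORDER BY S1.Date DESC) X1
        OUTER APPLY( SELECT TOP 1 * FROM @SampleTable S1 WHERE S1.Equipment = CTE.Equipment and CTE.[Date]  <= S1.[Date] ORDER BY S1.Date ASC ) X2
ORDER BY CTE.Equipment, CTE.[Date]
OPTION (MAXRECURSION 0)  -- the default limit of 100 recursion levels would fail for ranges longer than 100 days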

Result:

Date       Equipment  Hours
---------- ---------- -----------
2019-01-01 EQ1        50
2019-01-02 EQ1        67
2019-01-03 EQ1        87
2019-01-04 EQ1        105
2019-01-05 EQ1        120
2019-01-06 EQ1        135
2019-01-07 EQ1        150
2019-01-08 EQ1        169
2019-01-09 EQ1        187
2019-01-10 EQ1        207
2019-01-11 EQ1        227
2019-01-12 EQ1        247
2019-01-13 EQ1        265

2019-01-01 EQ2        150
2019-01-02 EQ2        168
2019-01-03 EQ2        187
2019-01-04 EQ2        205
2019-01-05 EQ2        222
2019-01-06 EQ2        239
2019-01-07 EQ2        255
2019-01-08 EQ2        272
2019-01-09 EQ2        289
2019-01-10 EQ2        306
2019-01-11 EQ2        323
2019-01-12 EQ2        340
2019-01-13 EQ2        357
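
One caveat worth checking: the interpolation term in the query above divides two INT values, so T-SQL performs integer division. With the sample data every gap divides evenly (for example (150 - 105) / 3 = 15), which is why the results match the desired table. The sketch below uses made-up values that do not divide evenly to show the difference; multiplying by 1.0 is only one possible way to force decimal arithmetic, and whether fractional hours are wanted is an assumption about the requirements.

```sql
-- Hypothetical values, chosen so that the gap does not divide evenly.
DECLARE @PrevHours INT = 100, @NextHours INT = 110, @GapDays INT = 3;

SELECT
    (@NextHours - @PrevHours) / @GapDays       AS IntegerStep,  -- INT / INT truncates: 10 / 3 gives 3
    (@NextHours - @PrevHours) * 1.0 / @GapDays AS DecimalStep;  -- numeric division keeps the fraction
```

If fractional steps matter, the per-day increment can be computed in decimal and the final value rounded once, instead of truncating the step itself.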

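Answer 1: (score: 0)

Below is prototype logic for solving your problem.

This logic assumes you have a dates table (it can be a table variable, a temp table, etc.). You can find code online showing how to create one (one simple approach: How to create a Calendar table for 100 years in Sql).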
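
One way such a dates table could be built is with a recursive CTE. The sketch below is an assumption rather than part of the original answer: the single-column layout and the date range are illustrative only, and the name DatesTable is chosen just to match the query in this answer.

```sql
-- Assumed layout: a single [Date] column. The range below is illustrative only.
DECLARE @StartDate DATE = '2019-01-01', @EndDate DATE = '2019-01-13';

CREATE TABLE DatesTable ( [Date] DATE NOT NULL PRIMARY KEY );

-- Generate one row per day from @StartDate to @EndDate (inclusive).
WITH Dates AS (
    SELECT @StartDate AS [Date]
    UNION ALL
    SELECT DATEADD( DAY, 1, [Date] ) FROM Dates WHERE [Date] < @EndDate
)
INSERT INTO DatesTable ( [Date] )
SELECT [Date] FROM Dates
OPTION ( MAXRECURSION 0 );  -- lift the default limit of 100 recursion levels
```

A permanent calendar table covering many years is usually populated once and reused, so the recursion cost is paid only at creation time.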

-- 3. Final result: returns rows only for the missing days
SELECT DT.[Date], FilteredGaps.Equipment,
    -- Logic: Hours value at the start of the gap + ( number of days between the gap start and the "current" date * average daily hours change )
    FilteredGaps.[Hours] + DATEDIFF( DAY, FilteredGaps.[Date], DT.[Date] ) * FilteredGaps.AvgHoursChange AS [Hours]
FROM
    -- 2. Filter out consecutive days and calculate the average daily hours change
    ( SELECT *,
        -- Calculate avg daily change (if you have duplicate dates for a given Equipment, you may get divide-by-zero errors;
        -- if Hours is an integer column, this is integer division)
        (( NextHours - [Hours] ) / DATEDIFF( DAY, [Date], NextDate )) AS AvgHoursChange
    FROM
        -- 1. Find gaps
        ( SELECT *,
            -- Find next date and next hours value
            LEAD( [Date] ) OVER ( PARTITION BY Equipment ORDER BY [Date] ) AS NextDate,
            LEAD( [Hours] ) OVER ( PARTITION BY Equipment ORDER BY [Date] ) AS NextHours
        FROM EquipmentTable ) AS Gaps
    -- Leave only gaps of more than 1 day
    WHERE DATEADD( DAY, 1, [Date] ) < NextDate ) AS FilteredGaps
        -- Finally join the filtered gaps to the dates table to get only the missing dates (strictly between the gap start and the next recorded date)
        INNER JOIN DatesTable AS DT ON FilteredGaps.[Date] < DT.[Date] AND DT.[Date] < FilteredGaps.NextDate

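The idea is taken from https://www.mssqltips.com/sqlservertutorial/9130/sql-server-window-functions-gaps-and-islands-problem/, and I strongly recommend reading that article to get familiar with the problem and the suggested solutions.

Note: this code is untested.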

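Answer 2: (score: 0)

I would create a view and use a table-valued Crontab function based on https://github.com/atifaziz/NCrontab/wiki/SQL-Server-Crontab to generate the date/time series, which is then LEFT OUTER JOINed to the data.

The computed value needs the previous date (for the current equipment) that has a value, the next date (for the current equipment) that has a value, the date of the current row, and the previous and next values. I would implement this as 2 scalar-valued functions (which may return NULL if there is no previous or next row). One gets the previous/next date (parameters: @currentDate and a BIT @next; when @next is not set it returns the previous one), and one gets the previous/next hours (same parameters). The result could also be a single string combining the date and the hours, which is then parsed; it is best to measure which variant performs better. If the current date has a value, the next-date logic returns that date itself.

Then create a scalar-valued function that takes these values and performs a calculation like this (please verify that I have not made any mistakes):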

myGapInDays = NextDate - PreviousDate
myHourDiff = NextHours - PreviousHours
myIncrementPerDay (FLOAT) = myHourDiff / myGapInDays
myFactor = CurrentDate - PreviousDate
myResult = PreviousHours + Round(myFactor * myIncrementPerDay)
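
The calculation above could be sketched as a T-SQL scalar function. The function name dbo.InterpolateHours and its parameter names are hypothetical, and the previous/next dates and hours are assumed to have already been looked up by the two helper functions described earlier.

```sql
-- Hypothetical function and parameter names; a sketch of the calculation described in this answer.
CREATE FUNCTION dbo.InterpolateHours
(
    @PreviousDate  DATE,
    @PreviousHours INT,
    @NextDate      DATE,
    @NextHours     INT,
    @CurrentDate   DATE
)
RETURNS INT
AS
BEGIN
    DECLARE @GapInDays INT = DATEDIFF( DAY, @PreviousDate, @NextDate );

    -- No gap (the current date already has a value): nothing to interpolate.
    IF @GapInDays = 0
        RETURN @PreviousHours;

    -- Cast to FLOAT so the per-day increment is not truncated by integer division.
    DECLARE @IncrementPerDay FLOAT = CAST( @NextHours - @PreviousHours AS FLOAT ) / @GapInDays;
    DECLARE @Factor INT = DATEDIFF( DAY, @PreviousDate, @CurrentDate );

    -- If any input is NULL (no previous or next row), the expression and the result are NULL.
    RETURN @PreviousHours + CAST( ROUND( @Factor * @IncrementPerDay, 0 ) AS INT );
END;
```

With the sample data, SELECT dbo.InterpolateHours('2019-01-04', 105, '2019-01-07', 150, '2019-01-05') should return 120, which matches the desired result for EQ1 on 2019/01/05.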

I hope that helps.