Calculating an average from the previous row's value and the next row's value

Time: 2019-09-15 11:10:29

Tags: sql sql-server

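I have calculated an average for each month. Some months are NULL, and my manager wants me to fill each NULL month using the value from the previous row and the value from the next month.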

Current result (see the image below):

Expected result:

DECLARE @DATE DATE = '2017-01-01';
-- Named MONTHS, not DATEDIM: a CTE that selects from its own name is treated
-- as recursive and fails without a UNION ALL anchor.
WITH MONTHS AS
(
  SELECT DISTINCT DTM.FirstDayOfMonth 
  FROM DATEDIM DTM 
  WHERE Date >= '01/01/2017'
  AND Date <= DATEADD(mm,-1,Getdate())
), 
Tab1 AS
(
  SELECT 
    T1.FirstDayOfMonth AS MONTH_START,
    AVG1, 
    ROW_NUMBER() OVER (
      ORDER BY DATEADD(MM,DATEDIFF(MM, 0, T1.FirstDayOfMonth),0) DESC
    ) AS RNK 
  FROM MONTHS T1 
  LEFT OUTER JOIN (
    SELECT 
      DATEADD(MM,DATEDIFF(MM, 0, StartDate),0) MONTH_START, 
      AVG(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT)) AS AVG1
    FROM DATATable
    WHERE EndDate >= StartDate
    AND StartDate >= @DATE
    AND EndDate >= @DATE
    GROUP BY DATEADD(MM,DATEDIFF(MM, 0, StartDate),0)
  ) T2 ON T1.FirstDayOfMonth = T2.MONTH_START
)
SELECT * 
FROM Tab1

4 Answers:

Answer 0 (score: 1)

Using your CTE, fill each NULL with the nearest non-NULL value from a row with a smaller RNK:

select MONTH_START,
    case when AVG1 is null then
       (select top(1) t2.AVG1 
        from Tab1 t2 
        where t1.RNK > t2.RNK and t2.AVG1 is not null
        order by t2.RNK desc)
    else AVG1 end AVG1,
    RNK 
from Tab1 t1

Edit

A version that averages the nearest preceding and the nearest following non-NULL values. Both must exist; otherwise NULL is returned.

select MONTH_START,
    case when AVG1 is null then
     ( (select top(1) t2.AVG1 
        from Tab1 t2 
        where t1.RNK > t2.RNK and t2.AVG1 is not null
        order by t2.RNK desc)
       +(select top(1) t2.AVG1 
        from Tab1 t2 
        where t1.RNK < t2.RNK and t2.AVG1 is not null
        order by t2.RNK)
      ) / 2
    else AVG1 end AVG1,
    RNK 
from Tab1 t1
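
For trying the idea out without the original tables, here is a minimal self-contained sketch. The table variable @Tab1 and its four rows are hypothetical stand-ins for the Tab1 CTE, and the two lookups are written with OUTER APPLY instead of scalar subqueries, which is a stylistic choice and not part of the answer above. The aliases near_lo and near_hi are made up for illustration.

```sql
-- Hypothetical sample data standing in for the Tab1 CTE (RNK 1 = latest month).
DECLARE @Tab1 TABLE (MONTH_START date, AVG1 float, RNK int);
INSERT INTO @Tab1 VALUES
  ('2017-04-01', 12.0, 1),
  ('2017-03-01', NULL, 2),
  ('2017-02-01', 8.0,  3),
  ('2017-01-01', NULL, 4);

SELECT t1.MONTH_START,
       -- Keep the real value; otherwise average the two nearest non-NULL neighbours.
       COALESCE(t1.AVG1, (near_lo.AVG1 + near_hi.AVG1) / 2) AS AVG1,
       t1.RNK
FROM @Tab1 t1
OUTER APPLY (SELECT TOP (1) t2.AVG1      -- nearest non-NULL row with a smaller RNK
             FROM @Tab1 t2
             WHERE t2.RNK < t1.RNK AND t2.AVG1 IS NOT NULL
             ORDER BY t2.RNK DESC) near_lo
OUTER APPLY (SELECT TOP (1) t2.AVG1      -- nearest non-NULL row with a larger RNK
             FROM @Tab1 t2
             WHERE t2.RNK > t1.RNK AND t2.AVG1 IS NOT NULL
             ORDER BY t2.RNK) near_hi
ORDER BY t1.RNK;
```

With this data, March has non-NULL neighbours on both sides (12 and 8) and is filled with 10, while January has no non-NULL row with a larger RNK and stays NULL.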

Answer 1 (score: 0)

You can use window functions to access the values of the previous and next rows:

SELECT MAX(row_value) OVER(
  ORDER BY ... ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS Previous_Value,
MAX(row_value) OVER(
  ORDER BY ... ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS Next_Value

Alternatively, you can use the LAG/LEAD functions and modify the subquery that computes the AVG:

SELECT 
  src.MONTH_START, 
  CASE 
    WHEN src.prev_val IS NULL OR src.next_val IS NULL 
      THEN COALESCE(src.prev_val, src.next_val) -- Return non-NULL value (if exists)
    ELSE (src.prev_val + src.next_val ) / 2
  END AS AVG_new
FROM (
  SELECT 
    DATEADD(MM,DATEDIFF(MM, 0, StartDate),0) MONTH_START, 
    -- Window functions are evaluated after GROUP BY, so they wrap the aggregate.
    -- LAG reads the previous row, LEAD the next one.
    LAG(AVG(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT))) OVER(ORDER BY ...) AS prev_val,
    LEAD(AVG(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT))) OVER(ORDER BY ...) AS next_val
  FROM DATATable
  WHERE EndDate >= StartDate
  AND StartDate >= @DATE
  AND EndDate >= @DATE
  GROUP BY DATEADD(MM,DATEDIFF(MM, 0, StartDate),0)
) AS src

I haven't tested this, but give it a try and see how it works. You will need to put at least one column in the ORDER BY part of the window functions.
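
As a rough, self-contained illustration of the LAG/LEAD approach, the sketch below runs against a hypothetical table variable @Monthly that already holds one row per month (NULL marks a month with no data). It also assumes that months with a real value should be left untouched, which the query above does not do.

```sql
-- Hypothetical monthly averages; NULL marks a month with no data.
DECLARE @Monthly TABLE (MONTH_START date PRIMARY KEY, AVG1 float);
INSERT INTO @Monthly VALUES
  ('2017-01-01', 8.0),
  ('2017-02-01', NULL),
  ('2017-03-01', 12.0);

SELECT src.MONTH_START,
       CASE
         WHEN src.AVG1 IS NOT NULL THEN src.AVG1            -- keep real values
         WHEN src.prev_val IS NULL OR src.next_val IS NULL
           THEN COALESCE(src.prev_val, src.next_val)        -- only one neighbour exists
         ELSE (src.prev_val + src.next_val) / 2             -- average both neighbours
       END AS AVG_new
FROM (
  SELECT MONTH_START,
         AVG1,
         LAG(AVG1)  OVER (ORDER BY MONTH_START) AS prev_val, -- previous month
         LEAD(AVG1) OVER (ORDER BY MONTH_START) AS next_val  -- next month
  FROM @Monthly
) AS src
ORDER BY src.MONTH_START;
```

Here February is filled with 10, the average of 8 and 12. LAG and LEAD only look one row away, so a run of two or more consecutive NULL months would still leave gaps.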

Answer 2 (score: 0)

I'm not entirely sure which average you want to calculate, but it is quite simple with window functions:

select t.*,
       avg(val) over (order by month_start rows between 1 preceding and 1 following)
from t;

In your case, I think this means:

select datefromparts(year(startdate), month(startdate), 1) as month_start,
       avg(val) as monthaverage,
       avg(avg(val)) over (order by min(startdate) rows between 1 preceding and 1 following)
from datatable d
where . . .
group by datefromparts(year(startdate), month(startdate), 1)
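
To see how the three-row frame behaves around a NULL, here is a small sketch on a hypothetical table variable @Monthly with made-up values. AVG skips NULLs, so the frame average for a NULL month is the average of its non-NULL neighbours. Wrapping it in COALESCE is an assumption about the desired result: it keeps real monthly values as they are and only fills the gaps.

```sql
-- Hypothetical monthly averages; NULL marks a month with no data.
DECLARE @Monthly TABLE (month_start date PRIMARY KEY, monthaverage float);
INSERT INTO @Monthly VALUES
  ('2017-01-01', 8.0),
  ('2017-02-01', NULL),
  ('2017-03-01', 12.0),
  ('2017-04-01', 20.0);

SELECT month_start,
       monthaverage,
       -- Average over previous, current and next row; NULLs are ignored by AVG.
       AVG(monthaverage) OVER (ORDER BY month_start
                               ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS window_avg,
       -- Use the frame average only where the month itself is NULL.
       COALESCE(monthaverage,
                AVG(monthaverage) OVER (ORDER BY month_start
                                        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)) AS filled
FROM @Monthly
ORDER BY month_start;
```

For February the frame holds 8, NULL and 12, so both window_avg and filled come out as 10. For March, window_avg is 16 (the average of 12 and 20) while filled keeps the real value 12.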

Answer 3 (score: 0)

You can try this query (the sample data reflects only the relevant part; I omitted the date column):

declare @tbl table (rank int, value int);
insert into @tbl values
(1, null),
(2, 20),
(3, 30),
(4, null),
(5, null),
(6, null),
(7, 40),
(8, null),
(9, null),
(10, 36),
(11, 22);

;with cte as (
    select *,
           DENSE_RANK() over (order by case when value is null then rank else value end) drank,
           case when value is null then lag(value) over (order by rank) end lag,
           case when value is null then lead(value) over (order by rank) end lead
    from @tbl
)

select rank, value, case when value is null then  
         -- divide by 2.0: value is an int, so / 2 would truncate odd values
         max(lag) over (partition by grp) / 2.0 +
         max(lead) over (partition by grp) / 2.0
       else value end valueWithAvg
from (
    select *,
           rank - drank grp from cte
) a order by rank
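
The grouping above mixes rank and value in a single DENSE_RANK ordering, which could produce ties if a value ever equals the rank of a NULL row. A possible alternative, sketched here on the same sample data, finds the nearest non-NULL rank on each side with running MAX/MIN window aggregates and then joins back for the values. The names anchors, prev_rank and next_rank are made up for this sketch, and this is a different technique from the query above, not a restatement of it.

```sql
DECLARE @tbl TABLE ([rank] int PRIMARY KEY, value int);
INSERT INTO @tbl VALUES
  (1, NULL), (2, 20), (3, 30), (4, NULL), (5, NULL), (6, NULL),
  (7, 40), (8, NULL), (9, NULL), (10, 36), (11, 22);

WITH anchors AS (
  SELECT [rank], value,
         -- Highest rank at or before this row that has a value.
         MAX(CASE WHEN value IS NOT NULL THEN [rank] END)
           OVER (ORDER BY [rank] ROWS UNBOUNDED PRECEDING) AS prev_rank,
         -- Lowest rank at or after this row that has a value.
         MIN(CASE WHEN value IS NOT NULL THEN [rank] END)
           OVER (ORDER BY [rank]
                 ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS next_rank
  FROM @tbl
)
SELECT a.[rank], a.value,
       -- 2.0 forces decimal division; NULL if either neighbour is missing.
       COALESCE(a.value, (p.value + n.value) / 2.0) AS valueWithAvg
FROM anchors a
LEFT JOIN @tbl p ON p.[rank] = a.prev_rank
LEFT JOIN @tbl n ON n.[rank] = a.next_rank
ORDER BY a.[rank];
```

With this data, ranks 4 to 6 sit between 30 and 40 and are filled with 35, ranks 8 and 9 sit between 40 and 36 and are filled with 38, and rank 1 stays NULL because nothing precedes it.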