Question

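If a month of data is missing for a location, what I want to do is repeat the previous row, while adding a column that holds the date of the missing month. For example, if September is missing, the new column would contain 9/1/2018.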

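So, some background: I have a large amount of collection information for many different locations, and most of it is collected monthly. Sometimes, however, a collection is missed for a month, and in that case we want to copy the previous month's data into the missing month. I figured I could find the missed months by creating a Month Diff column. Then I would just need to insert one row per month of difference. So if the difference is 3 months, I would insert three new rows, with an extra column giving each of them the date of its month.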

Here is my code so far, but I'm stuck on adding the rows and not sure whether it is even possible.

Select  
  Location_ID, 
  Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101)) as Collect_Date, 
  Calc_Gross_Totals, 
  Loc_Country, 
  CONVERT(varchar(8),Collect_Month_Key)+'-'+Location_ID as [Unique Key],
  MONTH(Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101))) as MONTH, 
  YEAR(Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101))) as 'YEAR',
  ROW_NUMBER() OVER(PARTITION BY Location_ID+'-'+left(Collect_Month_Key,4) ORDER BY Collect_Month_Key ASC)  as 'INDEX',
Cast(
    Case 
        when MONTH(Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101))) > ROW_NUMBER() OVER(PARTITION BY Location_ID+'-'+left(Collect_Month_Key,4) ORDER BY Collect_Month_Key ASC) 
        Then MONTH(Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101))) - ROW_NUMBER() OVER(PARTITION BY Location_ID+'-'+left(Collect_Month_Key,4) ORDER BY Collect_Month_Key ASC) 
        Else 0 
    End as bit) as 'Month Diff'
From FT_GPM_NPM_CYCLES AS cyc
INNER JOIN LU_Location AS loc ON cyc.Lu_Loc_Key = loc.LU_Loc_Key
INNER JOIN LU_Loc_Country AS cty ON loc.LU_Loc_Country_Key = cty.LU_Loc_Country_Key
Where 
  Collect_Month_Key <> -1 and 
  Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101)) >= '2016-1-1'
Order By 
  Location_ID, 
  Collect_Date;

My output is attached as an image to show what it looks like.

Picture of my query output

Answer 1

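First, a tip for getting a complete list of months: you can use a recursive CTE for this, see the top of the example. 2018-01-01 is the first month, and 2020-01-01 is the first month you do NOT want to see in the result.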

The second CTE is dummy "actual report data", containing only the months that exist. Skip it for now.

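Once you have the complete list of months, use a datediff condition like the one in the example below to join this dimension to the data table. At this stage you only need the report date, not the data columns.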

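Now you can use a window function to determine which month's data should fill each month that the outer join left empty (see the MAX ... OVER ... clause below). Note that this relies on the window function's default frame: with ORDER BY and no explicit frame, MAX is taken over all rows up to and including the current one.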
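As a minimal sketch of that default framing (the table variable @m and its values are made up for illustration), the NULL left by a missing report is replaced by the most recent earlier report date:

```sql
-- Hypothetical sample: three months, the middle one has no report.
DECLARE @m TABLE (monthStart date, reportDate date);
INSERT INTO @m VALUES
    ('2018-01-01', '2018-01-04'),
    ('2018-02-01', NULL),
    ('2018-03-01', '2018-03-04');

SELECT monthStart,
       reportDate,
       -- no explicit frame, so the implicit one applies:
       -- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       MAX(reportDate) OVER (ORDER BY monthStart) AS source_date
FROM @m
ORDER BY monthStart;
```

Under these assumptions the February row should pick up 2018-01-04 as its source_date. This only works because report dates increase with the month, so the running MAX is always the latest report seen so far.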

Next, you just need to join back to the original data, this time including the data columns.

Example:

WITH all_months(monthStart) AS (
    -- recursive CTE: one row per month from 2018-01-01 up to, but excluding, 2020-01-01
    SELECT CAST('2018-01-01' AS date)
    UNION ALL
    SELECT DATEADD(month, 1, monthStart)
    FROM all_months
    WHERE DATEADD(month, 1, monthStart) < '2020-01-01'
)
, cte_data AS (
    -- dummy "actual report data": only every third month exists
    SELECT /* LocationID, */ DATEADD(day, 3, monthStart) AS reportDate,
           'zum-zum_data_' + CAST(monthStart AS varchar) AS actual_data
    FROM all_months
    WHERE DATEDIFF(month, SYSDATETIME(), monthStart) % 3 = 0
)
, cte_data_join AS (
    -- every month, with reportDate left NULL where no report exists
    SELECT /* LocationID, */ monthStart, reportDate
    FROM all_months
    LEFT JOIN cte_data ON (DATEDIFF(month, cte_data.reportDate, all_months.monthStart) = 0)
)
, cte_month_source AS (
    -- carry the latest report date forward into the empty months
    SELECT *, MAX(reportDate)
                OVER (/* PARTITION BY LocationID */ ORDER BY monthStart) AS source_date
    FROM cte_data_join
)
SELECT /* LocationID, */ cte_month_source.monthStart AS reporting_month,
       source_date AS report_data_date,
       actual_data
FROM cte_month_source
JOIN cte_data ON (cte_month_source.source_date = cte_data.reportDate)
ORDER BY monthStart

Answer 2

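I simplified your example to focus on the collection date and the audit year/month. The collection date details are what gets repeated when an audit year/month does not exist.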

begin
    -- simplified table
    create table    #collect    (
                                    Coll_Dt     date
                                    ,Val        int
                                    ,Aud_Yr     int
                                    ,Aud_Mth    int
                                )

    -- adding data
    insert into #collect
    values       ('2018-01-01',1,2018,1)
                ,('2018-02-01',2,2018,2)
                ,('2018-03-01',3,2018,3)
                ,('2018-05-01',5,2018,5)
                ,('2018-06-01',6,2018,6)
                ,('2018-07-01',7,2018,7)
                ,('2018-08-01',8,2018,8)
                ,('2018-12-01',12,2018,12)


    -- adding row number to determine where listing starts
    select      row_number() over (order by aud_yr,aud_mth) pid
                ,*
    into        #wrk
    from        #collect

end


declare @i      int = 1
        ,@i2    int
        ,@max   int = (select max(pid) from #wrk)
        ,@diff  int
        ,@rows  int

while @i <= @max
begin

    -- if @i = 1 then it is the first record and there's nothing to compare to
    if @i > 1
    begin
        -- determining the difference between current and prior collections
        select      @diff = datediff(month,b.coll_dt,a.coll_dt)
        from        #wrk    a
        outer apply (
                        select      top 1
                                    *
                        from        #wrk    b
                        where       b.Coll_Dt < a.Coll_Dt
                        order by    b.Coll_Dt desc
                    )       b
        where       a.pid = @i

        if @diff > 1
        begin
            -- number of rows to be added
            set @rows = @diff - 1

            -- resetting incrementor
            set @i2 = 1

            -- adding new rows
            while @i2 <= @rows
            begin
                insert into #collect
                select      Coll_Dt
                            ,Val
                            ,year(dateadd(month,@i2 * -1,coll_dt))
                            ,month(dateadd(month,@i2 * -1,coll_dt))
                from        #wrk
                where       pid = @i


                -- incrementing to exit loop and add additional rows, if more than 1 row is needed
                set @i2 = @i2 + 1
            end

        end

    end

    -- incrementing loop
    set @i = @i +1
end

select * from #collect order by aud_yr, aud_Mth


-- cleaning db
drop table  #collect
            ,#wrk
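
For comparison, the same gap fill can be sketched without loops. This is a set-based alternative, not the approach used in the answer above. It assumes SQL Server 2012 or later (for LEAD) and gaps of at most 12 months, and it carries the earlier row forward into the missing months, as the question describes. The table variable @collect, the Report_Month column, and the inline number list are illustrative names, not part of the original schema.

```sql
-- Hypothetical table variable mirroring the simplified #collect table.
DECLARE @collect TABLE (Coll_Dt date, Val int);
INSERT INTO @collect VALUES
    ('2018-01-01', 1), ('2018-02-01', 2), ('2018-03-01', 3),
    ('2018-05-01', 5), ('2018-08-01', 8), ('2018-12-01', 12);

WITH gaps AS (
    SELECT Coll_Dt,
           Val,
           -- months until the next collection; NULL for the last row
           -- (add PARTITION BY a location column when there are many locations)
           DATEDIFF(month, Coll_Dt, LEAD(Coll_Dt) OVER (ORDER BY Coll_Dt)) AS months_to_next
    FROM @collect
)
SELECT g.Coll_Dt,
       g.Val,
       -- n = 0 is the original row, n >= 1 are the filled-in months
       DATEADD(month, nums.n, g.Coll_Dt) AS Report_Month
FROM gaps AS g
JOIN (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) AS nums(n)
    ON nums.n < ISNULL(g.months_to_next, 1)
ORDER BY Report_Month;
```

Each source row is repeated once for every month up to the next collection, so the March row also supplies April, and the last row appears only once.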

Insert rows into a table that duplicate the previous row (month)
