Extract all dates between two dates in SQL without a calendar table

Time: 2018-08-15 08:46:57

Tags: sql sql-server

I have a table like this:

CREATE table #tableTest
        (
        ID int,
        SumVisits int,
        Domain nvarchar(255),
        LoadDate int
        )

insert into #tableTest (ID,SumVisits ,Domain,LoadDate) values (1,67,'cnn.com',20180617),(2,58,'cnn.com',20180624),(3,52,'cnn.com',20180701)
select * from #tableTest order by LoadDate

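I would like to get a result like this, with one row per day and each SumVisits value carried forward until the next load date: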

  | SumVisits | date
1   67       20180617
2   67       20180618
3   67       20180619
4   67       20180620
5   67       20180621
6   67       20180622
7   67       20180623
8   58       20180624
9   58       20180625
10  58       20180626
11  58       20180627
12  58       20180628
13  58       20180629
14  58       20180630
15  52       20180701
...

My first idea was to use a recursive CTE:

;WITH GeneratedCalendar AS
(
SELECT
        CAST(convert(nvarchar(255),[LoadDate]) as date) as EndDate
       ,lead(cast(convert(nvarchar(255),[LoadDate]) as date) , 1,NULL) OVER(PARTITION BY [domain] order by [LoadDate] desc) as StartDate
      From Table
      UNION ALL
      SELECT

        EndDate
        ,StartDate = DATEADD(DAY, 1, G.StartDate)
      FROM
        GeneratedCalendar AS G
      WHERE
        G.StartDate < EndDate
)
select *  from GeneratedCalendar

But with this SQL code I can't actually generate the structure I need. Do you have any ideas?

2 Answers:

Answer 0: (score: 3)

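I believe this does what you're after. Storing your column LoadDate as an int when it is clearly a date means I had to do a lot of conversions. Store dates as date.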
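
As a minimal sketch of the conversion involved (the temp table #demo and its column LoadDateInt are hypothetical names used only for illustration), an int in yyyymmdd form can be turned into a date by going through varchar(8). TRY_CONVERT, which I believe requires SQL Server 2012 or later, returns NULL for values that are not valid dates instead of raising an error:

```sql
-- Sketch only: #demo and LoadDateInt are hypothetical, not from the question.
CREATE TABLE #demo (LoadDateInt int);
INSERT INTO #demo (LoadDateInt)
VALUES (20180617), (20180624), (20180631); -- the last value is not a real date

-- int -> varchar(8) -> date; style 112 is the ISO yyyymmdd format.
-- TRY_CONVERT yields NULL for 20180631 rather than failing the whole query.
SELECT LoadDateInt,
       TRY_CONVERT(date, CONVERT(varchar(8), LoadDateInt), 112) AS LoadDate
FROM #demo;

DROP TABLE #demo;
```

With a real date column none of this is needed, and invalid values such as 20180631 cannot be stored in the first place.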

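I've used a Tally rather than a recursive CTE. An rCTE works RBAR (row by agonizing row), and RBAR gets much slower as the dataset grows. A Tally is not RBAR, so it scales better. The Tally I've used covers up to 10,000 days (roughly 27 years), which should be enough for your needs (I could have used 1,000 days, but that is under three years and might not be enough).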

USE Sandbox;
GO

CREATE TABLE #tableTest (ID int,
                         SUMV int,
                         Domain nvarchar(255),
                         LoadDate int --Why is this a int????
);

INSERT INTO #tableTest (ID,
                        SUMV,
                        Domain,
                        LoadDate)
VALUES (1, 67, 'cnn.com', 20180617),
       (2, 58, 'cnn.com', 20180624),
       (3, 52, 'cnn.com', 20180701);
SELECT *
FROM #tableTest
ORDER BY LoadDate;

GO

WITH N AS
    (SELECT *
     FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) V (N)),
Tally AS
    (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS I
     FROM N N1
          CROSS JOIN N N2
          CROSS JOIN N N3
          CROSS JOIN N N4),
DateTally AS
    (SELECT CONVERT(int,CONVERT(varchar(8), DATEADD(DAY, T.I, TT.MinDate), 112)) AS DateValue
     FROM Tally T
          CROSS JOIN (SELECT MIN(CONVERT(date, CONVERT(varchar(8), LoadDate))) AS MinDate,
                             MAX(CONVERT(date, CONVERT(varchar(8), LoadDate))) AS MaxDate
                      FROM #tableTest) TT
     WHERE DATEADD(DAY, T.I, TT.MinDate) <= TT.MaxDate)
SELECT TT.ID,
       TT.SUMV,
       DT.DateValue
FROM DateTally DT
     CROSS APPLY (SELECT TOP 1
                         *
                  FROM #tableTest TT
                  WHERE TT.LoadDate <= DT.DateValue
                  ORDER BY TT.LoadDate DESC) TT;

GO
DROP TABLE #tableTest;
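
To show where the 10,000 figure comes from, here is the Tally on its own, as a sketch. The output column names (RowsGenerated and so on) and the start date passed to DATEADD are just for illustration:

```sql
-- Ten rows; the values are irrelevant, only the row count matters.
WITH N AS
    (SELECT N
     FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) V (N)),
-- Four-way cross join: 10 x 10 x 10 x 10 = 10,000 rows, numbered from 0.
Tally AS
    (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS I
     FROM N N1
          CROSS JOIN N N2
          CROSS JOIN N N3
          CROSS JOIN N N4)
SELECT COUNT(*) AS RowsGenerated,  -- 10000
       MIN(I) AS FirstOffset,      -- 0
       MAX(I) AS LastOffset,       -- 9999
       -- Latest date reachable from the sample's first load date.
       DATEADD(DAY, CONVERT(int, MAX(I)), CONVERT(date, '20180617', 112)) AS LastReachableDate
FROM Tally;
```

Adding a fifth CROSS JOIN would give 100,000 rows if a longer range were ever needed. The WHERE clause in DateTally discards offsets beyond MaxDate, so only the dates in range reach the final query.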

Answer 1: (score: 1)

I would generate the dates first and then bring in the values:

with dates as (
      select min(cast(convert(nvarchar(255), [LoadDate]) as date)) as dte,
                max(cast(convert(nvarchar(255), [LoadDate]) as date)) as lastdate
      from #tableTest t
      union all
      select dateadd(day, 1, dte), lastdate
      from dates
      where dte < lastdate
     )

Then bring in the rest of the data. If the numbers are always decreasing, a running minimum is enough:

select d.dte, min(t.sumvisits) over (order by d.dte) as sumvisits
from dates d left join
     #tableTest t
     on d.dte = cast(convert(nvarchar(255), t.[LoadDate]) as date);

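You may not be lucky enough to know whether the numbers are increasing or decreasing. A more general approach is to look up, for each date, the most recent load on or before it: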

select d.dte, t.sumvisits
from dates d outer apply
     (select top (1) t.*
      from #tableTest t
      where cast(convert(nvarchar(255), t.[LoadDate]) as date) <= d.dte
      order by t.loaddate desc
     ) t;
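
For reference, here are the pieces assembled into one runnable statement against the question's sample data. Treat it as a sketch rather than a tuned solution. One thing to keep in mind: a recursive CTE stops with an error after 100 recursion levels by default, so a range longer than about 100 days needs OPTION (MAXRECURSION n), where 0 removes the limit:

```sql
-- Same sample data as in the question.
CREATE TABLE #tableTest (ID int, SumVisits int, Domain nvarchar(255), LoadDate int);
INSERT INTO #tableTest (ID, SumVisits, Domain, LoadDate)
VALUES (1, 67, 'cnn.com', 20180617),
       (2, 58, 'cnn.com', 20180624),
       (3, 52, 'cnn.com', 20180701);

WITH dates AS (
      -- Anchor: first and last load date in the table.
      SELECT MIN(CAST(CONVERT(varchar(8), LoadDate) AS date)) AS dte,
             MAX(CAST(CONVERT(varchar(8), LoadDate) AS date)) AS lastdate
      FROM #tableTest
      UNION ALL
      -- Recursive step: one more day until lastdate is reached.
      SELECT DATEADD(DAY, 1, dte), lastdate
      FROM dates
      WHERE dte < lastdate
     )
SELECT d.dte, t.SumVisits
FROM dates d
     -- For each date, the most recent load on or before that date.
     OUTER APPLY (SELECT TOP (1) t.*
                  FROM #tableTest t
                  WHERE CAST(CONVERT(varchar(8), t.LoadDate) AS date) <= d.dte
                  ORDER BY t.LoadDate DESC) t
ORDER BY d.dte
OPTION (MAXRECURSION 0); -- lift the default limit of 100 recursion levels

DROP TABLE #tableTest;
```

On the sample data this returns 15 rows, from 2018-06-17 to 2018-07-01: seven with 67, seven with 58, and one with 52.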