Answer 0 (score: 9)
In SQL Server 2008, you can use a recursive CTE.
DECLARE @StartDate DATE, @EndDate DATE
SET @StartDate = '20110106'
SET @EndDate = '20110228';
WITH DateTable AS
(
SELECT Event_id, event_title, event_date, occurs_every
FROM tally_table
UNION ALL
SELECT event_ID, event_title, DATEADD(DAY,occurs_every,event_date), occurs_every
FROM DateTable
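-- Only cap the upper bound here; the outer query filters on @StartDate.
-- Using BETWEEN here would stop recursion for events whose next occurrence is still before @StartDate.
WHERE DATEADD(DAY,occurs_every,event_date) <= @EndDate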
)
SELECT Event_id, event_title, event_date
FROM DateTable
WHERE event_date BETWEEN @StartDate AND @EndDate
ORDER BY event_date
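Remember to bound the recursion by the date range so that it cannot loop forever. You can also use the MAXRECURSION hint to cap the number of recursion levels (the default is 100).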
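As a minimal sketch of where the MAXRECURSION hint goes, the query below repeats the recursive CTE against a table variable. The table variable `@tally_table`, its single sample row, and the limit of 1000 are assumptions made for illustration; they are not part of the original question.

```sql
DECLARE @StartDate DATE, @EndDate DATE;
SET @StartDate = '20110106';
SET @EndDate = '20110228';

-- Hypothetical stand-in for the question's tally_table
DECLARE @tally_table TABLE
(
    event_id     INT,
    event_title  NVARCHAR(50),
    event_date   DATE,
    occurs_every INT
);
INSERT INTO @tally_table VALUES (1, N'Weekly meeting', '20110101', 7);

WITH DateTable AS
(
    -- Anchor member: the first occurrence of every event
    SELECT event_id, event_title, event_date, occurs_every
    FROM @tally_table
    UNION ALL
    -- Recursive member: step forward by occurs_every days until the end date
    SELECT event_id, event_title, DATEADD(DAY, occurs_every, event_date), occurs_every
    FROM DateTable
    WHERE DATEADD(DAY, occurs_every, event_date) <= @EndDate
)
SELECT event_id, event_title, event_date
FROM DateTable
WHERE event_date BETWEEN @StartDate AND @EndDate
ORDER BY event_date
OPTION (MAXRECURSION 1000); -- raise the default limit of 100 levels; 0 means no limit
```

The hint belongs on the outermost statement. If the limit is exceeded, the statement fails with an error instead of quietly returning partial results, so the date filter in the recursive member is still what actually ends the recursion.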
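Answer 1 (score: 4)
First, please accept my most sincere apologies for not getting back to this post. I made some comments as a preamble and had every intention of posting a useful answer rather than just "sage advice", and then real life happened and I completely lost track of this post.
Let's start by revisiting the OP's post: build the table he said he was using and populate it with a thousand events, as he described. I'll modernize the data with random start dates in 2015 and 2016, using a high-performance "pseudo-cursor" to provide the "presence of rows" we need rather than the RBAR of a While loop or an rCTE (recursive CTE).
As a sidebar, I'm keeping everything 2005 compatible because a lot of people still use 2005 and there is no performance gain here from using 2008+ techniques.
Here's the code to build the test table. Details are in the comments.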
--====================================================================
-- Presets
--====================================================================
--===== Declare and prepopulate some obviously named variables
DECLARE @StartDate DATETIME
,@EndDate DATETIME
,@Days INT
,@Events INT
,@MaxEventGap INT
;
SELECT @StartDate = '2015-01-01' --Inclusive date
,@EndDate = '2017-01-01' --Exclusive date
,@Days = DATEDIFF(dd,@StartDate,@EndDate)
,@Events = 1000
,@MaxEventGap = 30 --Note that 1 day will be the next day
;
--====================================================================
-- Create the Test Table
--====================================================================
--===== If the test table already exists, drop it to make reruns of
-- this demo easier. I also use a Temp Table so that we don't
-- accidentally screw up a real table.
IF OBJECT_ID('tempdb..#Events','U') IS NOT NULL
DROP TABLE #Events
;
--===== Build the test table.
-- I'm following what the OP did so that anyone with a case
-- sensitive server won't have a problem.
CREATE TABLE #Events
(
event_ID INT,
event_title NVARCHAR(50),
first_event_date DATETIME,
occurs_every INT
)
;
--====================================================================
-- Populate the Test Table
--====================================================================
--===== Build @Events number of events using the previously defined
-- start date and number of days as limits for the random dates.
-- To make life a little easier, I'm using a CTE with a
-- "pseudo-cursor" to form most of the data and then an
-- external INSERT so that I can name the event after the
-- event_ID.
WITH cteGenData AS
(
SELECT TOP (@Events)
event_ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,first_event_date = DATEADD(dd, ABS(CHECKSUM(NEWID())) % @Days, @StartDate)
,occurs_every = ABS(CHECKSUM(NEWID())) % @MaxEventGap + 1
FROM sys.all_columns ac1 --Has at least 4000 rows in it for most editions
CROSS JOIN sys.all_columns ac2 --Just in case it doesn't for Express ;-)
)
INSERT INTO #Events
(event_ID, event_title, first_event_date, occurs_every)
SELECT event_ID
,event_title = 'Event #' + CAST(event_ID AS VARCHAR(10))
,first_event_date
,occurs_every
FROM cteGenData
;
--===== Let's see the first 10 rows
SELECT TOP 10 *
FROM #Events
ORDER BY event_ID
;
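Here's what the first 10 rows look like, with the understanding that your values for first_event_date and occurs_every will be quite different because of the method I used to generate the constrained random data.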
event_ID event_title first_event_date occurs_every
-------- ----------- ----------------------- ------------
1 Event #1 2016-10-12 00:00:00.000 10
2 Event #2 2015-04-25 00:00:00.000 28
3 Event #3 2015-11-08 00:00:00.000 4
4 Event #4 2016-02-16 00:00:00.000 25
5 Event #5 2016-06-11 00:00:00.000 15
6 Event #6 2016-04-29 00:00:00.000 14
7 Event #7 2016-04-16 00:00:00.000 9
8 Event #8 2015-03-29 00:00:00.000 2
9 Event #9 2016-02-14 00:00:00.000 29
10 Event #10 2016-01-23 00:00:00.000 8
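To be sure, you'll need a Tally Table to duplicate the OP's experiment. Here's the code for that. If you already have one, make sure it has the required unique clustered index (usually in the form of a PK) for performance reasons. I've modernized the row-source tables in the "pseudo-cursor" part of the code so that it no longer uses the deprecated "syscolumns" view.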
--===== Create a Tally Table with enough sequential numbers
-- for more than 30 years worth of dates.
SELECT TOP 11000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM sys.all_columns sc1
CROSS JOIN sys.all_columns sc2
;
--===== Add the quintessential Unique Clustered Index as the PK.
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
;
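We're ready to rock. Part of the OP's code was swallowed by the forum, but I was able to recover it by editing the original post. It actually looked like this, except that I changed the "end date" to match the data I just generated (that's the only change I made). Since the code contains no scalar or multi-statement UDFs, I also turned on statistics to try to explain what's happening.
Here's the OP's code with the change mentioned.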
SET STATISTICS TIME,IO ON
;
SELECT event_id,
event_title,
first_event_date,
DATEADD(dd, occurs_every * ( t.N - 1 ), [first_event_date]) AS Occurrence
FROM #Events
CROSS JOIN dbo.Tally t
WHERE DATEADD(dd, occurs_every * ( t.N - 1 ), [first_event_date]) <= '2017-03-01'
ORDER BY Occurrence
;
SET STATISTICS TIME,IO OFF
;
Here are the statistics from running the OP's code. Sorry about all the scrolling, but the lines are long.
(61766 row(s) affected)
Table 'Worktable'. Scan count 4, logical reads 118440, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tally'. Scan count 4, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Events_____________________________________________________________________________________________________________00000000001F'. Scan count 5, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4196 ms, elapsed time = 1751 ms.
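Obviously, that performance is making sucking sounds that even a While loop or an rCTE could beat. So what's the problem?
If you look at the highlighted arrow in the execution plan, you'll find that it carries 11 million actual rows. The non-SARGable criteria (SARG = "Search ARGument"; non-SARGable means the predicate can't use an index properly) cause a full CROSS JOIN between the 11,000-row Tally Table and the 1,000-row #Events table. Those are ACTUAL rows, not ESTIMATED rows, folks.
The reason is that the "N" column of the Tally Table is used inside a formula, so the entire Tally Table has to be scanned for every row in the #Events table. This is a common mistake, and it leads people to think that Tally Tables produce slow code.
So, how do we fix it? Rather than using t.N to calculate a date for every row, let's take the difference between the dates and divide it by the number of days between occurrences to work out the number of occurrences that t.N must not exceed, and see what happens. Note that the only thing I changed in the code below is the criteria in the WHERE clause, so that the lookup on t.N is SARGable (able to use the index to start and stop a seek followed by a range scan).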
SET STATISTICS TIME,IO ON
;
SELECT event_id,
event_title,
first_event_date,
DATEADD(dd, occurs_every * ( t.N - 1 ), [first_event_date]) AS Occurrence
FROM #Events
CROSS JOIN dbo.Tally t
WHERE t.N <= DATEDIFF(dd,first_event_date,'2017-03-01') / occurs_every + 1
ORDER BY Occurrence
;
SET STATISTICS TIME,IO OFF
;
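To see why the rewritten WHERE clause keeps the same rows, note that occurrence N falls on first_event_date plus occurs_every * (N - 1) days. That date is on or before the end date exactly when N - 1 does not exceed the integer quotient of the day difference and occurs_every, which is the same as N <= DATEDIFF / occurs_every + 1. The sketch below checks the arithmetic for Event #1 from the sample rows shown earlier (first_event_date 2016-10-12, occurs_every 10). The variable names are mine, and the reasoning assumes the first event date is not after the end date.

```sql
DECLARE @first_event_date DATETIME
       ,@EndDate          DATETIME
       ,@occurs_every     INT
;
 SELECT @first_event_date = '2016-10-12'
       ,@EndDate          = '2017-03-01'
       ,@occurs_every     = 10
;
--===== Days available, the largest usable N, and the date that N produces.
 SELECT DaysToEnd      = DATEDIFF(dd, @first_event_date, @EndDate)                     -- 140
       ,MaxN           = DATEDIFF(dd, @first_event_date, @EndDate) / @occurs_every + 1 -- 15
       ,LastOccurrence = DATEADD(dd
                                ,@occurs_every * (DATEDIFF(dd, @first_event_date, @EndDate) / @occurs_every)
                                ,@first_event_date)                                    -- 2017-03-01
;
```

Because the right-hand side of the comparison no longer involves t.N, the optimizer can evaluate it once per event and seek into the clustered index on N instead of computing a date for every one of the 11,000 Tally rows.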
In the new execution plan, 61,766 actual rows (all in cache) is a far cry from 11 million rows. Here are the new statistics.
(61766 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Events_____________________________________________________________________________________________________________00000000001F'. Scan count 5, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tally'. Scan count 1000, logical reads 3011, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 528 ms.
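Total amount of code changed: one line, in the WHERE clause.
We could get the total number of reads down to 7 by using Itzik Ben-Gan's inline cascading CTEs (not rCTEs).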
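For readers who have not seen the cascading CTE technique, here is a rough sketch of how it might replace the physical Tally Table in this query. It is my own illustration rather than code from the original post: the table variable `@Events` is a two-row stand-in for #Events (using Event #1 and Event #9 from the sample rows), and the CTE names E1, E2, and E4 are just a common naming habit. I have not measured it, so treat the read count quoted above as the author's figure, not something this sketch proves.

```sql
--===== Hypothetical two-row stand-in for the #Events test table
DECLARE @Events TABLE
        (
         event_ID         INT,
         event_title      NVARCHAR(50),
         first_event_date DATETIME,
         occurs_every     INT
        )
;
 INSERT INTO @Events VALUES (1, N'Event #1', '2016-10-12', 10);
 INSERT INTO @Events VALUES (9, N'Event #9', '2016-02-14', 29);

--===== Cascading (non-recursive) CTEs: each level cross joins the one before
   WITH
 E1(N) AS (
           SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
           SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
          )                                           --10 rows
,E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b)       --10 * 10 = 100 rows
,E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b)       --100 * 100 = 10,000 rows
 SELECT e.event_ID,
        e.event_title,
        e.first_event_date,
        DATEADD(dd, e.occurs_every * ( t.N - 1 ), e.first_event_date) AS Occurrence
   FROM @Events e
  CROSS APPLY (--==== Number only as many rows as this event needs.
               -- The CASE guards against a negative TOP value.
               SELECT TOP (CASE WHEN e.first_event_date <= '2017-03-01'
                                THEN DATEDIFF(dd,e.first_event_date,'2017-03-01') / e.occurs_every + 1
                                ELSE 0
                           END)
                      N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
                 FROM E4
              ) t
  ORDER BY Occurrence
;
```

The sequence is generated in memory from constants, so the number source itself reads no table at all. Ten thousand rows is enough here because the largest count any event can need in this data set is under 800 (a one-day interval starting 2015-01-01).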
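The bottom line is that, while a Tally Table is nearly a panacea for performance, you have to use it correctly, just like anything else. You have to follow "best practices", such as writing SARGable WHERE clauses, so that it can use its index properly.
Again, my most sincere apologies, especially to the OP, for being so late with this. I hope it helps someone in the future. I also apologize for not having the time to rewrite the rCTE example on this thread to show how bad it can be. If you're interested in why rCTEs are so bad and you don't mind SQLServerCentral.com membership, there is an article on the subject. I'd post it all here, but it's too long to do so.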
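Answer 2 (score: 1)
Here's one way to do it in Oracle (you can switch to another engine by modifying the subquery that generates the consecutive numbers, see below). The idea behind the query is to generate a list of consecutive multipliers (e.g. 0, 1, 2, 3, ..., n) up to the window size (the number of days between the dates). That is what the subquery returns. We cross join it with the events table and then limit the results to the requested date range.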
SELECT t.event_id, t.event_title, t.event_date + t.occurs_every*x.r event_date
FROM tally_table t CROSS JOIN (
SELECT rownum-1 r FROM DUAL
connect by level <= (date '2011-1-20' - date '2011-1-6') + 1
) x
WHERE t.event_date + t.occurs_every*x.r <= date '2011-1-20'
ORDER BY t.event_date + t.occurs_every*x.r, t.event_id;
The tally_table in the query is the table you specified in the question.
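As an illustration of switching engines, here is a sketch of how the same idea might look in T-SQL, where the CONNECT BY subquery is swapped for ROW_NUMBER() over a system view. The table variable `@tally_table` and its two rows are hypothetical stand-ins for the table in the question. Like the Oracle query, it filters only on the end date, so it assumes no event starts before the start of the window.

```sql
DECLARE @StartDate DATETIME, @EndDate DATETIME;
SELECT @StartDate = '20110106', @EndDate = '20110120';

-- Hypothetical stand-in for the question's tally_table
DECLARE @tally_table TABLE
(
    event_id     INT,
    event_title  NVARCHAR(50),
    event_date   DATETIME,
    occurs_every INT
);
INSERT INTO @tally_table VALUES (1, N'Event A', '20110106', 7);
INSERT INTO @tally_table VALUES (2, N'Event B', '20110110', 3);

SELECT t.event_id,
       t.event_title,
       DATEADD(dd, t.occurs_every * x.r, t.event_date) AS event_date
FROM @tally_table t
CROSS JOIN (
    -- Multipliers 0, 1, 2, ... up to the number of days in the window
    SELECT TOP (DATEDIFF(dd, @StartDate, @EndDate) + 1)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS INT) AS r
    FROM sys.all_columns
) x
WHERE DATEADD(dd, t.occurs_every * x.r, t.event_date) <= @EndDate
ORDER BY DATEADD(dd, t.occurs_every * x.r, t.event_date), t.event_id;
```

This generates one multiplier per day in the window for every event and then discards the ones that overshoot, so it does more work than strictly necessary when occurs_every is large.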