我已经看到了许多不同的方法来创建和填充数字表。但是,创建和填充一个的最佳方法是什么?将“最佳”定义为从最重要到最不重要:
如果你不知道数字表是什么,请看这里:Why should I consider using an auxiliary numbers table?
答案 0 :(得分:121)
以下是从网络和此问题的答案中获取的一些代码示例。
对于每个方法,我修改了原始代码,因此每个使用相同的表和列:NumbersTest和Number,10,000行或尽可能接近。另外,我提供了到原产地的链接。
方法1 这是here的非常慢的循环方法
平均13.01秒
跑了3次删除最高,这里是几秒钟的时间:12.42,13.60
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
CREATE TABLE NumbersTest(Number INT IDENTITY(1,1))
SET NOCOUNT ON
WHILE COALESCE(SCOPE_IDENTITY(), 0) < 100000
BEGIN
INSERT dbo.NumbersTest DEFAULT VALUES
END
SET NOCOUNT OFF
-- Add a primary key/clustered index to the numbers table
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE())/1000.0)+' seconds'
SELECT COUNT(*) FROM NumbersTest
方法2 这里是来自here的更快的循环
平均1.1658秒
跑了11次删除最高,这里是秒的时间:1.117,1.140,1.203,1.170,1.173,1.156,1.203,1.153,1.173,1.170
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
CREATE TABLE NumbersTest (Number INT NOT NULL);
DECLARE @i INT;
SELECT @i = 1;
SET NOCOUNT ON
WHILE @i <= 10000
BEGIN
INSERT INTO dbo.NumbersTest(Number) VALUES (@i);
SELECT @i = @i + 1;
END;
SET NOCOUNT OFF
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE())/1000.0)+' seconds'
SELECT COUNT(*) FROM NumbersTest
方法3 以下是基于here的代码的单个INSERT
平均488.6毫秒
跑了11次删除最高,这里是以毫秒为单位的时间:686,673,623,686,343,343,376,360,343,453
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
CREATE TABLE NumbersTest (Number int not null)
;WITH Nums(Number) AS
(SELECT 1 AS Number
UNION ALL
SELECT Number+1 FROM Nums where Number<10000
)
insert into NumbersTest(Number)
select Number from Nums option(maxrecursion 10000)
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
SELECT COUNT(*) FROM NumbersTest
方法4 这是here的“半循环”方法
平均348.3毫秒(由于代码中间的“GO”很难获得良好的计时,任何建议都将受到赞赏)
运行了11次删除最高,这里是以毫秒为单位的时间:356,360,283,346,360,376,326,373,330,373
DROP TABLE NumbersTest
DROP TABLE #RunDate
CREATE TABLE #RunDate (RunDate datetime)
INSERT INTO #RunDate VALUES(GETDATE())
CREATE TABLE NumbersTest (Number int NOT NULL);
INSERT NumbersTest values (1);
GO --required
INSERT NumbersTest SELECT Number + (SELECT COUNT(*) FROM NumbersTest) FROM NumbersTest
GO 14 --will create 16384 total rows
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
SELECT CONVERT(varchar(20),datediff(ms,RunDate,GETDATE()))+' milliseconds' FROM #RunDate
SELECT COUNT(*) FROM NumbersTest
方法5 这里是来自Philip Kelley's answer的单个INSERT
平均92.7毫秒
跑了11次删除最高,这里是以毫秒为单位的时间:80,96,96,93,110,110,80,76,93,93
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
CREATE TABLE NumbersTest (Number int not null)
;WITH
Pass0 as (select 1 as C union all select 1), --2 rows
Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows
Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows
Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows
Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows
--I removed Pass5, since I'm only populating the Numbers table to 10,000
Tally as (select row_number() over(order by C) as Number from Pass4)
INSERT NumbersTest
(Number)
SELECT Number
FROM Tally
WHERE Number <= 10000
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
SELECT COUNT(*) FROM NumbersTest
方法6 这里是来自Mladen Prajdic answer的单个INSERT
平均82.3毫秒
跑了11次删除最高,这里是以毫秒为单位的时间:80,80,93,76,93,63,93,76,93,76
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
CREATE TABLE NumbersTest (Number int not null)
INSERT INTO NumbersTest(Number)
SELECT TOP 10000 row_number() over(order by t1.number) as N
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number);
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
SELECT COUNT(*) FROM NumbersTest
方法7 这里是基于here的代码的单个INSERT
平均56.3毫秒
跑了11次删除最高,这里是以毫秒为单位的时间:63,50,63,46,60,63,63,46,63,46
DROP TABLE NumbersTest
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO NumbersTest
FROM sys.objects s1 --use sys.columns if you don't get enough rows returned to generate all the numbers you need
CROSS JOIN sys.objects s2 --use sys.columns if you don't get enough rows returned to generate all the numbers you need
ALTER TABLE NumbersTest ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
SELECT COUNT(*) FROM NumbersTest
在查看所有这些方法之后,我真的很喜欢方法7,这是最快的,代码也很简单。
答案 1 :(得分:51)
我用这个很快就像地狱一样:
insert into Numbers(N)
select top 1000000 row_number() over(order by t1.number) as N
from master..spt_values t1
cross join master..spt_values t2
答案 2 :(得分:18)
如果您只是在SQL Server Management Studio或sqlcmd中执行此操作,则可以使用批处理分隔符允许您重复批处理的事实:
CREATE TABLE Number (N INT IDENTITY(1,1) PRIMARY KEY NOT NULL);
GO
INSERT INTO Number DEFAULT VALUES;
GO 100000
这会将100000条记录插入Numbers
表格。
速度慢。它与@KM答案中的方法1进行了比较,这是最慢的例子。然而,它就像它获得的代码光一样。您可以通过在插入批处理后添加主键约束来加快速度。
答案 3 :(得分:11)
我从以下模板开始,该模板来源于Itzik Ben-Gan常规的大量印刷品:
;WITH
Pass0 as (select 1 as C union all select 1), --2 rows
Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows
Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows
Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows
Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows
Pass5 as (select 1 as C from Pass4 as A, Pass4 as B),--4,294,967,296 rows
Tally as (select row_number() over(order by C) as Number from Pass5)
select Number from Tally where Number <= 1000000
“WHERE N&lt; = 1000000”子句将输出限制为1到1百万,并且可以轻松调整到所需范围。
由于这是一个WITH子句,因此可以将其设置为INSERT ... SELECT ...就像这样:
-- Sample use: create one million rows
CREATE TABLE dbo.Example (ExampleId int not null)
DECLARE @RowsToCreate int
SET @RowsToCreate = 1000000
-- "Table of numbers" data generator, as per Itzik Ben-Gan (from multiple sources)
;WITH
Pass0 as (select 1 as C union all select 1), --2 rows
Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows
Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows
Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows
Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows
Pass5 as (select 1 as C from Pass4 as A, Pass4 as B),--4,294,967,296 rows
Tally as (select row_number() over(order by C) as Number from Pass5)
INSERT Example (ExampleId)
select Number
from Tally
where Number <= @RowsToCreate
在构建表之后对表进行索引将是索引它的最快方法。
哦,我将它称为“Tally”表。我认为这是一个常用术语,你可以通过谷歌搜索找到大量的技巧和例子。
答案 4 :(得分:4)
对于任何寻找Azure解决方案的人
SET NOCOUNT ON
CREATE TABLE Numbers (n bigint PRIMARY KEY)
GO
DECLARE @numbers table(number int);
WITH numbers(number) as (
SELECT 1 AS number
UNION all
SELECT number+1 FROM numbers WHERE number<10000
)
INSERT INTO @numbers(number)
SELECT number FROM numbers OPTION(maxrecursion 10000)
INSERT INTO Numbers(n) SELECT number FROM @numbers
来自sql azure团队博客 http://azure.microsoft.com/blog/2010/09/16/create-a-numbers-table-in-sql-azure/
答案 5 :(得分:2)
我使用数字表来主要在BIRT中编写报告,而不必动摇创建记录集。
我对日期做同样的事情,其表格从过去的10年到未来的10年(以及更详细报道的一天中的小时)。即使您的“真实”数据表没有数据,也能获得所有日期的值,这是一个巧妙的技巧。
我有一个脚本用于创建这些,类似于(这是来自内存):
drop table numbers; commit;
create table numbers (n integer primary key); commit;
insert into numbers values (0); commit;
insert into numbers select n+1 from numbers; commit;
insert into numbers select n+2 from numbers; commit;
insert into numbers select n+4 from numbers; commit;
insert into numbers select n+8 from numbers; commit;
insert into numbers select n+16 from numbers; commit;
insert into numbers select n+32 from numbers; commit;
insert into numbers select n+64 from numbers; commit;
每行的行数加倍,因此生成真正庞大的表格并不需要花费太多。
我不确定我是否同意你的意见,因为你只需要创建一次,所以快速创建很重要。它的成本在对它的所有访问中摊销,使得时间相当微不足道。
答案 6 :(得分:2)
以下是一些额外的方法:
方法1
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int NOT NULL PRIMARY KEY);
GO
DECLARE @i int = 1;
INSERT INTO dbo.Numbers (Number)
VALUES (1),(2);
WHILE 2*@i < 1048576
BEGIN
INSERT INTO dbo.Numbers (Number)
SELECT Number + 2*@i
FROM dbo.Numbers;
SET @i = @@ROWCOUNT;
END
GO
SELECT COUNT(*) FROM Numbers AS RowCownt --1048576 rows
方法2
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE dbo.Numbers (Number int NOT NULL PRIMARY KEY);
GO
DECLARE @i INT = 0;
INSERT INTO dbo.Numbers (Number)
VALUES (1);
WHILE @i <= 9
BEGIN
INSERT INTO dbo.Numbers (Number)
SELECT N.Number + POWER(4, @i) * D.Digit
FROM dbo.Numbers AS N
CROSS JOIN (VALUES(1),(2),(3)) AS D(Digit)
ORDER BY D.Digit, N.Number
SET @i = @i + 1;
END
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --1048576 rows
方法3
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int identity NOT NULL PRIMARY KEY, T bit NULL);
WITH
T1(T) AS (SELECT T FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS T(T)) --10 rows
,T2(T) AS (SELECT A.T FROM T1 AS A CROSS JOIN T1 AS B CROSS JOIN T1 AS C) --1,000 rows
,T3(T) AS (SELECT A.T FROM T2 AS A CROSS JOIN T2 AS B CROSS JOIN T2 AS C) --1,000,000,000 rows
INSERT INTO dbo.Numbers(T)
SELECT TOP (1048576) NULL
FROM T3;
ALTER TABLE Numbers
DROP COLUMN T;
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --1048576 rows
方法4 ,摘自Alex Kuznetsov的Defensive Database Programming书
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int NOT NULL PRIMARY KEY);
GO
DECLARE @i INT = 1 ;
INSERT INTO dbo.Numbers (Number)
VALUES (1);
WHILE @i < 524289 --1048576
BEGIN;
INSERT INTO dbo.Numbers (Number)
SELECT Number + @i
FROM dbo.Numbers;
SET @i = @i * 2 ;
END
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --1048576 rows
方法5 ,取自Erland Sommarskog撰写的Arrays and Lists in SQL Server 2005 and Beyond文章
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int NOT NULL PRIMARY KEY);
GO
WITH digits (d) AS (
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL
SELECT 0)
INSERT INTO Numbers (Number)
SELECT Number
FROM (SELECT i.d + ii.d * 10 + iii.d * 100 + iv.d * 1000 +
v.d * 10000 + vi.d * 100000 AS Number
FROM digits i
CROSS JOIN digits ii
CROSS JOIN digits iii
CROSS JOIN digits iv
CROSS JOIN digits v
CROSS JOIN digits vi) AS Numbers
WHERE Number > 0
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --999999 rows
<强>要点:强>
在这5种方法中,方法3似乎是最快的。
答案 7 :(得分:2)
这是一个简短而快速的内存中解决方案,我使用SQL Server 2008中引入的Table Valued Constructors来实现:
它将返回1,000,000行,但您可以添加/删除CROSS JOIN,或使用TOP子句来修改它。
;WITH v AS (SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) v(z))
SELECT N FROM (SELECT ROW_NUMBER() OVER (ORDER BY v1.z)-1 N FROM v v1
CROSS JOIN v v2 CROSS JOIN v v3 CROSS JOIN v v4 CROSS JOIN v v5 CROSS JOIN v v6) Nums
请注意,这可以快速计算,或者(甚至更好)存储在永久表中(只需在INTO
段之后添加SELECT N
子句),并在{ {1}}提高效率的领域。
答案 8 :(得分:2)
我知道这个帖子已经老了并且已经回答了,但是有一种方法可以从方法7中挤出一些额外的性能:
而不是这个(基本上是方法7,但有一些易用性抛光):
DECLARE @BIT AS BIT = 0
IF OBJECT_ID('tempdb..#TALLY') IS NOT NULL
DROP TABLE #TALLY
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO #TALLY
FROM (SELECT @BIT [X] UNION ALL SELECT @BIT) [T2]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T4]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T8]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T16]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T32]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T64]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T128]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T256]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T512]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T1024]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T2048]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T4096]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T8192]
CROSS JOIN (SELECT @BIT [X] UNION ALL SELECT @BIT) [T16384]
ALTER TABLE #TALLY ADD PRIMARY KEY(Number)
PRINT CONVERT(varchar(20),datediff(ms,@RunDate,GETDATE()))+' milliseconds'
试试这个:
DECLARE @BIT AS BIT = 0
在我的服务器上,这需要大约10毫秒,而从sys.objects中选择大约需要16-20毫秒。它还有一个额外的好处,就是不依赖于sys.objects中有多少个对象。虽然它非常安全,但它在技术上是一种依赖,而另一种则更快。我认为如果改变,速度提升就会降低到使用BIT:
DECLARE @BIT AS BIGINT = 0
为:
client_id
它在我的服务器上增加了大约8-10毫秒的总时间。也就是说,当你扩展到1,000,000条记录时,BIT和BIGINT不再会对我的查询产生明显的影响,但它仍然会在sys.objects中运行~680ms vs~730ms左右。
答案 9 :(得分:1)
一些建议的方法基于系统对象(例如在'sys.objects'上)。他们假设这些系统对象包含足够的记录来生成我们的数字。
我不会基于任何不属于我的应用程序的东西以及我没有完全控制的东西。例如:这些sys表的内容可能会发生变化,新SQL版本中的表可能不再有效。
作为解决方案,我们可以使用记录创建自己的表。然后我们使用那个而不是这些系统相关的对象(如果我们事先知道范围,那么所有数字的表应该没有问题,否则我们可以去做一个交叉连接)。
基于CTE的解决方案工作正常但它有与嵌套循环相关的限制。
答案 10 :(得分:0)
这是对已接受答案的重新包装-但以一种可以让您自己将它们彼此比较的方式-比较了前3种算法(并且注释说明了为什么排除其他方法),并且您可以对自己的方法进行反击自己的设置,以查看它们各自如何以所需的序列大小执行。
SET NOCOUNT ON;
--
-- Set the count of numbers that you want in your sequence ...
--
DECLARE @NumberOfNumbers int = 10000000;
--
-- Some notes on choosing a useful length for your sequence ...
-- For a sequence of 100 numbers -- winner depends on preference of min/max/avg runtime ... (I prefer PhilKelley algo here - edit the algo so RowSet2 is max RowSet CTE)
-- For a sequence of 1k numbers -- winner depends on preference of min/max/avg runtime ... (Sadly PhilKelley algo is generally lowest ranked in this bucket, but could be tweaked to perform better)
-- For a sequence of 10k numbers -- a clear winner emerges for this bucket
-- For a sequence of 100k numbers -- do not test any looping methods at this size or above ...
-- the previous winner fails, a different method is need to guarantee the full sequence desired
-- For a sequence of 1MM numbers -- the statistics aren't changing much between the algorithms - choose one based on your own goals or tweaks
-- For a sequence of 10MM numbers -- only one of the methods yields the desired sequence, and the numbers are much closer than for smaller sequences
DECLARE @TestIteration int = 0;
DECLARE @MaxIterations int = 10;
DECLARE @MethodName varchar(128);
-- SQL SERVER 2017 Syntax/Support needed
DROP TABLE IF EXISTS #TimingTest
CREATE TABLE #TimingTest (MethodName varchar(128), TestIteration int, StartDate DateTime2, EndDate DateTime2, ElapsedTime decimal(38,0), ItemCount decimal(38,0), MaxNumber decimal(38,0), MinNumber decimal(38,0))
--
-- Conduct the test ...
--
WHILE @TestIteration < @MaxIterations
BEGIN
-- Be sure that the test moves forward
SET @TestIteration += 1;
/* -- This method has been removed, as it is BY FAR, the slowest method
-- This test shows that, looping should be avoided, likely at all costs, if one places a value / premium on speed of execution ...
--
-- METHOD - Fast looping
--
-- Prep for the test
DROP TABLE IF EXISTS [Numbers].[Test];
CREATE TABLE [Numbers].[Test] (Number INT NOT NULL);
-- Method information
SET @MethodName = 'FastLoop';
-- Record the start of the test
INSERT INTO #TimingTest(MethodName, TestIteration, StartDate)
SELECT @MethodName, @TestIteration, GETDATE()
-- Run the algorithm
DECLARE @i INT = 1;
WHILE @i <= @NumberOfNumbers
BEGIN
INSERT INTO [Numbers].[Test](Number) VALUES (@i);
SELECT @i = @i + 1;
END;
ALTER TABLE [Numbers].[Test] ADD CONSTRAINT PK_Numbers_Test_Number PRIMARY KEY CLUSTERED (Number)
-- Record the end of the test
UPDATE tt
SET
EndDate = GETDATE()
FROM #TimingTest tt
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
-- And the stats about the numbers in the sequence
UPDATE tt
SET
ItemCount = results.ItemCount,
MaxNumber = results.MaxNumber,
MinNumber = results.MinNumber
FROM #TimingTest tt
CROSS JOIN (
SELECT COUNT(Number) as ItemCount, MAX(Number) as MaxNumber, MIN(Number) as MinNumber FROM [Numbers].[Test]
) results
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
*/
/* -- This method requires GO statements, which would break the script, also - this answer does not appear to be the fastest *AND* seems to perform "magic"
--
-- METHOD - "Semi-Looping"
--
-- Prep for the test
DROP TABLE IF EXISTS [Numbers].[Test];
CREATE TABLE [Numbers].[Test] (Number INT NOT NULL);
-- Method information
SET @MethodName = 'SemiLoop';
-- Record the start of the test
INSERT INTO #TimingTest(MethodName, TestIteration, StartDate)
SELECT @MethodName, @TestIteration, GETDATE()
-- Run the algorithm
INSERT [Numbers].[Test] values (1);
-- GO --required
INSERT [Numbers].[Test] SELECT Number + (SELECT COUNT(*) FROM [Numbers].[Test]) FROM [Numbers].[Test]
-- GO 14 --will create 16384 total rows
ALTER TABLE [Numbers].[Test] ADD CONSTRAINT PK_Numbers_Test_Number PRIMARY KEY CLUSTERED (Number)
-- Record the end of the test
UPDATE tt
SET
EndDate = GETDATE()
FROM #TimingTest tt
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
-- And the stats about the numbers in the sequence
UPDATE tt
SET
ItemCount = results.ItemCount,
MaxNumber = results.MaxNumber,
MinNumber = results.MinNumber
FROM #TimingTest tt
CROSS JOIN (
SELECT COUNT(Number) as ItemCount, MAX(Number) as MaxNumber, MIN(Number) as MinNumber FROM [Numbers].[Test]
) results
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
*/
--
-- METHOD - Philip Kelley's algo
-- (needs tweaking to match the desired length of sequence in order to optimize its performance, relies more on the coder to properly tweak the algorithm)
--
-- Prep for the test
DROP TABLE IF EXISTS [Numbers].[Test];
CREATE TABLE [Numbers].[Test] (Number INT NOT NULL);
-- Method information
SET @MethodName = 'PhilKelley';
-- Record the start of the test
INSERT INTO #TimingTest(MethodName, TestIteration, StartDate)
SELECT @MethodName, @TestIteration, GETDATE()
-- Run the algorithm
; WITH
RowSet0 as (select 1 as Item union all select 1), -- 2 rows -- We only have to name the column in the first select, the second/union select inherits the column name
RowSet1 as (select 1 as Item from RowSet0 as A, RowSet0 as B), -- 4 rows
RowSet2 as (select 1 as Item from RowSet1 as A, RowSet1 as B), -- 16 rows
RowSet3 as (select 1 as Item from RowSet2 as A, RowSet2 as B), -- 256 rows
RowSet4 as (select 1 as Item from RowSet3 as A, RowSet3 as B), -- 65536 rows (65k)
RowSet5 as (select 1 as Item from RowSet4 as A, RowSet4 as B), -- 4294967296 rows (4BB)
-- Add more RowSetX to get higher and higher numbers of rows
-- Each successive RowSetX results in squaring the previously available number of rows
Tally as (select row_number() over (order by Item) as Number from RowSet5) -- This is what gives us the sequence of integers, always select from the terminal CTE expression
-- Note: testing of this specific use case has shown that making Tally as a sub-query instead of a terminal CTE expression is slower (always) - be sure to follow this pattern closely for max performance
INSERT INTO [Numbers].[Test] (Number)
SELECT o.Number
FROM Tally o
WHERE o.Number <= @NumberOfNumbers
ALTER TABLE [Numbers].[Test] ADD CONSTRAINT PK_Numbers_Test_Number PRIMARY KEY CLUSTERED (Number)
-- Record the end of the test
UPDATE tt
SET
EndDate = GETDATE()
FROM #TimingTest tt
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
-- And the stats about the numbers in the sequence
UPDATE tt
SET
ItemCount = results.ItemCount,
MaxNumber = results.MaxNumber,
MinNumber = results.MinNumber
FROM #TimingTest tt
CROSS JOIN (
SELECT COUNT(Number) as ItemCount, MAX(Number) as MaxNumber, MIN(Number) as MinNumber FROM [Numbers].[Test]
) results
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
--
-- METHOD - Mladen Prajdic answer
--
-- Prep for the test
DROP TABLE IF EXISTS [Numbers].[Test];
CREATE TABLE [Numbers].[Test] (Number INT NOT NULL);
-- Method information
SET @MethodName = 'MladenPrajdic';
-- Record the start of the test
INSERT INTO #TimingTest(MethodName, TestIteration, StartDate)
SELECT @MethodName, @TestIteration, GETDATE()
-- Run the algorithm
INSERT INTO [Numbers].[Test](Number)
SELECT TOP (@NumberOfNumbers) row_number() over(order by t1.number) as N
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
ALTER TABLE [Numbers].[Test] ADD CONSTRAINT PK_Numbers_Test_Number PRIMARY KEY CLUSTERED (Number)
-- Record the end of the test
UPDATE tt
SET
EndDate = GETDATE()
FROM #TimingTest tt
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
-- And the stats about the numbers in the sequence
UPDATE tt
SET
ItemCount = results.ItemCount,
MaxNumber = results.MaxNumber,
MinNumber = results.MinNumber
FROM #TimingTest tt
CROSS JOIN (
SELECT COUNT(Number) as ItemCount, MAX(Number) as MaxNumber, MIN(Number) as MinNumber FROM [Numbers].[Test]
) results
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
--
-- METHOD - Single INSERT
--
-- Prep for the test
DROP TABLE IF EXISTS [Numbers].[Test];
-- The Table creation is part of this algorithm ...
-- Method information
SET @MethodName = 'SingleInsert';
-- Record the start of the test
INSERT INTO #TimingTest(MethodName, TestIteration, StartDate)
SELECT @MethodName, @TestIteration, GETDATE()
-- Run the algorithm
SELECT TOP (@NumberOfNumbers) IDENTITY(int,1,1) AS Number
INTO [Numbers].[Test]
FROM sys.objects s1 -- use sys.columns if you don't get enough rows returned to generate all the numbers you need
CROSS JOIN sys.objects s2 -- use sys.columns if you don't get enough rows returned to generate all the numbers you need
ALTER TABLE [Numbers].[Test] ADD CONSTRAINT PK_Numbers_Test_Number PRIMARY KEY CLUSTERED (Number)
-- Record the end of the test
UPDATE tt
SET
EndDate = GETDATE()
FROM #TimingTest tt
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
-- And the stats about the numbers in the sequence
UPDATE tt
SET
ItemCount = results.ItemCount,
MaxNumber = results.MaxNumber,
MinNumber = results.MinNumber
FROM #TimingTest tt
CROSS JOIN (
SELECT COUNT(Number) as ItemCount, MAX(Number) as MaxNumber, MIN(Number) as MinNumber FROM [Numbers].[Test]
) results
WHERE tt.MethodName = @MethodName
and tt.TestIteration = @TestIteration
END
-- Calculate the timespan for each of the runs
UPDATE tt
SET
ElapsedTime = DATEDIFF(MICROSECOND, StartDate, EndDate)
FROM #TimingTest tt
--
-- Report the results ...
--
SELECT
MethodName, AVG(ElapsedTime) / AVG(ItemCount) as TimePerRecord, CAST(AVG(ItemCount) as bigint) as SequenceLength,
MAX(ElapsedTime) as MaxTime, MIN(ElapsedTime) as MinTime,
MAX(MaxNumber) as MaxNumber, MIN(MinNumber) as MinNumber
FROM #TimingTest tt
GROUP by tt.MethodName
ORDER BY TimePerRecord ASC, MaxTime ASC, MinTime ASC