Find the longest sequence of a value in a table

Asked: 2009-11-01 22:17:24

Tags: sql sql-server tsql sql-server-2008 stored-procedures

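This is an SQL problem that I think is hard. I am not sure whether it can be done in a simple SQL statement or a stored procedure:

I want to find the longest run of the same (known) number in a column of a table:

Example: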

TABLE: 
DATE    SALEDITEMS
1/1/09       4
1/2/09       3
1/3/09       3
1/4/09       4
1/5/09       3

Calling the sp/statement for 4 would give 1; calling the sp/statement for 3 would give 2, because 3 occurs 2 times in a row.

I am running SQL Server 2008.

2 answers:

Answer 0 (score: 1)

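Update: I generated a million rows of random data and abandoned the recursive CTE solution, because its query plan did not make good use of the indexes.

However, the non-recursive solution I originally posted works well, as long as there is an additional nonclustered index on (SALEDITEMS, [DATE]). This makes sense, because the query needs to filter in two directions (by date and by SALEDITEMS). With this extra index, on my (not very powerful) desktop machine, the query over a million rows returns in under 2 seconds. Without the index, the query is slow.

BTW, this is a good example of how SQL Server's cost-based query optimization breaks down completely in some cases. The recursive CTE solution has a cost (on my PC) of 42 and takes at least several minutes to finish. The non-recursive solution has a cost of 15,446 (!!!) and finishes in 1.5 seconds. Moral of the story: when comparing SQL Server query plans, don't assume that cost necessarily correlates with query performance!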
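One way to compare two queries by what they actually do rather than by estimated cost is to turn on the session statistics output and read the elapsed time and logical reads from the Messages tab. This is only a minimal sketch: the SELECT in the middle is a placeholder (an assumption, not one of the queries from this answer) and should be replaced with the query under test. It assumes a Sales table exists.

```sql
-- Report actual CPU/elapsed time and page reads for each statement in this session.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Placeholder: substitute the query you want to measure.
SELECT COUNT(*) FROM Sales;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Running each candidate query this way gives numbers that can be compared directly, independent of the optimizer's cost estimate.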

Anyway, here is the solution I recommend (the same non-recursive CTE I posted earlier):

DECLARE @SALEDITEMS INT = 3;

WITH SalesNoMatch ([DATE], SALEDITEMS, NoMatchDate) 
AS 
(
    SELECT [DATE], SALEDITEMS, 
        (SELECT MIN([DATE]) FROM Sales s2 WHERE s2.SALEDITEMS <> @SALEDITEMS 
         AND s2.[DATE] > s1.[DATE]) as NoMatchDate
    FROM Sales s1
)
, SalesMatchCount ([DATE], ConsecutiveCount) AS
(
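    -- NoMatchDate is NULL when no later non-matching row exists (the run reaches the last row),
    -- so in that case count every later row instead of comparing against NULL.
    SELECT [DATE], 1+(SELECT COUNT(1) FROM Sales s2 WHERE s2.[DATE] > s1.[DATE]
        AND (s1.NoMatchDate IS NULL OR s2.[DATE] < s1.NoMatchDate))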
    FROM SalesNoMatch s1
    WHERE s1.SALEDITEMS = @SALEDITEMS 
)
SELECT MAX(ConsecutiveCount) 
FROM SalesMatchCount;
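As a quick sanity check, the same query shape can be run against the five sample rows from the question. This is a sketch under assumptions: the temp table name #SalesSample is made up for illustration, the dates are written in the unambiguous YYYYMMDD form, and the inner count treats a NULL NoMatchDate (a run that reaches the last row) as "count to the end".

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #SalesSample ([DATE] date NOT NULL PRIMARY KEY, SALEDITEMS int NOT NULL);

INSERT INTO #SalesSample ([DATE], SALEDITEMS)
VALUES ('20090101', 4), ('20090102', 3), ('20090103', 3), ('20090104', 4), ('20090105', 3);

DECLARE @SALEDITEMS int = 3;

WITH SalesNoMatch ([DATE], SALEDITEMS, NoMatchDate) AS
(
    -- For every row, find the date of the next row whose value differs from the target.
    SELECT [DATE], SALEDITEMS,
        (SELECT MIN([DATE]) FROM #SalesSample s2
         WHERE s2.SALEDITEMS <> @SALEDITEMS AND s2.[DATE] > s1.[DATE])
    FROM #SalesSample s1
)
, SalesMatchCount ([DATE], ConsecutiveCount) AS
(
    -- For every matching row, count itself plus the rows before the next non-match.
    SELECT [DATE], 1 + (SELECT COUNT(1) FROM #SalesSample s2
                        WHERE s2.[DATE] > s1.[DATE]
                          AND (s1.NoMatchDate IS NULL OR s2.[DATE] < s1.NoMatchDate))
    FROM SalesNoMatch s1
    WHERE s1.SALEDITEMS = @SALEDITEMS
)
SELECT MAX(ConsecutiveCount) AS LongestRun
FROM SalesMatchCount;   -- 2 for the value 3 on these rows

DROP TABLE #SalesSample;
```

Setting @SALEDITEMS to 4 gives 1 on the same rows, which matches what the question expects.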

Here is the DDL I used to test it, including the index you will need:

CREATE TABLE [Sales](
    [DATE] date NOT NULL,
    [SALEDITEMS] int NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX IX_Sales ON Sales ([DATE]);
CREATE UNIQUE NONCLUSTERED INDEX IX_Sales2 ON Sales (SALEDITEMS, [DATE]);

Here is how I created the test data: 1,000,001 rows with ascending dates and SALEDITEMS set randomly between 1 and 10.

INSERT INTO Sales ([DATE], SALEDITEMS)
VALUES ('1/1/09', 5)

DECLARE @i int = 0;

WHILE (@i < 1000000)
BEGIN
    INSERT INTO Sales ([DATE], SALEDITEMS)
    SELECT DATEADD (d, 1, (SELECT MAX ([DATE]) FROM Sales)), ABS(CHECKSUM(NEWID())) % 10 + 1

    SET @i = @i + 1;
END
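The loop above issues one INSERT per row. If that is too slow, a set-based variant along the following lines might produce equivalent data in a single statement. It is a sketch, not what I ran for the timings: it assumes the Sales table is empty, and it uses a cross join of sys.all_objects purely as a convenient row source, on the assumption that it yields at least 1,000,001 rows.

```sql
-- Assumes an empty Sales table created with the DDL above.
WITH Numbers (n) AS
(
    -- ROW_NUMBER() returns bigint; DATEADD needs an int, hence the cast.
    SELECT TOP (1000001) CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int)
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
)
INSERT INTO Sales ([DATE], SALEDITEMS)
SELECT DATEADD(d, n - 1, CAST('20090101' AS date)),   -- ascending, unique dates
       ABS(CHECKSUM(NEWID())) % 10 + 1                -- random value between 1 and 10
FROM Numbers;
```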

Here is the recursive CTE solution that I abandoned:

DECLARE @SALEDITEMS INT = 3;

-- recursive CTE solution (remember to set MAXRECURSION!)
WITH SalesRowNum ([DATE], SALEDITEMS, RowNum) 
AS 
(
    SELECT [DATE], SALEDITEMS, ROW_NUMBER() OVER (ORDER BY s1.[DATE]) as RowNum
    FROM Sales s1
)
, SalesCTE (RowNum, [DATE], ConsecutiveCount) 
AS 
( 
    SELECT s1.RowNum, s1.[DATE], 1 AS ConsecutiveCount
    FROM SalesRowNum s1 
    WHERE SALEDITEMS = @SALEDITEMS

    UNION ALL 

    SELECT s1.RowNum, s1.[DATE], ConsecutiveCount + 1 AS ConsecutiveCount
    FROM SalesRowNum s1 
    INNER JOIN SalesCTE s2 ON s1.RowNum = s2.RowNum + 1
    WHERE SALEDITEMS = @SALEDITEMS
) 
SELECT MAX(ConsecutiveCount) 
FROM SalesCTE
OPTION (MAXRECURSION 0); -- 0 removes the default limit of 100 recursion levels
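For comparison, here is a sketch of a different technique that neither solution above uses, the "gaps and islands" trick with two ROW_NUMBER() calls. Within a run of consecutive rows that share the same value, the difference between the overall row number and the per-value row number stays constant, so grouping by that difference gives one group per run. It assumes the Sales table from the DDL above with unique dates; I have not benchmarked it against the million-row data set.

```sql
DECLARE @SALEDITEMS int = 3;

WITH Numbered AS
(
    SELECT [DATE], SALEDITEMS,
           ROW_NUMBER() OVER (ORDER BY [DATE]) AS rn_all,                         -- position among all rows
           ROW_NUMBER() OVER (PARTITION BY SALEDITEMS ORDER BY [DATE]) AS rn_val  -- position among rows with the same value
    FROM Sales
)
SELECT MAX(RunLength) AS LongestRun
FROM
(
    -- rn_all - rn_val is constant inside one consecutive run of the target value.
    SELECT COUNT(*) AS RunLength
    FROM Numbered
    WHERE SALEDITEMS = @SALEDITEMS
    GROUP BY rn_all - rn_val
) AS runs;
```

Because it needs no recursion and no correlated subquery, it also does not depend on MAXRECURSION.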

Answer 1 (score: 0)

Untested, since you did not provide DDL and sample data:

DECLARE @SALEDITEMS INT;
SET @SALEDITEMS = 3;
SELECT MAX(cnt) FROM (
    -- one group per (d1, d2) pair; cnt is the number of rows in that range
    SELECT COUNT(*) AS cnt
    FROM YourTable AS y0
    JOIN (
        -- pairs of matching rows with no non-matching row between them
        SELECT y1.[Date] AS d1, y2.[Date] AS d2
        FROM YourTable AS y1 JOIN YourTable AS y2
          ON y1.SALEDITEMS = @SALEDITEMS AND y2.SALEDITEMS = @SALEDITEMS
         AND y1.[Date] <= y2.[Date]
         AND NOT EXISTS (SELECT 1 FROM YourTable AS y
                         WHERE y.SALEDITEMS <> @SALEDITEMS
                           AND y1.[Date] < y.[Date] AND y.[Date] < y2.[Date])
    ) AS t ON y0.[Date] BETWEEN t.d1 AND t.d2
    GROUP BY t.d1, t.d2
) AS runs;