Optimizing a temp table populated by a CTE

Asked: 2019-03-04 18:49:24

Tags: sql sql-server tsql common-table-expression sql-server-2017

I create a temp table to assign a level to each design:

CREATE TABLE [#DesignLvl]
(
    [DesignKey] INT,
    [DesignLevel] INT
);

WITH RCTE AS 
(
    SELECT
        *,
        1 AS [Lvl]
    FROM 
        [Design]
    WHERE 
        [ParentDesignKey] IS NULL

    UNION ALL

    SELECT
        [D].*,
        [Lvl] + 1 AS [Lvl]
    FROM 
        [dbo].[Design] AS [D]
    INNER JOIN 
        [RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO [#DesignLvl]
    SELECT
        [DesignKey], [Lvl]
    FROM 
        [RCTE]

Once it is created, I use it in a LEFT JOIN inside a large query:

SELECT... 
FROM.. 
LEFT JOIN [#DesignLvl] AS [dl] ON d.DesignKey = dl.DesignKey
WHERE ...

The query works, but performance has dropped and the query is now too slow. Is there any way to optimize this table?

Execution plan of the CTE (image not available).

I tried adding a CLUSTERED index:

CREATE TABLE [#DesignLvl]
(
    [DesignKey] INT,
    [DesignLevel] INT
);

CREATE CLUSTERED INDEX ix_DesignLvl 
    ON [#DesignLvl] ([DesignKey], [DesignLevel]);

I also tried:

CREATE TABLE [#DesignLvl]
(
    [DesignKey] INT INDEX IX1 CLUSTERED,
    [DesignLevel] INT INDEX IX2 NONCLUSTERED
);

But I get the same result: execution still takes a very long time.

8 answers:

Answer 0 (score: 4)

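Performance is probably suffering because the clustered index on dbo.Design is accessed inside a nested loop. According to the cost estimate, the database spends 66% of its time scanning that index, and doing the scan inside a loop only makes things worse.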

See the related question.

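Consider changing the index on dbo.Design to a nonclustered index, or try creating another temp table with a nonclustered index and using that in the recursive query: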

CREATE TABLE [#DesignTemp]
(
    ParentDesignKey INT,
    DesignKey INT
);

-- Insert the data, then create the index.
INSERT INTO [#DesignTemp] (ParentDesignKey, DesignKey)
SELECT
    ParentDesignKey,
    DesignKey
FROM [dbo].[Design];

-- Try this index, or create indexes for individual columns if the plan works better at high volumes.
CREATE NONCLUSTERED INDEX ix_DesignTemp1 ON [#DesignTemp] (ParentDesignKey, DesignKey);

CREATE TABLE [#DesignLvl]
(
    [DesignKey] INT,
    [DesignLevel] INT
);

WITH RCTE AS 
(
    SELECT
        *,
        1 AS [Lvl]
    FROM 
        [#DesignTemp]
    WHERE 
        [ParentDesignKey] IS NULL

    UNION ALL

    SELECT
        [D].*,
        [Lvl] + 1 AS [Lvl]
    FROM 
        [#DesignTemp] AS [D]
    INNER JOIN 
        [RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO [#DesignLvl]
    SELECT
        [DesignKey], [Lvl]
    FROM 
        [RCTE];

Answer 1 (score: 3)

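Your question is incomplete. The query is slow, but which part of it is slow: the CTE query, or the LEFT JOIN in the really big query?

I think we need the script of the big query along with the details, such as how many rows each table contains, the data types involved, and so on.

Please provide more information about the big query.

Also let us know whether any UDF is involved in the join conditions.

Why do you LEFT JOIN the temp table? Why not INNER JOIN?

Test the performance of the CTE and of the big query separately.

Try adding [D].[ParentDesignKey] IS NOT NULL to the recursive part: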
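One way to time the two pieces separately is to switch on SQL Server's session statistics and run each piece on its own. This is only a minimal sketch: the two numbered comments are placeholders for the statements from the question, not working code.

```sql
-- Report CPU/elapsed time and logical reads per statement (shown in the Messages tab).
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- 1) Run only the recursive CTE + INSERT INTO [#DesignLvl] here and note the figures.

-- 2) Run only the big query (with the LEFT JOIN to [#DesignLvl]) here and compare.

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the logical reads reported for dbo.Design in step 1 with the totals for step 2 should show which part dominates.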

SELECT
        [D].*,
        [Lvl] + 1 AS [Lvl]
    FROM 
        [dbo].[Design] AS [D]
    INNER JOIN 
        [RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
and [D].[ParentDesignKey] is not null

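Note: in the CTE, select only the columns you actually need.

If possible, pre-calculate [Lvl], because a recursive CTE performs particularly poorly when many rows are involved.

How many rows does each CTE query process on average?

If the temp table will hold more than 100 rows, create a clustered index on it: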

  CREATE CLUSTERED INDEX ix_DesignLvl 
        ON [#DesignLvl] ([DesignKey], [DesignLevel]);

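If [DesignLevel] is not used in the join condition, remove it from the index.

Also, please show the indexes on table [dbo].[Design], along with a few rows of sample data for DesignKey and ParentDesignKey.

There are many possible reasons for getting an Index Scan; one of them is the selectivity of the key.

So how many rows can one DesignKey have, and how many rows can one ParentDesignKey have?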
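The information asked for above could be gathered with something like the following. It is a sketch that assumes only the table and column names from the question; the column aliases are made up for readability.

```sql
-- List the indexes currently defined on the table.
EXEC sp_helpindex N'dbo.Design';

-- Rough selectivity check: total rows versus distinct key values.
SELECT
    COUNT(*)                          AS TotalRows,
    COUNT(DISTINCT [DesignKey])       AS DistinctDesignKeys,
    COUNT(DISTINCT [ParentDesignKey]) AS DistinctParentKeys
FROM [dbo].[Design];

-- Parents with the most children (the widest fan-out in the hierarchy).
SELECT TOP (10)
    [ParentDesignKey],
    COUNT(*) AS ChildCount
FROM [dbo].[Design]
WHERE [ParentDesignKey] IS NOT NULL
GROUP BY [ParentDesignKey]
ORDER BY ChildCount DESC;
```

If TotalRows equals DistinctDesignKeys, DesignKey is unique and a seek on it touches at most one row; a small DistinctParentKeys count relative to TotalRows means each parent lookup returns many rows.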

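Depending on the answers to the questions above, create a composite clustered index on the two keys of table [dbo].[Design].

So consider my answer incomplete; I will update it accordingly.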

Answer 2 (score: 2)

According to the tests I published in this article, a set-based loop can outperform a recursive CTE.

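DECLARE @DesignLevel INT = 1;

-- Seed: root designs (no parent) are level 1.
INSERT INTO [#DesignLvl] ([DesignKey], [DesignLevel])
SELECT [DesignKey], @DesignLevel
FROM [dbo].[Design]
WHERE [ParentDesignKey] IS NULL;

-- Each pass adds the children of the rows inserted by the previous pass
-- and stops when a pass inserts no rows.
WHILE @@ROWCOUNT > 0
BEGIN
    SET @DesignLevel += 1;

    INSERT INTO [#DesignLvl] ([DesignKey], [DesignLevel])
    SELECT [D].[DesignKey], @DesignLevel
    FROM [dbo].[Design] AS [D]
    JOIN [#DesignLvl] AS [dl] ON [dl].[DesignKey] = [D].[ParentDesignKey]
    WHERE [dl].[DesignLevel] = @DesignLevel - 1;
END;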

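Answer 3 (score: 2)

Try a table variable (@table) instead of a #temp table for the query. Note that table variables are also backed by tempdb rather than held purely in memory, so a gain is not guaranteed: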

declare @DesignLvl table
(
    [DesignKey] INT,
    [DesignLevel] INT
);

WITH RCTE AS 
(
    SELECT
        *,
        1 AS [Lvl]
    FROM 
        [Design]
    WHERE 
        [ParentDesignKey] IS NULL

    UNION ALL

    SELECT
        [D].*,
        [Lvl] + 1 AS [Lvl]
    FROM 
        [dbo].[Design] AS [D]
    INNER JOIN 
        [RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO @DesignLvl
    SELECT
        [DesignKey], [Lvl]
    FROM 
        [RCTE]

That might help. How many rows are we talking about, and which SQL Server version is it (@@VERSION)?
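Both pieces of information can be read with two short queries. This sketch assumes only the dbo.Design table from the question; the column aliases are illustrative.

```sql
-- Full version and edition string of the instance.
SELECT @@VERSION AS SqlServerVersion;

-- Number of rows the recursive CTE has to walk.
SELECT COUNT(*) AS DesignRows
FROM [dbo].[Design];
```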

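Answer 4 (score: 2)

Have you tried memory-optimized tables? I have used them in a similar process (a recursive CTE) and got amazing results. In SQL Server 2017 they should also be available in Standard Edition. First you need to create a filegroup for memory-optimized data: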

ALTER DATABASE MyDB 
ADD FILEGROUP mem_data CONTAINS MEMORY_OPTIMIZED_DATA; 
GO 
ALTER DATABASE MyDB 
ADD FILE (NAME = 'MemData', FILENAME = 'D:\Data\MyDB_MemData.ndf') TO FILEGROUP mem_data; 

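Then you create (or convert) the table:

CREATE TABLE dbo.MemoryTable
(
    -- Memory-optimized tables do not support clustered indexes, so the key must be NONCLUSTERED.
    Col1 INT IDENTITY PRIMARY KEY NONCLUSTERED
    ...
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);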
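Applied to the level table from the question, this might look like the sketch below. The table name dbo.DesignLvlMem is hypothetical, the sketch assumes the memory-optimized filegroup already exists and that DesignKey is unique, and SCHEMA_ONLY durability is an assumption based on the data being rebuilt from dbo.Design each time.

```sql
-- Hypothetical memory-optimized replacement for [#DesignLvl].
-- SCHEMA_ONLY: rows are not persisted across restarts, much like temp-table data.
CREATE TABLE dbo.DesignLvlMem
(
    DesignKey   INT NOT NULL PRIMARY KEY NONCLUSTERED,
    DesignLevel INT NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);
GO

-- Populate it with the recursive CTE, selecting only the columns that are needed.
WITH RCTE AS
(
    SELECT [DesignKey], 1 AS [Lvl]
    FROM [dbo].[Design]
    WHERE [ParentDesignKey] IS NULL

    UNION ALL

    SELECT [D].[DesignKey], [rc].[Lvl] + 1
    FROM [dbo].[Design] AS [D]
    INNER JOIN [RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO dbo.DesignLvlMem (DesignKey, DesignLevel)
SELECT [DesignKey], [Lvl]
FROM [RCTE];
```

Unlike a #temp table, this table is shared by all sessions, so concurrent runs would need a session key column or a DELETE before each load.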

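Answer 5 (score: 2)

Have you tried changing SELECT * to select only the column you need (DesignKey)? I found that for wide rows this was enough to change the execution plan, with the optimizer choosing an eager spool over the index scan: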

WITH RCTE AS 
(
    SELECT
        [DesignKey],
        1 AS [Lvl]
    FROM 
        [Design]
    WHERE 
        [ParentDesignKey] IS NULL

    UNION ALL

    SELECT
        [D].[DesignKey],
        [Lvl] + 1 AS [Lvl]
    FROM 
        [dbo].[Design] AS [D]
    INNER JOIN 
        [RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO [#DesignLvl]
    SELECT
        [DesignKey], [Lvl]
    FROM 
        [RCTE]

The plan and the test SQL can be found here: https://www.brentozar.com/pastetheplan/?id=BymxTD4wV

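Answer 6 (score: 1)

The problem may be that the Design table is large, and joining it to itself without any significant filter condition causes a scan of the whole table.

Since you are only interested in a few columns (such as designkey and parentdesignkey), try splitting the query that populates the table (the insert into #designlvl) into several parts.

Make sure you have an index on (designkey, parentdesignkey).

INSERT INTO #DesignLevel SELECT DISTINCT DesignKey, 1 FROM Design WHERE ParentDesignKey IS NULL

INSERT INTO #DesignLevel SELECT DISTINCT ParentDesignKey, Lvl + 1 FROM Design WHERE ParentDesignKey IS NOT NULL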

Answer 7 (score: 0)

Make sure there are no NULLs in the Design.ParentDesignKey and #DesignLvl.DesignKey columns, and use a NOT NULL constraint where you can. I have seen NULLs produce cross joins.

If the Design table is a transactional table that is written to very frequently, rebuild the indexes on it regularly.

Create one nonclustered index on Design.DesignKey and Design.ParentDesignKey, in that order.

Create a nonclustered index on #DesignLvl.DesignKey.

If the Design table is large (more than 10 million rows) and has a whole bunch of columns, create an indexed view containing only the columns you need for this query and use that.

Check the System event log for disk read/write failures on the disk that holds tempdb. ("You should put the tempdb on either a RAID 1 or RAID 10 array as they're optimized for high-write applications.", from https://searchsqlserver.techtarget.com/tip/SQL-Server-tempdb-best-practices-increase-performance )
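The index and indexed-view suggestions above could be sketched roughly as follows. All index and view names are hypothetical, [#DesignLvl] is assumed to exist already in the session, and the unique clustered index on the view assumes DesignKey is unique in dbo.Design.

```sql
-- Nonclustered index on the hierarchy columns, in the order suggested above.
CREATE NONCLUSTERED INDEX IX_Design_DesignKey_ParentDesignKey
    ON [dbo].[Design] ([DesignKey], [ParentDesignKey]);

-- Nonclustered index on the temp table's join column.
CREATE NONCLUSTERED INDEX IX_DesignLvl_DesignKey
    ON [#DesignLvl] ([DesignKey]);
GO

-- Narrow indexed view exposing only the two columns the recursion needs.
-- SCHEMABINDING and two-part object names are required for indexed views.
CREATE VIEW dbo.vDesignHierarchy
WITH SCHEMABINDING
AS
SELECT [DesignKey], [ParentDesignKey]
FROM [dbo].[Design];
GO

-- The first index on a view must be unique and clustered (assumes DesignKey is unique).
CREATE UNIQUE CLUSTERED INDEX IX_vDesignHierarchy_DesignKey
    ON dbo.vDesignHierarchy ([DesignKey]);
```

Depending on the edition, the optimizer may only use the view's index when the view is referenced with the NOEXPAND hint, for example FROM dbo.vDesignHierarchy WITH (NOEXPAND).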