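Slow performance with recursive SQL (CTE)

Date: 2017-04-17 15:27:53

Tags: sql sql-server performance tsql common-table-expression

Apologies if this is a long question, but there is no simple way to phrase it.

I have the following query: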

SELECT 
    S.*
FROM 
    Stock S
LEFT JOIN 
    Stock_Category SC ON SC.StockId = S.Id
WHERE 
    S.Published = 1 
    AND (@CategoryId IS NULL OR 
         (SELECT COUNT(*) 
          FROM GetParentCategoriesByCategoryId(SC.CategoryId) 
          WHERE Id = @CategoryId) > 0) 

Inside GetParentCategoriesByCategoryId(), I have the following common table expression (CTE):

DECLARE @TableOutput TABLE(Id UNIQUEIDENTIFIER, 
                            PosDissectionId INT,
                            PosFamilyClassId INT,
                            ParentId UNIQUEIDENTIFIER,
                            Code NVARCHAR(25),
                            [Name] NVARCHAR(100),
                            Description NVARCHAR(1000),
                            AzureId UNIQUEIDENTIFIER,
                            Extension NVARCHAR(10),
                            Visible BIT,
                            OrderIndex INT,
                            StockCount INT,
                            Depth INT)
BEGIN
    DECLARE @TotalVisible INT,
            @TotalRows INT

    ;WITH CategoryStructure (Id, ParentId, ParentName, Name, Depth, Visible)
    As 
    ( 
        SELECT 
            C.Id, 
            C.ParentId, 
            CAST('' AS NVARCHAR(500)) AS ParentName, 
            C.Name, 
            0 AS Depth, 
            C.Visible
        FROM 
            Category C
        WHERE 
            Id = @LocalCategoryId

        UNION ALL

        SELECT 
            ParentCategory.Id, 
            ParentCategory.ParentId, 
            CategoryStructure.Name AS ParentName, 
            ParentCategory.Name, 
            CategoryStructure.Depth + 1,
            ParentCategory.Visible
        FROM 
            Category ParentCategory
        INNER JOIN 
            CategoryStructure ON ParentCategory.Id = CategoryStructure.ParentId
    )
    INSERT INTO @TableOutput
        SELECT          
            C.*,
            SC.StockCount,
            CS.Depth
        FROM 
            CategoryStructure CS 
        INNER JOIN 
            Category C ON  C.Id = CS.Id
        LEFT JOIN 
            (SELECT CategoryId, COUNT(*) AS StockCount 
             FROM Stock_Category SC
             INNER JOIN Stock S ON S.Id = SC.StockId
             WHERE S.Published = 1 AND 
                 ((S.WidthMM IS NOT NULL AND 
                   S.HeightMM IS NOT NULL AND 
                   S.DepthMM IS NOT NULL AND
                    S.WeightG IS NOT NULL)) AND
                CategoryId IN(SELECT CategoryId FROM CategoryStructure)
        GROUP BY CategoryId

    ) SC ON SC.CategoryId = CS.Id

    WHERE (@IncludeSelf = 1 OR CS.Id != @CategoryId) 

    SELECT 
        @TotalVisible = SUM(CONVERT(INT, Visible)),
        @TotalRows = COUNT(*) 
    FROM @TableOutput

    IF @TotalVisible <> @TotalRows
        DELETE FROM @TableOutput    

    RETURN
END

The execution plan for my query is shown below.

[Image: query execution plan]

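Unfortunately, the query takes over 7 seconds to return 2,000 rows. I believe I have added the right indexes (the plan seems to indicate that the query is using them).

I have been able to narrow the problem down to the LEFT JOIN in the CTE: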

   SELECT CategoryId, COUNT(*) AS StockCount 
   FROM Stock_Category SC
   INNER JOIN Stock S ON S.Id = SC.StockId
   WHERE S.Published = 1 AND blah blah blah....

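because performance improves dramatically when I remove it, but that is as far as I have been able to deduce.

I'm not expecting a solution, as I understand it depends on many factors, but I'm far from a SQL expert and I'm hoping someone can offer some guidance on what I should be looking for.

The table schemas can be found here: https://www.dropbox.com/s/tpetq6fky58fhti/schemas.sql?dl=0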

3 Answers:

Answer 0: (score: 0)

I made two changes:

1) Made the function inline (using a HAVING clause).

2) Replaced the LEFT JOIN with OUTER APPLY.

WITH CategoryStructure (Id, ParentId, ParentName, Name, Depth, Visible)
As 
( 
    SELECT 
        C.Id, 
        C.ParentId, 
        CAST('' AS NVARCHAR(500)) AS ParentName, 
        C.Name, 
        0 AS Depth, 
        C.Visible
    FROM 
        Category C
    WHERE 
        Id = @LocalCategoryId

    UNION ALL

    SELECT 
        ParentCategory.Id, 
        ParentCategory.ParentId, 
        CategoryStructure.Name AS ParentName, 
        ParentCategory.Name, 
        CategoryStructure.Depth + 1,
        ParentCategory.Visible
    FROM 
        Category ParentCategory
    INNER JOIN 
        CategoryStructure ON ParentCategory.Id = CategoryStructure.ParentId
)
INSERT INTO @TableOutput
    SELECT          
        C.*,
        SC.StockCount,
        CS.Depth
    FROM 
        CategoryStructure CS 
    INNER JOIN 
        Category C ON  C.Id = CS.Id
    OUTER APPLY
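        (SELECT COUNT(*) AS StockCount  -- scalar aggregate: the subquery is correlated on CS.Id, so there is no GROUP BY and CategoryId cannot be selected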
         FROM Stock_Category SC
         INNER JOIN Stock S ON S.Id = SC.StockId
         WHERE S.Published = 1 AND 
             ((S.WidthMM IS NOT NULL AND 
               S.HeightMM IS NOT NULL AND 
               S.DepthMM IS NOT NULL AND
                S.WeightG IS NOT NULL)) AND
            CategoryId = CS.Id
        ) SC

WHERE (@IncludeSelf = 1 OR CS.Id != @CategoryId) 
HAVING SUM(CONVERT(INT, Visible)) = COUNT(*)
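
As written, the HAVING clause in the query above has no GROUP BY, and SQL Server will not accept it alongside the non-aggregated C.* select list. One way the same "return nothing unless every row is visible" rule might be expressed in an inline table-valued function is a NOT EXISTS filter. This is a sketch, not the answerer's code: the function name and parameter list are assumptions, and the table and column names are taken from the question.

```sql
-- Hypothetical inline version; the name and parameters are assumptions.
CREATE FUNCTION dbo.GetParentCategoriesByCategoryId_Inline
(
    @CategoryId  UNIQUEIDENTIFIER,
    @IncludeSelf BIT
)
RETURNS TABLE
AS
RETURN
    WITH CategoryStructure (Id, ParentId, Depth, Visible) AS
    (
        -- Anchor: the starting category
        SELECT C.Id, C.ParentId, 0 AS Depth, C.Visible
        FROM Category C
        WHERE C.Id = @CategoryId

        UNION ALL

        -- Recursive step: walk up to each parent
        SELECT P.Id, P.ParentId, CS.Depth + 1, P.Visible
        FROM Category P
        INNER JOIN CategoryStructure CS ON P.Id = CS.ParentId
    )
    SELECT C.*, SC.StockCount, CS.Depth
    FROM CategoryStructure CS
    INNER JOIN Category C ON C.Id = CS.Id
    OUTER APPLY
    (
        -- Correlated scalar aggregate: yields 0 (not NULL) when nothing matches
        SELECT COUNT(*) AS StockCount
        FROM Stock_Category X
        INNER JOIN Stock S ON S.Id = X.StockId
        WHERE S.Published = 1
          AND S.WidthMM  IS NOT NULL
          AND S.HeightMM IS NOT NULL
          AND S.DepthMM  IS NOT NULL
          AND S.WeightG  IS NOT NULL
          AND X.CategoryId = CS.Id
    ) SC
    WHERE (@IncludeSelf = 1 OR CS.Id <> @CategoryId)
      -- Replaces the DELETE in the original function: return no rows at all
      -- if any category that would be returned is not visible
      AND NOT EXISTS
      (
          SELECT 1
          FROM CategoryStructure V
          WHERE (@IncludeSelf = 1 OR V.Id <> @CategoryId)
            AND (V.Visible = 0 OR V.Visible IS NULL)
      );
```

Note that the recursive CTE is referenced twice here, so it may be evaluated twice; whether that is cheaper than the table variable would need to be checked against the actual execution plan.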

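P.S. The first query looks odd: you have the @CategoryId parameter but don't pass it to the function. You build every possible tree and then filter. I think your algorithm is wrong; would it be possible to call GetParentCategoriesByCategoryId(@CategoryId) instead?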
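
One way to read the P.S. above is to walk the tree once, downward from @CategoryId, instead of calling the function for every Stock_Category row. The following is only a sketch under assumptions: it uses the Category, Stock and Stock_Category tables as they appear in the question, the CTE name Descendants is made up for illustration, and it does not reproduce the function's Visible check.

```sql
-- Placeholder value for illustration; in practice this is the query parameter.
DECLARE @CategoryId UNIQUEIDENTIFIER = NULL;

-- Descendants is a hypothetical name: @CategoryId plus every category below it.
WITH Descendants (Id) AS
(
    SELECT C.Id
    FROM Category C
    WHERE C.Id = @CategoryId

    UNION ALL

    SELECT Child.Id
    FROM Category Child
    INNER JOIN Descendants D ON Child.ParentId = D.Id
)
SELECT S.*
FROM Stock S
WHERE S.Published = 1
  AND (@CategoryId IS NULL
       OR EXISTS (SELECT 1
                  FROM Stock_Category SC
                  INNER JOIN Descendants D ON D.Id = SC.CategoryId
                  WHERE SC.StockId = S.Id));
```

Because this uses EXISTS rather than a LEFT JOIN, each Stock row is returned at most once, whereas the original query can return a row once per matching category.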

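Answer 1: (score: 0)

So, for anyone curious, the final solution involved redoing my indexes, using some of the suggestions from the comments above and, importantly, removing the temporary table.

In the end I got the query down to under 1 second, which was the goal. I'm not so sure about the GROUP BY, though, and wonder whether there is a better way to do it. Does anyone have any further improvements?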


Answer 2: (score: -1)

Two things:

  1. Change the IN clause to an inner join, from:

    SELECT CategoryId, COUNT(*) AS StockCount 
    FROM Stock_Category SC
    INNER JOIN Stock S ON S.Id = SC.StockId
    WHERE S.Published = 1 AND 
    ((S.WidthMM IS NOT NULL AND 
    S.HeightMM IS NOT NULL AND 
    S.DepthMM IS NOT NULL AND
    S.WeightG IS NOT NULL)) AND
    CategoryId IN(SELECT CategoryId FROM CategoryStructure)
    GROUP BY CategoryId
    
     to:

    SELECT CategoryId, COUNT(*) AS StockCount 
    FROM Stock_Category SC
    INNER JOIN Stock S ON S.Id = SC.StockId
    INNER JOIN CategoryStructure AS CS
        ON CS.Id = SC.CategoryId  -- the CTE exposes Id, not CategoryId
    WHERE S.Published = 1 AND 
    ((S.WidthMM IS NOT NULL AND 
    S.HeightMM IS NOT NULL AND 
    S.DepthMM IS NOT NULL AND
    S.WeightG IS NOT NULL))
    GROUP BY CategoryId
    
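  2. Your query spends most of its time in an index seek on IX_StockAllColumns. If that really is a nonclustered index on all columns, create a new nonclustered index on the Published, WidthMM, HeightMM, DepthMM and WeightG columns.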
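
The index suggested above might look like the following. The index name is hypothetical, and the column types are not shown in the question, so this is a sketch to compare against IX_StockAllColumns in the execution plan, not a definitive recommendation.

```sql
-- Hypothetical index name. Key columns follow the order given in the answer;
-- Published comes first because the query filters on Published = 1.
CREATE NONCLUSTERED INDEX IX_Stock_Published_Dimensions
    ON Stock (Published, WidthMM, HeightMM, DepthMM, WeightG);
```

A narrower index like this gives the optimizer fewer pages to read than an index carrying every column, which is the reasoning behind the suggestion.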