使用LIKE和EXISTS子句时,优化数据库模式/索引以获得更快的查询结果

时间:2010-01-03 07:45:26

标签: sql-server sql-server-2005 query-optimization

在SQL 2005服务器数据库上实现树结构时,当使用 LIKE子句 EXISTS结合使用时,查询响应时间过长(查询超过5秒)条款

慢查询涉及两个表 - [SitePath_T] [UserSiteRight_T]

CREATE TABLE [dbo].[UserSiteRight_T](
      [UserID_i] [int] NOT NULL
    , [SiteID_i] [int] NOT NULL
    , CONSTRAINT [PKC_UserSiteRight_UserIDSiteID] PRIMARY KEY CLUSTERED ( [UserID_i] ASC, [SiteID_i] ASC )
    , CONSTRAINT [FK_UserSiteRight_UserID] FOREIGN KEY( [UserID_i] ) REFERENCES [dbo].[User_T] ( [ID_i] )
    , CONSTRAINT [FK_UserSiteRight_SiteID] FOREIGN KEY( [SiteID_i] ) REFERENCES [dbo].[Site_T] ( [ID_i] )
) 

[UserSiteRight_T] 表中 UserID_i = 2484 的行数(权利)非常小:545
UserID_i = 2484 是随机选择的)

此外,数据库相对较小 - [SitePath_T] 表中只有23000行:

CREATE TABLE [dbo].[SitePath_T] (
    [SiteID_i] INT NOT NULL,
    [Path_v] VARCHAR(255) NOT NULL,
    CONSTRAINT [PK_SitePath_PathSiteID] PRIMARY KEY CLUSTERED ( [Path_v] ASC, [SiteID_i] ASC ),
    CONSTRAINT [AK_SitePath_Path] UNIQUE NONCLUSTERED ( [Path_v] ASC ),
    CONSTRAINT [FK_SitePath_SiteID] FOREIGN KEY( [SiteID_i] ) REFERENCES [Site_T] ( [ID_i] )


DB Schema

我只想获得 SiteIDs ,其子网站可以通过某个 UserID (由 [UserSiteRight_T] 表格指定)访问:

SELECT sp.SiteID_i
  FROM SitePath_t sp
 WHERE EXISTS ( SELECT *
              FROM [dbo].[SitePath_T] usp
                 , [dbo].[UserSiteRight_T] uusr
             WHERE uusr.SiteID_i = usp.SiteID_i
               AND uusr.UserID_i = 2484
               AND usp.Path_v LIKE sp.Path_v+'%' )

下面你可以找到结果的一部分,其中只需要/返回列sp.SiteID_i - 我也添加了相关的相应 Path_v UserSiteRight_T.SiteID_i WHERE UserID = 2484 以及与 LIKE相匹配的相应 SitePath_T SiteID_i Path_v 条件:

sp.SiteID_i  sp.Path_v      [UserSiteRight_T].SiteID_i      usp.SiteID_i        usp.Path_v
1           '1.'                        NULL                10054               '1.10054.'
10054       '1.10054.'                  10054               10054               '1.10054.'
10275       '1.10275.'                  10275               10275               '1.10275.'
1533        '1.1533.'                   NULL                2697                '1.1533.2689.2693.2697.'
2689        '1.1533.2689.'              NULL                2697                '1.1533.2689.2693.2697.'
2693        '1.1533.2689.2693.'         NULL                2697                '1.1533.2689.2693.2697.'
2697        '1.1533.2689.2693.2697.'    2697                2697                '1.1533.2689.2693.2697.'
1580        '1.1580.'                   NULL                1581                '1.1580.1581.'
1581        '1.1580.1581.'              1581                1581                '1.1580.1581.'
1585        '1.1580.1581.1585.'         1585                1585                '1.1580.1581.1585.'
222         '1.222.'                    222                 222                 '1.222.'
223         '1.222.223.'                223                 223                 '1.222.223.'
224         '1.222.223.224.'            224                 224                 '1.222.223.224.'
3103        '1.3103.'                   NULL                3537                '1.3103.3529.3533.3537.'
3529        '1.3103.3529.'              NULL                3537                '1.3103.3529.3533.3537.'
3533        '1.3103.3529.3533.'         NULL                3537                '1.3103.3529.3533.3537.'
3537        '1.3103.3529.3533.3537.'    3537                3537                '1.3103.3529.3533.3537.'

上述查询的执行计划:

  |--Nested Loops(Left Semi Join, WHERE:([MyTestDB].[dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1007]))
       |--Compute Scalar(DEFINE:([Expr1007]=[MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1008]=LikeRangeStart([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1009]=LikeRangeEnd([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1010]=LikeRangeInfo([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
       |    |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [sp]))
       |--Table Spool
            |--Hash Match(Inner Join, HASH:([uusr].[SiteID_i])=([usp].[SiteID_i]))
                 |--Clustered Index Seek(OBJECT:([MyTestDB].[dbo].[UserSiteRight_T].[PKC_UserSiteRight_UserIDSiteID] AS [uusr]), SEEK:([uusr].[UserID_i]=(2484)) ORDERED FORWARD)
                 |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [usp]))

重写的查询:

SELECT DISTINCT 
       sp.SiteID_i
  FROM [dbo].[SitePath_t] sp
     , [dbo].[SitePath_T] usp
     , [dbo].[UserSiteRight_T] uusr
 WHERE ( uusr.SiteID_i = usp.SiteID_i
   AND uusr.UserID_i = 2484
   AND usp.Path_v LIKE sp.Path_v+'%' )
 ORDER BY SiteID_i ASC

执行计划:

  |--Hash Match(Aggregate, HASH:([sp].[SiteID_i]))
       |--Nested Loops(Inner Join, WHERE:([MyTestDB].[dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1006]))
            |--Hash Match(Inner Join, HASH:([uusr].[SiteID_i])=([usp].[SiteID_i]))
            |    |--Clustered Index Seek(OBJECT:([MyTestDB].[dbo].[UserSiteRight_T].[PKC_UserSiteRight_UserIDSiteID] AS [uusr]), SEEK:([uusr].[UserID_i]=(2484)) ORDERED FORWARD)
            |    |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [usp]))
            |--Table Spool
                 |--Compute Scalar(DEFINE:([Expr1006]=[MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1007]=LikeRangeStart([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1008]=LikeRangeEnd([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1009]=LikeRangeInfo([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
                      |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [sp]))

所有索引都已到位 - 数据库引擎优化顾问并未建议新的架构修改 - 但两个查询都在5秒内返回正确的结果 - 并且,因为它是Ajax请求的响应 - 感觉(并且是)更新导航树时非常慢

是否有任何建议来优化/修改数据库架构/索引/查询以获得更快的响应?

谢谢

4 个答案:

答案 0 :(得分:3)

基于:

SELECT sp.SiteID_i 
  FROM SitePath_t sp 
 WHERE EXISTS ( SELECT * 
              FROM [dbo].[SitePath_T] usp 
                 , [dbo].[UserSiteRight_T] uusr 
             WHERE uusr.SiteID_i = usp.SiteID_i 
               AND uusr.UserID_i = 2484 
               AND usp.Path_v LIKE sp.Path_v+'%' ) 

(根据您正在进行半连接的事实,这很好。)

首先在uusr表上集中(正确),找到该用户的记录。它已经在那里做了一个CIX Seek,这很好。从那里,它根据SiteID_i字段在usp中找到相应的记录。

接下来考虑一下这个事实,即它想要通过SiteID_i找到网站,以及您希望这种类型的加入。

合并加入怎么样?这样会很好,但需要在两边对数据进行排序。如果索引的顺序正确,那就没问题了......

...之后,你想找到基于Path的东西。那怎么样:

CREATE INDEX ix_UUSR on [dbo].[UserSiteRight_T] (UserID_i, SiteID_i);
CREATE INDEX ix_usp on [dbo].[SitePath_T] (SiteID_i) INCLUDE (Path_v);

然后在SitePath_T上的另一个索引找到您想要的SiteID:

CREATE INDEX ix_sp on [dbo].[SitePath_T] (Path_v) INCLUDE (SiteID_i);

最后一个可能会使用嵌套循环,但希望不会太糟糕。影响你的系统的东西将是前两个索引,它们可以让你看到EXISTS子句中两个表之间的合并连接。

答案 1 :(得分:0)

我会尝试在UserSiteRight_T表中的外键上添加索引 - 它们尚未编入索引,并且这些字段的索引应该加快查找速度:

CREATE NONCLUSTERED INDEX IX01_UserSiteRight
  ON UserSiteRight_T(UserID_i)

CREATE NONCLUSTERED INDEX IX02_UserSiteRight
  ON UserSiteRight_T(SiteID_i)  

并在您的SitePath_T表上:

CREATE NONCLUSTERED INDEX IX01_SitePath
  ON dbo.SitePath_T(SiteID_i)

尝试将这些放置到位,然后再次运行查询,并比较运行时间和执行计划 - 您是否看到任何改进?

这是一种常见的误解,但SQL Server 不会自动在外键列上添加索引(如SiteID_i上的SitePath_T),尽管普遍的共识是外键很有用,可能会加快参照完整性的执行速度,以及对这些外键的JOIN加速。

答案 2 :(得分:0)

在SitePath_T上自我加入以找到父母正在杀死你。也许您应该为ParentSiteID_i添加一列并使用正常的递归CTE?

然后它变成:

WITH Recurse_CTE AS (
  SELECT 
    us.SiteID_i
  , us.ParentSiteID_i
  , 0 AS RecurseDepth_i
  FROM dbo.SitePath_T us
  JOIN dbo.UserSiteRight_T uusr ON us.SiteID_i = uusr.SiteID_i
  WHERE uusr.UserID_i = 2484
  UNION ALL
  SELECT 
    us.SiteID_i
  , us.ParentSiteID_i
  , rcs.RecurseDepth_i+1 AS RecurseDepth_i
  FROM dbo.SitePath_T us
  JOIN Recurse_CTE rcs ON us.SiteID_i = rcs.ParentSiteID_i
  )
SELECT * FROM Recurse_CTE

在SitePath_T(ParentSiteID_i)上抛出一个索引,性能应该很快。

答案 3 :(得分:0)

我还要赞扬 Rob Farley ,以了解架构/算法并提供索引架构。

但问题不在于索引架构。

这是查询优化器!

在此后续帖子中有关于此的更多信息:SQL Server 2005 T-SQL Problem : Can you trust the Query Optimizer ? I know I can't !