加入时CTE非常慢

时间:2014-04-02 09:33:24

标签: sql sql-server-2008-r2 common-table-expression

我之前发过类似的东西,但我现在从另一个方向接近这个,所以我开了一个新问题。我希望这没关系。

我一直在与CTE合作,根据父母费用创建一笔费用。 SQL和详细信息可以在这里看到:

CTE Index recommendations on multiple keyed table

我不认为我在CTE上遗漏任何东西,但是当我使用一个大数据表(350万行)时,我遇到了问题。

tblChargeShare包含我需要的其他一些信息,例如InvoiceID,因此我将CTE放在视图vwChargeShareSubCharges中并将其加入到表中。

查询:

Select t.* from vwChargeShareSubCharges t
inner join 
tblChargeShare  s 
on t.CustomerID = s.CustomerID 
and t.MasterChargeID = s.ChargeID 
Where  s.ChargeID = 1291094

以几毫秒的速度返回结果。

查询:

Select ChargeID from tblChargeShare Where InvoiceID = 1045854

返回1行:

1291094

但查询:

Select t.* from vwChargeShareSubCharges t
inner join 
tblChargeShare  s 
on t.CustomerID = s.CustomerID 
and t.MasterChargeID = s.ChargeID 
Where  InvoiceID = 1045854

需要2-3分钟才能运行。

我保存了执行计划并将它们加载到SQL Sentry中。快速查询的树看起来像这样:

Fast Query

慢查询的计划是:

Slow Query

我尝试过重建索引,通过调优顾问程序和各种子查询组合运行查询。每当连接包含除PK之外的任何内容时,查询都很慢。

我在这里有一个类似的问题:

SQL Server Query time out depending on Where Clause

其中使用函数来执行子行的汇总而不是CTE。这是使用CTE重写以避免我现在遇到的同样问题。我已经阅读了答案中的答案,但我并不是更聪明 - 我阅读了一些有关提示和参数的信息,但我无法使其发挥作用。我以为使用CTE重写可以解决我的问题。在具有几千行的tblCharge上运行时查询很快。

在SQL 2008 R2和SQL 2012中测试

编辑:

我已将查询压缩为单个语句,但同样的问题仍然存在:

WITH RCTE AS
(
SELECT  ParentChargeId, s.ChargeID, 1 AS Lvl, ISNULL(TotalAmount, 0) as TotalAmount,  ISNULL(s.TaxAmount, 0) as TaxAmount,  
ISNULL(s.DiscountAmount, 0) as DiscountAmount, s.CustomerID, c.ChargeID as MasterChargeID
from tblCharge c inner join tblChargeShare s
on c.ChargeID = s.ChargeID Where s.ChargeShareStatusID < 3 and ParentChargeID is NULL

UNION ALL

SELECT c.ParentChargeID, c.ChargeID, Lvl+1 AS Lvl, ISNULL(s.TotalAmount, 0),  ISNULL(s.TaxAmount, 0),  ISNULL(s.DiscountAmount, 0) , s.CustomerID 
, rc.MasterChargeID 
from tblCharge c inner join tblChargeShare s
on c.ChargeID = s.ChargeID
INNER JOIN RCTE rc ON c.PArentChargeID = rc.ChargeID and s.CustomerID = rc.CustomerID  Where s.ChargeShareStatusID < 3 
)

Select MasterChargeID as ChargeID, rcte.CustomerID, Sum(rcte.TotalAmount) as TotalCharged, Sum(rcte.TaxAmount) as TotalTax, Sum(rcte.DiscountAmount) as TotalDiscount
from RCTE inner join tblChargeShare s on rcte.ChargeID = s.ChargeID and RCTE.CustomerID = s.CustomerID 
Where InvoiceID = 1045854
Group by MasterChargeID, rcte.CustomerID 
GO

编辑: 更多的游戏,我只是不明白这一点。

此查询是即时的(2毫秒):

Select t.* from
vwChargeShareSubCharges t
Where  t.MasterChargeID = 1291094

这需要3分钟:

DECLARE @ChargeID int = 1291094

Select t.* from
vwChargeShareSubCharges t
Where  t.MasterChargeID = @ChargeID

即使我把大量数字放在&#34; In&#34;中,查询仍然是即时的:

Where  t.MasterChargeID in (1291090, 1291091, 1291092, 1291093,  1291094, 1291095, 1291096, 1291097, 1291098, 1291099, 129109)

编辑2:

我可以使用此示例数据从头开始复制:

我创建了一些虚拟数据来复制问题。它不是那么重要,因为我只添加了100,000行,但仍然发生了错误的执行计划(在SQLCMD模式下运行):

CREATE TABLE [tblChargeTest](
[ChargeID] [int] IDENTITY(1,1) NOT NULL,
[ParentChargeID] [int] NULL,
[TotalAmount] [money] NULL,
[TaxAmount] [money] NULL,
[DiscountAmount] [money] NULL,
[InvoiceID] [int] NULL,
CONSTRAINT [PK_tblChargeTest] PRIMARY KEY CLUSTERED 
(
[ChargeID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,     ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
END
GO

Insert into tblChargeTest
(discountAmount, TotalAmount, TaxAmount)
Select ABS(CHECKSUM(NewId())) % 10, ABS(CHECKSUM(NewId())) % 100, ABS(CHECKSUM(NewId())) % 10
GO 100000

Update tblChargeTest
Set ParentChargeID = (ABS(CHECKSUM(NewId())) % 60000) + 20000
Where ChargeID = (ABS(CHECKSUM(NewId())) % 20000)
GO 5000

CREATE VIEW [vwChargeShareSubCharges] AS
WITH RCTE AS
(
SELECT  ParentChargeId, ChargeID, 1 AS Lvl, ISNULL(TotalAmount, 0) as TotalAmount,   ISNULL(TaxAmount, 0) as TaxAmount,  
ISNULL(DiscountAmount, 0) as DiscountAmount,  ChargeID as MasterChargeID
FROM tblChargeTest Where ParentChargeID is NULL

UNION ALL

SELECT rh.ParentChargeID, rh.ChargeID, Lvl+1 AS Lvl, ISNULL(rh.TotalAmount, 0),    ISNULL(rh.TaxAmount, 0),  ISNULL(rh.DiscountAmount, 0) 
, rc.MasterChargeID 
FROM tblChargeTest rh
INNER JOIN RCTE rc ON rh.PArentChargeID = rc.ChargeID --and rh.CustomerID =  rc.CustomerID 
)

Select MasterChargeID,  ParentChargeID, ChargeID, TotalAmount, TaxAmount, DiscountAmount , Lvl
FROM  RCTE r 
GO

然后运行这两个查询:

--Slow Query:
Declare @ChargeID int = 60900

Select *
from [vwChargeShareSubCharges]
Where MasterChargeID = @ChargeID

--Fast Query:
Select *
from [vwChargeShareSubCharges]
Where MasterChargeID = 60900

2 个答案:

答案 0 :(得分:14)

最好的SQL Server可以为您做的是将ChargeID上的过滤器向下推入视图内递归CTE的锚点部分。这允许搜索找到构建层次结构所需的唯一行。当您将参数作为常量值提供时,SQL Server可以对那些对此类事物感兴趣的人使用名为SelOnIterator的规则进行优化:

Pushed predicate with a constant value

使用局部变量时,它无法执行此操作,因此ChargeID上的谓词会卡在视图之外(从所有NULL ID开始构建完整层次结构):

Stuck Predicate

使用变量时获得最佳计划的一种方法是强制优化器在每次执行时编译新计划。然后,在执行时根据变量中的特定值定制生成的计划。这可以通过添加OPTION (RECOMPILE)查询提示来实现:

Declare @ChargeID int = 60900;

-- Produces a fast execution plan, at the cost of a compile on every execution
Select *
from [vwChargeShareSubCharges]
Where MasterChargeID = @ChargeID
OPTION (RECOMPILE);

第二个选项是将视图更改为内联表函数。这允许您明确指定过滤谓词的位置:

CREATE FUNCTION [dbo].[udfChargeShareSubCharges]
(
    @ChargeID int
)
RETURNS TABLE AS RETURN
(
  WITH RCTE AS
  (
  SELECT  ParentChargeID, ChargeID, 1 AS Lvl, ISNULL(TotalAmount, 0) as TotalAmount,   ISNULL(TaxAmount, 0) as TaxAmount,  
  ISNULL(DiscountAmount, 0) as DiscountAmount,  ChargeID as MasterChargeID
  FROM tblChargeTest 
  Where ParentChargeID is NULL 
  AND ChargeID = @ChargeID -- Filter placed here explicitly

  UNION ALL

  SELECT rh.ParentChargeID, rh.ChargeID, Lvl+1 AS Lvl, ISNULL(rh.TotalAmount, 0),    ISNULL(rh.TaxAmount, 0),  ISNULL(rh.DiscountAmount, 0) 
  , rc.MasterChargeID 
  FROM tblChargeTest rh
  INNER JOIN RCTE rc ON rh.ParentChargeID = rc.ChargeID --and rh.CustomerID =  rc.CustomerID 
  )

  Select MasterChargeID,  ParentChargeID, ChargeID, TotalAmount, TaxAmount, DiscountAmount , Lvl
  FROM  RCTE r 
)

像这样使用:

Declare @ChargeID int = 60900

select *
from dbo.udfChargeShareSubCharges(@ChargeID)

查询也可以从ParentChargeID上的索引中受益。

create index ix_ParentChargeID on tblChargeTest(ParentChargeID)

以下是类似场景中类似优化规则的另一个答案。 Optimizing Execution Plans for Parameterized T-SQL Queries Containing Window Functions

答案 1 :(得分:6)

首先,感谢您如此好地布置帖子。从你对你的问题的解释中学到了很多。

接下来要找到一个解决方案,我建议将SELECT INTO CTE放入e temp表并从那里加入。从加入CTE的个人经验来看,我的查询返回了5分钟,而只需将CTE生成的数据插入到临时表中,将其降低到4秒。我实际上是在一起加入两个CTE,但我想当CTE连接到LONG表(特别是外连接)时,这将适用于所有长时间运行的查询。


    --temp tables if needed to work with intermediate values
    If object_id('tempdb..#p') is not null
    drop table #p

    ;WITH cte as ( 
    select * from t1
    )

    select * 
    into #p
    from cte

    --then use the temp table as you would normally use the CTE
    select * from #p