Optimizing a repeated GROUP BY across multiple subqueries in INNER JOINs

Asked: 2013-10-25 09:46:51

Tags: sql sql-server join

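Right, so I have this unwieldy query that I need to optimize. I've trimmed it down a lot to make it more readable while still getting the point across.

I'm basically seeing the same GROUP BY logic in the top-level query and in all three subqueries, and those same columns are also the arguments of the INNER JOIN conditions. The problem is that I'm not sure how to optimize it, although I imagine there must be a simpler way to get the same result.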

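Tables involved: Invoice, InvoiceLine, ProductType

InvoiceLine is related to Invoice and to ProductType through foreign keys.

The query is supposed to compare the summed InvoiceLine.Clicks of a single invoice with the summed InvoiceLine.Clicks of all other invoices, grouped by ProductType.Name and InvoiceLine.Origin, and additionally split by Invoice.Final. So the result should look like this:

Product type | Origin | Clicks on reference invoice | Clicks on all other final invoices | Clicks on all other non-final invoices

Let me stress that the query does work, but it is far too slow. Any hints on optimizing this?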

DECLARE @startDate datetime;
DECLARE @endDate datetime;
DECLARE @refInvoiceGuid uniqueidentifier;

SET @startDate='2013-09-01 00:00:00';
SET @endDate='2013-09-30 23:59:59';
SET @refInvoiceGuid='34d03903-a2ad-49ae-bd72-e98b47cdbc52';

SELECT
    ProductType.Name,
    InvoiceLine.Origin,
    invRef.ClicksRef,
    invFinal.ClicksFinal,
    invNotFinal.ClicksNotFinal
FROM InvoiceLine
INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
INNER JOIN (
        SELECT
            ProductType.Name AS ProductName,
            InvoiceLine.Origin AS Origin,
            SUM(InvoiceLine.Clicks) AS ClicksRef
        FROM InvoiceLine
        INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
        INNER JOIN Invoice ON Invoice.Id = InvoiceLine.Invoice_Ref
        WHERE
            InvoiceLine.BillingDate >= @startDate
            AND InvoiceLine.BillingDate <= @endDate
            AND Invoice.Guid = @refInvoiceGuid
        GROUP BY
            ProductType.Name, InvoiceLine.Origin
    ) invRef ON ProductType.Name = invRef.ProductName AND InvoiceLine.Origin = invRef.Origin
INNER JOIN (
        SELECT 
            ProductType.Name AS ProductName,
            InvoiceLine.Origin AS Origin,
            SUM(InvoiceLine.Clicks) AS ClicksFinal
        FROM InvoiceLine
        INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
        INNER JOIN Invoice ON Invoice.Id=InvoiceLine.Invoice_Ref AND Invoice.Final=1
        WHERE
            InvoiceLine.BillingDate >= @startDate
            AND InvoiceLine.BillingDate <= @endDate
            AND Invoice.Guid != @refInvoiceGuid
        GROUP BY
            ProductType.Name, InvoiceLine.Origin
    ) invFinal ON ProductType.Name = invFinal.ProductName AND InvoiceLine.Origin = invFinal.Origin
INNER JOIN (
        SELECT 
            ProductType.Name AS ProductName,
            InvoiceLine.Origin AS Origin,
            SUM(InvoiceLine.Clicks) AS ClicksNotFinal
        FROM InvoiceLine
        INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
        INNER JOIN Invoice ON Invoice.Id=InvoiceLine.Invoice_Ref AND Invoice.Final=0
        WHERE
            InvoiceLine.BillingDate >= @startDate
            AND InvoiceLine.BillingDate <= @endDate
            AND Invoice.Guid != @refInvoiceGuid
        GROUP BY
            ProductType.Name, InvoiceLine.Origin
    ) invNotFinal ON ProductType.Name = invNotFinal.ProductName AND InvoiceLine.Origin = invNotFinal.Origin
WHERE
    InvoiceLine.BillingDate >= @startDate
    AND InvoiceLine.BillingDate <= @endDate
GROUP BY
    ProductType.Name,
    InvoiceLine.Origin,
    invRef.ClicksRef,
    invFinal.ClicksFinal,
    invNotFinal.ClicksNotFinal

Update 1

I added an index:

CREATE NONCLUSTERED INDEX [IX_ProductOrigin] ON InvoiceLine (Invoice_Ref,BillingDate) INCLUDE (Origin,Clicks,ProductType_Ref);

I've rewritten the query to be more compact (apparently this has the same performance):

SELECT
    ProductType.Name,
    InvoiceLine.Origin, 
    SUM(CASE WHEN Invoice.Guid = @refInvoiceGuid THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksRef,
    SUM(CASE WHEN Invoice.Guid <> @refInvoiceGuid AND Invoice.Final = 1 THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksFinal,
    SUM(CASE WHEN Invoice.Guid <> @refInvoiceGuid AND Invoice.Final = 0 THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksNotFinal
FROM InvoiceLine  
INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
INNER JOIN Invoice ON Invoice.id = InvoiceLine.Invoice_Ref
WHERE InvoiceLine.BillingDate >= @startDate AND InvoiceLine.BillingDate <= @endDate
GROUP BY ProductType.Name, InvoiceLine.Origin
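
To make the three buckets easy to verify by hand, here is a minimal sketch of the same conditional-aggregation pattern on toy data. The table variables, the single Origin value and all sample rows are made up for illustration (ProductType is left out for brevity), and the multi-row INSERT assumes SQL Server 2008 or later.

```sql
-- Hypothetical toy data: column names mirror the question, values are invented.
DECLARE @Invoice TABLE (Id int, Guid uniqueidentifier, Final bit);
DECLARE @InvoiceLine TABLE (Invoice_Ref int, Origin varchar(10), Clicks int);
DECLARE @ref uniqueidentifier;
SET @ref = NEWID();

-- Invoice 1 is the reference invoice; 2 is final, 3 is not final.
INSERT INTO @Invoice VALUES (1, @ref, 1), (2, NEWID(), 1), (3, NEWID(), 0);
INSERT INTO @InvoiceLine VALUES (1, 'web', 10), (2, 'web', 20), (2, 'web', 5), (3, 'web', 7);

-- One pass over the joined rows; each CASE routes Clicks into exactly one bucket.
SELECT
    il.Origin,
    SUM(CASE WHEN i.Guid = @ref THEN il.Clicks ELSE 0 END) AS ClicksRef,
    SUM(CASE WHEN i.Guid <> @ref AND i.Final = 1 THEN il.Clicks ELSE 0 END) AS ClicksFinal,
    SUM(CASE WHEN i.Guid <> @ref AND i.Final = 0 THEN il.Clicks ELSE 0 END) AS ClicksNotFinal
FROM @InvoiceLine il
INNER JOIN @Invoice i ON i.Id = il.Invoice_Ref
GROUP BY il.Origin;
```

With these rows the result should be a single row: web, 10, 25, 7. One behavioral difference from the three-subquery version is worth keeping in mind: the inner joins there drop any product type/origin pair that is missing from one of the three subqueries, while the CASE version keeps every pair in the date range and reports 0 for an empty bucket.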

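The rewrite seems to have gained some speed, and it is certainly more readable, but it still takes 3 minutes to execute. Here are the statistics and the execution plan:

[Image: Statistics and plan]

So if I'm reading the statistics correctly, it is simply the index seek that takes so long? Any ideas on how to improve this? Or have I just reached the point where there is too much data?

Here are some statistics on the index:

[Image: Index]

And on the table:

[Image: Table]

Result of running 'set statistics io on':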

(170 row(s) affected)
Table 'ProductType'. Scan count 0, logical reads 340, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'InvoiceLine'. Scan count 2741, logical reads 37444, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Invoice'. Scan count 1, logical reads 115, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Update 2

After reversing the order of the index key columns:

CREATE NONCLUSTERED INDEX [IX_ProductOrigin] ON InvoiceLine (BillingDate,Invoice_Ref) INCLUDE (Origin,Clicks,ProductType_Ref);

(170 row(s) affected)
Table 'ProductType'. Scan count 0, logical reads 340, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'InvoiceLine'. Scan count 1, logical reads 28371, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Invoice'. Scan count 1, logical reads 115, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

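Execution plan with the index seek statistics:

[Image: Execution plan 2]

This improved performance significantly (we went from 170 seconds to 15 seconds). Thanks for the help so far. Any other suggestions?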
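
The Scan count figures above (2741 with Invoice_Ref as the leading key, 1 with BillingDate leading) suggest that the first index was presumably probed once per invoice, while the second can answer the date range with a single range seek. A rough harness for comparing the two key orders side by side could look like the following. It is a single-table simplification of the real query; the temp table, the column types and the index names IX_InvoiceFirst and IX_DateFirst are assumptions, and the index hints are there only to force each variant for the comparison.

```sql
-- Hypothetical scratch table mirroring the InvoiceLine columns used in the question.
CREATE TABLE #InvoiceLine (
    Id int IDENTITY(1,1) PRIMARY KEY,
    Invoice_Ref int NOT NULL,
    ProductType_Ref int NOT NULL,
    BillingDate datetime NOT NULL,
    Origin varchar(50) NOT NULL,
    Clicks int NOT NULL
);

-- Same key columns in both orders, same covering INCLUDE list.
CREATE NONCLUSTERED INDEX IX_InvoiceFirst
    ON #InvoiceLine (Invoice_Ref, BillingDate) INCLUDE (Origin, Clicks, ProductType_Ref);
CREATE NONCLUSTERED INDEX IX_DateFirst
    ON #InvoiceLine (BillingDate, Invoice_Ref) INCLUDE (Origin, Clicks, ProductType_Ref);

-- Load representative data into #InvoiceLine here before measuring.

SET STATISTICS IO ON;

-- Variant 1: forced onto the index that leads with Invoice_Ref.
SELECT Origin, SUM(Clicks) AS Clicks
FROM #InvoiceLine WITH (INDEX(IX_InvoiceFirst))
WHERE BillingDate >= '2013-09-01T00:00:00' AND BillingDate <= '2013-09-30T23:59:59'
GROUP BY Origin;

-- Variant 2: forced onto the index that leads with BillingDate.
SELECT Origin, SUM(Clicks) AS Clicks
FROM #InvoiceLine WITH (INDEX(IX_DateFirst))
WHERE BillingDate >= '2013-09-01T00:00:00' AND BillingDate <= '2013-09-30T23:59:59'
GROUP BY Origin;

SET STATISTICS IO OFF;
DROP TABLE #InvoiceLine;
```

Comparing the logical reads reported for each variant on realistic data gives a quick indication of which key order suits the date-range filter better.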

1 answer:

Answer 0: (score: 0)

DECLARE @startDate datetime;
DECLARE @endDate datetime;
DECLARE @refInvoiceGuid uniqueidentifier;

SET @startDate='2013-09-01 00:00:00';
SET @endDate='2013-09-30 23:59:59';
SET @refInvoiceGuid='34d03903-a2ad-49ae-bd72-e98b47cdbc52';
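
-- Caveats when comparing with the question's queries:
--  * Final is read from InvoiceLine here (il2.Final, il3.Final), whereas the
--    question reads it from Invoice.Final.
--  * il2 and il3 join on the same Invoice_Ref as il1, so they stay within the
--    reference invoice instead of covering the other invoices, and the
--    self-joins can repeat il1 rows before the SUMs are taken.
SELECT
    p.Name,
    il1.Origin,
    SUM(il1.Clicks) AS ClicksRef,
    SUM(il2.Clicks) AS ClicksFinal,
    SUM(il3.Clicks) AS ClicksNotFinal
FROM Invoice i
INNER JOIN InvoiceLine il1 ON il1.Invoice_Ref = i.Id
    AND il1.BillingDate >= @startDate AND il1.BillingDate <= @endDate
    AND i.Guid = @refInvoiceGuid
LEFT JOIN InvoiceLine il2 ON il2.Invoice_Ref = il1.Invoice_Ref
    AND il2.Final = 1 AND il2.BillingDate = il1.BillingDate
LEFT JOIN InvoiceLine il3 ON il3.Invoice_Ref = il2.Invoice_Ref
    AND il3.Final = 0 AND il3.BillingDate = il1.BillingDate
INNER JOIN ProductType p ON il1.ProductType_Ref = p.Id
GROUP BY p.Name, il1.Origin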