Query optimization: JOINs in a subquery

Time: 2017-01-23 21:20:51

Tags: sql performance join subquery query-optimization

Given the following query:

SELECT [CustomerID], [CustomerName],[CustomerAddress],[CustomerPhone]
FROM Customers
WHERE CustomerID IN
    (SELECT CustomerID FROM Buys b 
    JOIN CarService cs ON cs.SoldCarNum=b.SoldCarNum        
    JOIN GarageWorkers gw ON cs.WorkerNum=gw.WorkerNum 
    WHERE YEAR(cs.ServiceDate) BETWEEN 2015 AND 2016
    GROUP BY CustomerID
    HAVING COUNT(DISTINCT gw.GarageID)>= 7)

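Is there a way to make it more efficient? I don't like the JOINs, but I don't know how to get rid of them.

Edit

I am using Microsoft SQL Server.

1 answer:

Answer 0 (score: 1)

I am guessing this is your query: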

SELECT [CustomerID], [CustomerName], [CustomerAddress], [CustomerPhone]
FROM Customers c
WHERE c.CustomerID IN (SELECT b.CustomerID
                       FROM Buys b JOIN
                            CarService cs
                            ON cs.SoldCarNum = b.SoldCarNum JOIN    
                            GarageWorkers gw
                            ON cs.WorkerNum = gw.WorkerNum 
                       WHERE YEAR(cs.ServiceDate) BETWEEN 2015 AND 2016
                       GROUP BY CustomerID
                       HAVING COUNT(DISTINCT gw.GarageID) >=  7
                      );

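This appears to return customers whose cars were serviced at seven or more distinct garages during 2015 and 2016. First, I would change the date comparison to use actual dates rather than YEAR(), so that an index on ServiceDate can be used. Wrapping the column in a function generally prevents an index seek on it. That would look like: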

SELECT [CustomerID], [CustomerName], [CustomerAddress], [CustomerPhone]
FROM Customers c
WHERE c.CustomerID IN (SELECT b.CustomerID
                       FROM Buys b JOIN
                            CarService cs
                            ON cs.SoldCarNum = b.SoldCarNum JOIN    
                            GarageWorkers gw
                            ON cs.WorkerNum = gw.WorkerNum 
                       WHERE cs.ServiceDate >= '2015-01-01' AND
                             cs.ServiceDate < '2017-01-01'
                       GROUP BY b.CustomerID
                       HAVING COUNT(DISTINCT gw.GarageID) >=  7
                      );
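The date filter above uses a half-open range (inclusive lower bound, exclusive upper bound). As a minimal sketch of why, assuming ServiceDate may carry a time component (the question does not state its type), the filter can be tried on its own. The variable names @StartDate and @EndDate are hypothetical:

```sql
-- Hypothetical variables; the boundaries match the rewritten query above.
DECLARE @StartDate date = '2015-01-01';
DECLARE @EndDate   date = '2017-01-01';

-- The bare column is compared against constants, so an index that
-- leads with ServiceDate can be used for a range seek.
SELECT cs.SoldCarNum, cs.WorkerNum, cs.ServiceDate
FROM CarService cs
WHERE cs.ServiceDate >= @StartDate
  AND cs.ServiceDate <  @EndDate;
```

An inclusive upper bound such as `<= '2016-12-31'` would miss any service recorded later on 31 December 2016 if the column stores a time of day, which is why the exclusive bound on 1 January 2017 is the safer form.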

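Next, you want indexes. I would suggest:

  • CarService(ServiceDate, SoldCarNum, WorkerNum)
  • Buys(SoldCarNum, CustomerID)
  • GarageWorkers(WorkerNum, GarageID)

These indexes "cover" the subquery, meaning they contain every column the subquery references, so it can be answered from the indexes alone without reading the base tables.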
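The suggested indexes could be created roughly as follows. This is a sketch rather than a tuned design: the index names are made up for illustration, and the statements assume the tables live in the default schema and that no equivalent indexes already exist.

```sql
-- Index names (IX_...) are hypothetical; the column lists come from the list above.

-- Supports the ServiceDate range filter, then supplies both join columns.
CREATE INDEX IX_CarService_ServiceDate_SoldCarNum_WorkerNum
    ON CarService (ServiceDate, SoldCarNum, WorkerNum);

-- Supports the join on SoldCarNum and supplies CustomerID for the GROUP BY.
CREATE INDEX IX_Buys_SoldCarNum_CustomerID
    ON Buys (SoldCarNum, CustomerID);

-- Supports the join on WorkerNum and supplies GarageID for COUNT(DISTINCT ...).
CREATE INDEX IX_GarageWorkers_WorkerNum_GarageID
    ON GarageWorkers (WorkerNum, GarageID);
```

Comparing the actual execution plan before and after creating them is the most reliable way to confirm that the optimizer uses them.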

You do not mention the database. In some databases, replacing IN (<subquery>) with JOIN (<subquery>) often improves performance as well.
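A sketch of that JOIN form, assuming the same tables and columns as above (the derived-table alias q is arbitrary). Because the subquery groups by b.CustomerID, it returns each customer at most once, so the join does not duplicate rows from Customers:

```sql
SELECT c.CustomerID, c.CustomerName, c.CustomerAddress, c.CustomerPhone
FROM Customers c
JOIN (SELECT b.CustomerID
      FROM Buys b
      JOIN CarService cs
        ON cs.SoldCarNum = b.SoldCarNum
      JOIN GarageWorkers gw
        ON cs.WorkerNum = gw.WorkerNum
      WHERE cs.ServiceDate >= '2015-01-01'
        AND cs.ServiceDate <  '2017-01-01'
      GROUP BY b.CustomerID
      HAVING COUNT(DISTINCT gw.GarageID) >= 7
     ) q                              -- q: one row per qualifying customer
  ON q.CustomerID = c.CustomerID;
```

Whether this is faster than the IN version depends on the database and its optimizer. Some optimizers produce the same plan for both forms, so it is worth comparing the plans before switching.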