查询性能WHERE子句包含IN(子查询)

时间:2011-05-11 20:39:34

标签: sql sqlperformance

SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade INNER JOIN
             TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  (TradeLine.Id IN
                      (SELECT     PairOffId
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)))
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate

当表开始增长时,我担心WHERE子句中IN的性能。有没有人对这种查询有更好的策略?子查询返回的记录数比TradeLine表中的记录数慢得多。 TradeLine表本身以10 /天的速度增长。

谢谢。

编辑: 我使用了将子查询从WHERE移动到FROM的想法。我投票支持了这个新查询的所有答案。

   SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,   
          PairOff.Notional / 1000 AS Expr1
   FROM         Trade INNER JOIN
                  TradeLine ON Trade.TradeId = TradeLine.TradeId INNER JOIN
                      (SELECT     PairOffId, SUM(Notional) AS Notional
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                   HAVING (SUM(Notional) <> 0)) AS PairOff ON TradeLine.Id = PairOff.PairOffId
   ORDER BY Trade.Type, Trade.TradeDate

5 个答案:

答案 0 :(得分:6)

IN子句中的子查询不依赖于外部查询中的任何内容。您可以安全地将其移入FROM子句;一个理智的查询计划构建器会自动执行它。

此外,在生产中使用的任何查询上调用EXPLAIN PLAN是必须的。这样做,看看DBMS对此查询计划的看法。

答案 1 :(得分:2)

当子查询开始返回过大的结果集时,我是临时表的粉丝。

所以你的where条款就是

Where TradeLine.Id In (Select PairOffId From #tempResults)

#tempResults将被定义为(警告:语法来自内存,这意味着可能存在错误)

Select PairOffId Into #tempResults
From TradeLine
Where (TradeDate <= @TradeDate) 
  //I prefer params in case the query becomes a StoredProc
Group By PairOffId
Having (Sum(Notional) <> 0)

答案 2 :(得分:1)

我有2个建议你可以尝试:

1)。使用Exists,因为您不需要从子查询中获取数据,如下所示:

  

存在(从TradeLine中选择1)   AS TradeLine_1                  TradeLine.Id = TradeLine_1.PairOffId                       - 继续你的子查询......)

2)。主查询加入您的子查询,例如

  

...加入(your_subquery)   your_subquery.PairOffId = TradeLine.Id

我相信这两种方式可以比“In”操作获得更好的性能。

答案 3 :(得分:1)

我在XXXXXX数据库中遇到了数十万条记录的同样问题。在我的代码中,我想从所有节点检索层次结构(包含至少一个子节点的节点)。

enter image description here

写入的初始查询非常慢。

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js">
</script>
<select class="form-control selectOptions">
  <option>Default Option 1</option>
  <option selected>Default Option 2</option>
  <option>Default Option 3</option>
</select>

然后重写为

  SELECT SUPPLIER_ID, PARENT_SUPPLIER_ID,
  FROM SUPPLIER
  WHERE 
    SUPPLIER_ID != PARENT_SUPPLIER_ID
    OR 
    SUPPLIER_ID   IN
      (SELECT DISTINCT PARENT_SUPPLIER_ID
       FROM SUPPLIER
       WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
      );

答案 4 :(得分:-1)

使用IN基本上会强制您进行表扫描。当表增长时,执行时间会增加。此外,您还为返回的每条记录运行该查询。将标量选择更容易用作表格:

SELECT t.TradeId, t.Type, t.Symbol, t.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade t,
(SELECT     TradeId, PairOffID
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)) tl       
WHERE  t.TradeId = tl.TradeId
  and  t.id <> tl.PairOffID
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate