SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,
SUM(TradeLine.Notional) / 1000 AS Expr1
FROM Trade INNER JOIN
TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE (TradeLine.Id IN
(SELECT PairOffId
FROM TradeLine AS TradeLine_1
WHERE (TradeDate <= '2011-05-11')
GROUP BY PairOffId
HAVING (SUM(Notional) <> 0)))
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate
当表开始增长时,我担心WHERE子句中IN的性能。有没有人对这种查询有更好的策略?子查询返回的记录数比TradeLine表中的记录数慢得多。 TradeLine表本身以10 /天的速度增长。
谢谢。
编辑: 我使用了将子查询从WHERE移动到FROM的想法。我投票支持了这个新查询的所有答案。
SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,
PairOff.Notional / 1000 AS Expr1
FROM Trade INNER JOIN
TradeLine ON Trade.TradeId = TradeLine.TradeId INNER JOIN
(SELECT PairOffId, SUM(Notional) AS Notional
FROM TradeLine AS TradeLine_1
WHERE (TradeDate <= '2011-05-11')
GROUP BY PairOffId
HAVING (SUM(Notional) <> 0)) AS PairOff ON TradeLine.Id = PairOff.PairOffId
ORDER BY Trade.Type, Trade.TradeDate
答案 0 :(得分:6)
IN
子句中的子查询不依赖于外部查询中的任何内容。您可以安全地将其移入FROM
子句;一个理智的查询计划构建器会自动执行它。
此外,在生产中使用的任何查询上调用EXPLAIN PLAN
是必须的。这样做,看看DBMS对此查询计划的看法。
答案 1 :(得分:2)
当子查询开始返回过大的结果集时,我是临时表的粉丝。
所以你的where
条款就是
Where TradeLine.Id In (Select PairOffId From #tempResults)
和#tempResults
将被定义为(警告:语法来自内存,这意味着可能存在错误)
Select PairOffId Into #tempResults
From TradeLine
Where (TradeDate <= @TradeDate)
//I prefer params in case the query becomes a StoredProc
Group By PairOffId
Having (Sum(Notional) <> 0)
答案 2 :(得分:1)
我有2个建议你可以尝试:
1)。使用Exists,因为您不需要从子查询中获取数据,如下所示:
存在(从TradeLine中选择1) AS TradeLine_1 TradeLine.Id = TradeLine_1.PairOffId - 继续你的子查询......)
2)。主查询加入您的子查询,例如
...加入(your_subquery) your_subquery.PairOffId = TradeLine.Id
我相信这两种方式可以比“In”操作获得更好的性能。
答案 3 :(得分:1)
我在XXXXXX数据库中遇到了数十万条记录的同样问题。在我的代码中,我想从所有节点检索层次结构(包含至少一个子节点的节点)。
写入的初始查询非常慢。
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js">
</script>
<select class="form-control selectOptions">
<option>Default Option 1</option>
<option selected>Default Option 2</option>
<option>Default Option 3</option>
</select>
然后重写为
SELECT SUPPLIER_ID, PARENT_SUPPLIER_ID,
FROM SUPPLIER
WHERE
SUPPLIER_ID != PARENT_SUPPLIER_ID
OR
SUPPLIER_ID IN
(SELECT DISTINCT PARENT_SUPPLIER_ID
FROM SUPPLIER
WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
);
答案 4 :(得分:-1)
使用IN基本上会强制您进行表扫描。当表增长时,执行时间会增加。此外,您还为返回的每条记录运行该查询。将标量选择更容易用作表格:
SELECT t.TradeId, t.Type, t.Symbol, t.TradeDate,
SUM(TradeLine.Notional) / 1000 AS Expr1
FROM Trade t,
(SELECT TradeId, PairOffID
FROM TradeLine AS TradeLine_1
WHERE (TradeDate <= '2011-05-11')
GROUP BY PairOffId
HAVING (SUM(Notional) <> 0)) tl
WHERE t.TradeId = tl.TradeId
and t.id <> tl.PairOffID
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate