SQL performance of LEFT OUTER JOIN vs NOT EXISTS

Time: 2011-07-21 14:43:03

Tags: sql sql-server

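If I want to find a set of entries that are in table A but not in table B, I can use either LEFT OUTER JOIN or NOT EXISTS. I've heard that SQL Server is geared towards ANSI, and that in some cases LEFT OUTER JOIN is far more efficient than NOT EXISTS. Will the ANSI JOIN perform better in this case? And are join operators more efficient than NOT EXISTS in general on SQL Server?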

6 answers:

Answer 0 (score: 59)

Joe's link is a good starting point. Quassnoi covers this too.

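In general, NOT EXISTS will perform better if your fields are properly indexed, or if you expect to filter out more records (i.e. a lot of rows exist in the subquery).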

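EXISTS and NOT EXISTS both short-circuit: as soon as a record matches the criteria it is either included or filtered out, and the optimizer moves on to the next record.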

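LEFT JOIN will join all records, regardless of whether they match or not, and then filter out all non-matching records. If your tables are large and/or you have multiple JOIN criteria, this can be very resource intensive.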
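To make the two formulations concrete, here is a minimal sketch of the same anti-join written both ways. The tables `a` and `b` and their columns are made up for illustration, and SQLite (via Python's `sqlite3`) is used only so the example runs anywhere; it shows that the two queries return the same rows, not how SQL Server would plan them.

```python
import sqlite3

# Hypothetical tables a and b; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER NOT NULL);
    CREATE INDEX ix_b_a_id ON b (a_id);
    INSERT INTO a (id) VALUES (1), (2), (3), (4);
    INSERT INTO b (a_id) VALUES (1), (1), (3);
""")

# Anti-join written as LEFT OUTER JOIN plus an IS NULL filter.
left_join_sql = """
    SELECT a.id
    FROM a
    LEFT OUTER JOIN b ON b.a_id = a.id
    WHERE b.a_id IS NULL
    ORDER BY a.id
"""

# The same anti-join written with NOT EXISTS.
not_exists_sql = """
    SELECT a.id
    FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)
    ORDER BY a.id
"""

left_join_ids = [row[0] for row in conn.execute(left_join_sql)]
not_exists_ids = [row[0] for row in conn.execute(not_exists_sql)]
print(left_join_ids, not_exists_ids)
```

Both lists contain the ids from `a` that have no row in `b`. Which form is faster on a given SQL Server instance is a question for the execution plan, not for this sketch.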

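I normally use NOT EXISTS and EXISTS wherever possible. For SQL Server, IN and NOT IN are semantically equivalent and may be easier to write, provided the compared columns cannot be NULL (NOT IN returns no rows if the subquery produces a NULL). These are among the only operators in SQL Server that are guaranteed to short-circuit.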
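One thing worth checking before swapping NOT IN for NOT EXISTS is how each treats NULLs in the subquery. The sketch below uses made-up tables `a` and `b` (with `b.a_id` deliberately nullable) and SQLite as a stand-in engine; the three-valued logic it demonstrates is standard SQL behaviour rather than anything SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER);              -- nullable on purpose
    INSERT INTO a (id) VALUES (1), (2), (3);
    INSERT INTO b (a_id) VALUES (1), (NULL);
""")

# NOT IN: "2 NOT IN (1, NULL)" evaluates to NULL, so the row is rejected.
not_in_ids = [r[0] for r in conn.execute(
    "SELECT id FROM a WHERE id NOT IN (SELECT a_id FROM b) ORDER BY id")]

# NOT EXISTS: the NULL row in b simply never matches, so it is ignored.
not_exists_ids = [r[0] for r in conn.execute(
    "SELECT id FROM a "
    "WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id) "
    "ORDER BY id")]

print(not_in_ids, not_exists_ids)
```

With a NULL in `b.a_id`, the NOT IN query returns nothing while NOT EXISTS still returns ids 2 and 3, so the two are only interchangeable when the subquery column is NOT NULL (or NULLs are filtered out explicitly).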

Answer 1 (score: 6)

The best discussion I've read on this topic for SQL Server is here.

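Answer 2 (score: 1)

Personally, I think this one gets a big old "it depends". I've seen cases where each method has outperformed the other.

Your best bet is to test both and see which performs better. If the tables will always be small and performance isn't that crucial, I would just go with whichever is clearest to you (that's usually NOT EXISTS for most people) and move on.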
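A rough sketch of what "test both" can look like: run each candidate against the same data, confirm they agree, and compare timings. Everything here (table names, row counts, the `candidates` dictionary) is made up for illustration, and SQLite timings say nothing about SQL Server; on SQL Server itself you would run the two queries against your real tables with `SET STATISTICS TIME ON` and `SET STATISTICS IO ON` and look at the actual execution plans.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER NOT NULL);
    CREATE INDEX ix_b_a_id ON b (a_id);
""")
# 20,000 rows in a; every even id also appears in b.
conn.executemany("INSERT INTO a (id) VALUES (?)", ((i,) for i in range(20000)))
conn.executemany("INSERT INTO b (a_id) VALUES (?)", ((i,) for i in range(0, 20000, 2)))

candidates = {
    "LEFT JOIN / IS NULL":
        "SELECT a.id FROM a LEFT JOIN b ON b.a_id = a.id WHERE b.a_id IS NULL",
    "NOT EXISTS":
        "SELECT a.id FROM a WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)",
}

results = {}
for name, sql in candidates.items():
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    results[name] = sorted(r[0] for r in rows)
    print(f"{name}: {len(rows)} rows in {elapsed:.4f}s")

# Only compare timings once the candidates are known to return the same rows.
same_rows = results["LEFT JOIN / IS NULL"] == results["NOT EXISTS"]
```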

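Answer 3 (score: 0)

This blog entry gives examples of various ways (NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT and NOT EXISTS) to achieve the same result, and demonstrates that NOT EXISTS (Left Anti Semi Join) is the best option in both cold-cache and warm-cache scenarios.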
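For readers who have not seen the EXCEPT form, here is a small sketch comparing it with NOT EXISTS. This is not taken from the linked blog entry; the tables `a` and `b` are hypothetical, and SQLite is used only to keep the example self-contained (OUTER APPLY is SQL Server syntax and is left out for that reason).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (a_id INTEGER NOT NULL);
    INSERT INTO a (id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO b (a_id) VALUES (2), (4), (4);
""")

# EXCEPT compares whole result rows and removes duplicates.
except_ids = sorted(r[0] for r in conn.execute(
    "SELECT id FROM a EXCEPT SELECT a_id FROM b"))

# NOT EXISTS expresses the same anti-join as a correlated subquery.
not_exists_ids = sorted(r[0] for r in conn.execute(
    "SELECT id FROM a WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)"))

print(except_ids, not_exists_ids)
```

The two agree here because `a.id` is unique and only the key column is selected. EXCEPT de-duplicates its output and compares every selected column, so it is not a drop-in replacement once the select list gets wider.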

Answer 4 (score: 0)

I've been wondering how we can use the index on the table we are deleting from in these cases that the OP describes.

Say we have:

 table EMPLOYEE (emp_id int, name varchar) 
and
 table EMPLOYEE_LOCATION (emp_id int, loc_id int)

In my real-world example the tables are much wider and contain more than 1 million rows; I have simplified the schema for the purpose of the example.

If I want to delete the rows from EMPLOYEE_LOCATION that don't have corresponding emp_id's in EMPLOYEE I can obviously use the Left outer technique or the NOT IN but I was wondering...

If both tables have indexes with leading column of emp_id then would it be worthwhile trying to use them?

Perhaps I could pull the emp_ids from EMPLOYEE and the emp_ids from EMPLOYEE_LOCATION into temp tables, and from those work out the emp_ids that I want to delete.

I could then cycle round these emp_id's and actually use the index like so:

loop for each emp_id X to delete -- (this would be a cursor)
 DELETE EMPLOYEE_LOCATION WHERE emp_id = X

I know there is overhead with the cursor but in my real example I am dealing with huge tables so I think explicitly using the index is desirable.
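For comparison with the cursor idea, here is a sketch of the set-based alternative: a single DELETE with NOT EXISTS against the EMPLOYEE and EMPLOYEE_LOCATION tables described above. The sample rows and the index name are made up, and SQLite is used only so the sketch runs as-is. The DELETE statement itself is plain SQL; whether the engine actually uses the emp_id indexes for it is up to the optimizer, so it is worth checking the execution plan on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE EMPLOYEE_LOCATION (emp_id INTEGER, loc_id INTEGER);
    CREATE INDEX ix_employee_location_emp_id ON EMPLOYEE_LOCATION (emp_id);
    INSERT INTO EMPLOYEE (emp_id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO EMPLOYEE_LOCATION (emp_id, loc_id)
        VALUES (1, 10), (2, 20), (3, 30), (3, 31), (4, 40);
""")

# One statement removes every location row whose emp_id has no EMPLOYEE row.
cursor = conn.execute("""
    DELETE FROM EMPLOYEE_LOCATION
    WHERE NOT EXISTS (
        SELECT 1 FROM EMPLOYEE e
        WHERE e.emp_id = EMPLOYEE_LOCATION.emp_id
    )
""")
deleted = cursor.rowcount

remaining = conn.execute(
    "SELECT emp_id, loc_id FROM EMPLOYEE_LOCATION ORDER BY emp_id, loc_id"
).fetchall()
print(deleted, remaining)
```

The orphaned rows for emp_id 3 and 4 are removed in one pass, with no temp tables and no per-row loop.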

Answer 5 (score: 0)

Answer on dba.stackexchange

One exception I've noticed to NOT EXISTS being superior (however marginally) to LEFT JOIN ... WHERE IS NULL is when using linked servers.

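From examining the execution plans, it appears that the NOT EXISTS operator is executed in a nested-loop fashion, so it is evaluated on a per-row basis (which I suppose makes sense).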

Example execution plan demonstrating this behaviour: [image: execution plan]