Question

What is the most efficient way to run a query that is wrapped inside another one?

1. A query inside a query:

select
  id_user
from table1
where
  foo in (select foo from table2 where dt_signin > '2014-01-01 00:00')

2. Using a temporary table:

create table tmp_table as
select
  foo
from table2
where
  dt_signin > '2014-01-01 00:00'

and then querying the temporary table:

select
  id_user
from table1 t1
join tmp_table tmp
 on t1.foo = tmp.foo

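With method 1, is the subquery in the where clause run once per row of table1 (that is, as many times as table1 has rows), or is it run only once and kept in memory for the later comparisons with foo?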

Answer 1

The exists version often outperforms the others:

select id_user
from table1
where exists (
    select 1
    from table2
    where
        dt_signin > '2014-01-01 00:00'
        and
        table1.foo = foo
)

Note that the outer table1.foo is compared with the foo of the subquery (table2.foo).
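As a rough check that the exists form selects the same rows as the in form from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real server, and the sample rows are invented for illustration. It only compares results; it says nothing about which form is faster on your system.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id_user integer, foo integer);
    create table table2 (foo integer, dt_signin text);
    insert into table1 values (1, 10), (2, 20), (3, 30);
    -- foo = 10 signs in twice after the cutoff; foo = 20 only before it
    insert into table2 values
        (10, '2014-02-01 00:00'),
        (10, '2014-03-01 00:00'),
        (20, '2013-12-31 00:00');
""")

# Method 1 from the question: uncorrelated subquery with in
in_rows = conn.execute("""
    select id_user
    from table1
    where foo in (select foo from table2 where dt_signin > '2014-01-01 00:00')
    order by id_user
""").fetchall()

# The exists form: correlated on table1.foo = table2.foo
exists_rows = conn.execute("""
    select id_user
    from table1
    where exists (
        select 1
        from table2
        where dt_signin > '2014-01-01 00:00'
          and table1.foo = table2.foo
    )
    order by id_user
""").fetchall()

print(in_rows, exists_rows)
```

Both queries return each matching table1 row once, even though foo = 10 appears twice in table2.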

Answer 2

Your two queries do not do the same thing. For the second method to be equivalent, you need select distinct:

select distinct id_user
from table1 t1 join
     tmp_table tmp
     on t1.foo = tmp.foo;

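Because of this extra operation, I would probably expect in to perform better. In general, though, when you have a specific performance question, you should test it against your own data on your own system.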
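The difference is easy to see on a small data set. The sketch below uses Python's sqlite3 module and invented rows (both are assumptions, not part of the original question) in which one foo value signs in twice after the cutoff. The helper ids is hypothetical and exists only to shorten the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id_user integer, foo integer);
    create table table2 (foo integer, dt_signin text);
    insert into table1 values (1, 10), (2, 20);
    -- the same foo appears twice after the cutoff date
    insert into table2 values
        (10, '2014-02-01 00:00'),
        (10, '2014-03-01 00:00');
    create table tmp_table as
        select foo from table2 where dt_signin > '2014-01-01 00:00';
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

with_in = ids("""
    select id_user from table1
    where foo in (select foo from table2 where dt_signin > '2014-01-01 00:00')
""")
plain_join = ids("""
    select id_user from table1 t1
    join tmp_table tmp on t1.foo = tmp.foo
""")
distinct_join = ids("""
    select distinct id_user from table1 t1
    join tmp_table tmp on t1.foo = tmp.foo
""")

print(with_in, plain_join, distinct_join)
```

The plain join returns user 1 twice, once per matching row in tmp_table, while in and the distinct join return it once. The distinct form matches in only as long as id_user is itself unique in table1, which this sketch assumes.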

As for the question within your question, there are many different ways to approach this. Here are some:

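Two tables with in
Two tables with join and distinct
Two tables with exists
Temporary table with in
Temporary table with join
Temporary table with exists
CTE with join
CTE with in
CTE with exists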

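In an ideal world, the SQL compiler would simply study the query and arrive at the best execution plan, regardless of how the query is expressed. That is not the world we live in, alas.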

One reason temporary tables are sometimes useful (although I prefer single-query solutions) is optimization:

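Statistics about the temporary table are known, so the optimizer can choose a better plan.
You can build indexes on the temporary table to improve performance.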

Your subquery is not very complicated, so these are probably not issues here.

Under different circumstances, different approaches might be better. By default, I would build an index on tmp_Table(dt_signin, foo) and use exists.
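A minimal sketch of the temporary-table route with an index and exists follows, again using Python's sqlite3 module as a stand-in for the real database. Several things here are assumptions: the sample rows are invented, the index name idx_tmp_foo is hypothetical, and because tmp_table as created in the question keeps only the foo column, the index here covers foo alone. The analyze statement is SQLite's way of collecting statistics; other systems have their own equivalent.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id_user integer, foo integer);
    create table table2 (foo integer, dt_signin text);
    insert into table1 values (1, 10), (2, 20), (3, 30);
    insert into table2 values
        (10, '2014-02-01 00:00'),
        (30, '2014-05-01 00:00'),
        (20, '2013-06-01 00:00');

    -- materialize the filtered rows, as in method 2 of the question
    create table tmp_table as
        select foo from table2 where dt_signin > '2014-01-01 00:00';

    -- hypothetical index name; tmp_table only has the foo column
    create index idx_tmp_foo on tmp_table (foo);

    -- gather statistics so the planner knows the table's contents
    analyze tmp_table;
""")

query = """
    select id_user
    from table1 t1
    where exists (select 1 from tmp_table tmp where tmp.foo = t1.foo)
    order by id_user
"""
rows = [r[0] for r in conn.execute(query)]

# The plan text differs between engines and versions, so it is only printed.
plan = conn.execute("explain query plan " + query).fetchall()
print(rows)
print(plan)
```

Whether the index actually gets used, and whether it pays for the cost of building the table, depends on the engine and the data, so inspect the plan on your own system.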

Answer 3

How about:

select
  id_user
from 
  table1 t1
  join (
    select foo from table2 where dt_signin > '2014-01-01 00:00'
  ) t2 on t1.foo = t2.foo

If you can use a join, there is no need to create a temporary table.
