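I'm having trouble querying a table whose rows have a parent/child relationship. While writing a simplified example, I realized that the Stack Exchange schema is very similar.
So imagine I'm querying the Stack Overflow Posts table through the Stack Exchange Data Explorer. I'm trying to get a subset of posts together with all of their associated answers.
The subset of posts is defined in a view that has a fairly complex and expensive query plan. In the example below it is simplified to just selecting the first two rows.
First approach, using a union: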
with ExpensiveView as (select top 2 ID from Posts order by ID)
select Posts.*
from ExpensiveView
left outer join Posts
ON ExpensiveView.Id = Posts.Id
union all
select Posts.*
from ExpensiveView
left outer join Posts
ON ExpensiveView.Id = Posts.ParentId
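I would really like to avoid this, because ExpensiveView is evaluated twice. That is obviously not a problem for the simplified version above, but it causes trouble with the more complex real query.
Second approach, a single select with a conditional join clause: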
with ExpensiveView as (select top 2 ID from Posts order by ID)
select Posts.*
from ExpensiveView
left outer join Posts
ON ExpensiveView.Id = Posts.Id or ExpensiveView.Id = Posts.ParentId
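This avoids evaluating ExpensiveView twice, but it results in a very large clustered index scan. It appears to scan the entire index once for each ID in ExpensiveView (so 2 * 14977623 = ~30 million rows). This is slow.
Two questions:
Why does the conditional join in the second query cause such a large index scan?
Is there any way to get the results I want without ExpensiveView being evaluated more than once?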
Answer 0: (score: 0)
Try this:
with
ExpensiveView as (select top 2 ID from Posts order by ID),
CTE_Posts as (
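-- P.* rather than *: NP also exposes a column named Id, and a CTE cannot contain two columns with the same name
select P.*, NP.Id as New_Post_ID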
from Posts as P
outer apply (select P.Id union all select P.ParentId) as NP
)
select
P.*
from ExpensiveView as E
left outer join CTE_Posts as P on P.New_Post_ID = E.ID
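Another option worth trying, as a rough sketch rather than a tested solution: materialize the expensive result once into a temp table and run the two simple joins from the first approach against that. The temp table name #ExpensiveIds is made up for illustration, and this assumes your environment lets you create temp tables. Whether the second branch can seek instead of scan depends on an index on Posts.ParentId, which is also an assumption here.

```sql
-- Evaluate the expensive query exactly once and keep only the ids.
-- #ExpensiveIds is a hypothetical name, not part of the original question.
select top 2 Id
into #ExpensiveIds
from Posts
order by Id;

-- The selected posts themselves (equality join on the clustered key).
select P.*
from #ExpensiveIds as E
join Posts as P on P.Id = E.Id
union all
-- Plus the answers whose parent is one of the selected posts.
select P.*
from #ExpensiveIds as E
join Posts as P on P.ParentId = E.Id;

drop table #ExpensiveIds;
```

Note that this uses inner joins, so a selected post with no answers contributes no row from the second branch. With the left outer join from the question it would contribute a row of NULLs instead; switch back to left outer join if you rely on that.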