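I'm seeing a huge difference in execution time between two queries, and it looks like the order in which the joins (inner and left outer) appear in the query has a major impact. Are there any "ground rules" for the order in which joins should be written?
Both are part of a larger query. The only difference between them is that in the faster query the left join comes last.
Slow query (> 10 minutes):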
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
Faster query (~30 seconds):
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
Answer 0 (score: 9)
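Generally, INNER JOIN order does not matter, because inner joins are commutative and associative. In both cases you still have t0 inner join t4, so there should be no difference.
To restate: SQL is declarative. You say what you want, not how to get it. The optimizer works out the "how" and reorders JOINs as it sees fit, and in practice it also takes the WHERE clause and so on into account.
For complex queries, a cost-based query optimizer does not exhaust every permutation, so it can occasionally get things wrong.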
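To check whether the two join orders really produce different plans, one option is to compare I/O and timing statistics and, purely as a diagnostic, pin the written join order. The sketch below assumes SQL Server (suggested by the bracketed identifiers in the question) and reuses the question's table names; the column aliases [KeyC] and [KeyD] are made up for illustration.

```sql
-- Assumption: SQL Server. Print I/O and timing statistics for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Simplified form of the question's faster query (the CASE expressions are omitted).
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
       [t3].[Key] AS [KeyC],   -- hypothetical alias
       [t4].[Key] AS [KeyD]    -- hypothetical alias
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = [t4].[Ref]
LEFT OUTER JOIN [dbo].[tblC] AS [t3] ON [t0].[RefC] = [t3].[Ref]
-- FORCE ORDER makes the optimizer keep the join order as written.
-- Use it for diagnosis only; remove it once the comparison is done.
OPTION (FORCE ORDER);
```

If the query is fast with the hint and slow without it, the optimizer is probably choosing a poor order on its own, and statistics and indexes on the joined columns are worth checking next.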
So, I would check these:
See also some other SO questions:
Answer 1 (score: 1)
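If you have more than two tables, the order of the table joins matters, and it can make a big difference. The first table should get a leading hint. The first table should be the one with the most selective rows. For example, if you have a members table with 1,000,000 people and you only want the women, and it is the first table, then only 500,000 records are joined to the next table. If this table sits at the end of the join order (perhaps as table 4, 5, or 6), then every record (worst case 1,000,000) is joined. This applies to both inner and outer joins.
Rule: start with the most selective table, then join the next most selective table, and so on.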
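As a rough illustration of this rule, the filter can be applied before the join by putting it in a derived table. The `members` and `orders` tables and their columns are hypothetical, and many optimizers push such a predicate down on their own, so whether the rewrite changes the plan depends on the engine.

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE members (id INT PRIMARY KEY, gender CHAR(1));
CREATE TABLE orders (id INT PRIMARY KEY, member_id INT, order_date DATE);

SELECT f.id, o.order_date
FROM (
    -- Reduce the members table to the selective subset first.
    SELECT m.id
    FROM members AS m
    WHERE m.gender = 'F'
) AS f
INNER JOIN orders AS o ON o.member_id = f.id;
```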
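Conversion functions and cosmetic formatting should come last. Sometimes it is better to wrap the whole SQL in parentheses and apply the expressions and functions in an outer SELECT statement.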
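A sketch of that idea applied to the question's query: the joins run in an inner derived table, and the CONVERT and CASE expressions are applied only in the outer SELECT. This assumes SQL Server; the alias `q`, the column aliases `KeyB`, `KeyC`, and `KeyD`, and the parameter values are placeholders, and the optimizer may well produce the same plan as for the original.

```sql
-- Placeholders for the parameters that the question's query receives from its caller.
DECLARE @p0 NVarChar(250) = N'';
DECLARE @p1 NVarChar(250) = N'';

SELECT q.[Ref], q.[KeyB] AS [Key], q.[Name],
       (CASE
            WHEN q.[test] IS NULL THEN CONVERT(NVarChar(250), @p0)
            ELSE CONVERT(NVarChar(250), q.[KeyC])
        END) AS [value],
       (CASE
            WHEN 0 = 1 THEN CONVERT(NVarChar(250), @p1)
            ELSE CONVERT(NVarChar(250), q.[KeyD])
        END) AS [value2]
FROM (
    -- Inner query: joins only, no conversions.
    SELECT [t0].[Ref], [t1].[Key] AS [KeyB], [t1].[Name],
           [t3].[test], [t3].[Key] AS [KeyC], [t4].[Key] AS [KeyD]
    FROM [dbo].[tblA] AS [t0]
    INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
    INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = [t4].[Ref]
    LEFT OUTER JOIN (
        SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
        FROM [dbo].[tblC] AS [t2]
    ) AS [t3] ON [t0].[RefC] = [t3].[Ref]
) AS q;
```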
Answer 2 (score: 0)
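At least in SQLite, I have found that it makes a huge difference. In fact, the query does not need to be very complex to show the difference. My JOIN statements were, however, inside a nested clause.
Basically, you should start with the most specific restriction first, as Christian pointed out.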
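In SQLite specifically, the written order can be pinned with CROSS JOIN, which the SQLite query planner does not reorder, and EXPLAIN QUERY PLAN shows the order actually chosen. The tables below are hypothetical and only serve to make the sketch self-contained.

```sql
-- Hypothetical schema for illustration.
CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);

-- Plain join: SQLite is free to pick either table as the outer loop.
EXPLAIN QUERY PLAN
SELECT o.id
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
WHERE c.country = 'NZ';

-- CROSS JOIN: customers stays the outer loop and orders the inner loop.
EXPLAIN QUERY PLAN
SELECT o.id
FROM customers AS c
CROSS JOIN orders AS o
WHERE o.customer_id = c.id
  AND c.country = 'NZ';
```

Comparing the two plans shows whether the planner would have chosen the same order on its own.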
Answer 3 (score: 0)
With left joins, the order affects performance a lot. I ran into this with a SELECT query like the following:
select distinct count(p0_.id) over () as col_0_0_,
       p0_.id as col_1_0_,
       p0_.lp as col_2_0_,
       0 as col_3_0_,
       max(coalesce(i6_.cft, i7_.rfo, '')) as col_4_0_,
       p0_.pdv as col_5_0_,
       (s8_.qer) as col_6_0_,
       cf1_.ests as col_7_0_
from Produit p0_
left outer join CF cf1_ on p0_.fk_cf = cf1_.id
left outer join CA c2_ on cf1_.fk_ca = c2_.id
left outer join ml mt on c2_.fk_m = mt.id
left outer join sk s8_ on p0_.id = s8_.fk_p
left outer join rf r5_ on rp4_.fk_r = r5_.id
left outer join in i6_ on r5_.fk_ireftc = i6_.id
left outer join r_p_r rp4_ on p0_.id = rp4_.fk_p
left outer join ir i7_ on r5_.fk_if = i7_.id
left outer join re_p_g gc9_ on p0_.id = gc9_.fk_p
left outer join gc g10_ on gc9_.fk_g = g10_.id
where
and (p0_.lC is null or p0_.lS = 'E')
and g10_.id is null or g10_.id
and r5_.fk_i is null
group by col_1_0_, col_2_0_, col_3_0_, col_5_0_, col_6_0_, col_7_0_
order by col_2_0_ asc, p0_.id
limit 10;
That query takes 13 to 15 seconds to execute, whereas after I changed the join order it takes 1 to 2 seconds:
select distinct count(p0_.id) over () as col_0_0_,
       p0_.id as col_1_0_,
       p0_.lp as col_2_0_,
       0 as col_3_0_,
       max(coalesce(i6_.cft, i7_.rfo, '')) as col_4_0_,
       p0_.pdv as col_5_0_,
       (s8_.qer) as col_6_0_,
       cf1_.ests as col_7_0_
from Produit p0_
left outer join CF cf1_ on p0_.fk_cf = cf1_.id
left outer join sk s8_ on p0_.id = s8_.fk_p
left outer join r_p_r rp4_ on p0_.id = rp4_.fk_p
left outer join re_p_g gc9_ on p0_.id = gc9_.fk_p
left outer join CA c2_ on cf1_.fk_ca = c2_.id
left outer join ml mt on c2_.fk_m = mt.id
left outer join rf r5_ on rp4_.fk_r = r5_.id
left outer join in i6_ on r5_.fk_ireftc = i6_.id
left outer join ir i7_ on r5_.fk_if = i7_.id
left outer join gc g10_ on gc9_.fk_g = g10_.id
where
and (p0_.lC is null or p0_.lS = 'E')
and(g10_.id is null
and r5_.fk_i is null
group by col_1_0_, col_2_0_, col_3_0_, col_5_0_, col_6_0_, col_7_0_
order by col_2_0_ asc, p0_.id
limit 10;
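In my case, I changed the order so that once a table has been joined, all the joins that use that table follow immediately, rather than appearing in a separate block later. For example, for my p0_ table I do all of its left joins in the first four lines, unlike in the first query.
PS: To test performance in PostgreSQL, I use the following site: http://tatiyants.com/pev/#/plans/new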
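For reference, a plan in the JSON form that the site accepts can be produced with EXPLAIN. The sketch below assumes PostgreSQL and uses a cut-down version of the query above. It also shows `join_collapse_limit`, which defaults to 8: with more FROM items than that, the planner no longer flattens all explicit joins into one list to reorder, which is one possible (unconfirmed) reason the written order matters for an eleven-table query like this one.

```sql
-- Assumption: PostgreSQL. Show the current limit (the default is 8).
SHOW join_collapse_limit;

-- Optionally raise it for this session so the planner considers more join orders.
SET join_collapse_limit = 12;

-- ANALYZE executes the query; FORMAT JSON gives output that can be pasted into the visualizer.
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT p0_.id, cf1_.ests
FROM Produit p0_
LEFT OUTER JOIN CF cf1_ ON p0_.fk_cf = cf1_.id
LEFT OUTER JOIN sk s8_ ON p0_.id = s8_.fk_p
LIMIT 10;
```

Running the EXPLAIN once with the default limit and once with the raised limit shows whether the planner's restricted search is what makes the written order matter.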