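How to eliminate expensive subselects by using joins

Date: 2016-12-28 09:28:02

Tags: sql oracle

Can anyone help me optimize this query? It has expensive subselects, and each of them contains a join.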

select distinct a.id1 
from xyz1 a, xyz2 b
where a.abc1 = 'ABC' 
and a.id1 = b.id1 
and b.abc2 = 10
and not exists(
  Select 1 
  from xyz3 c, xyz4 d
  where a.id1=c.id1
  and c.abc3 > 3
  and c.abc4 <> 7
  and c.id2 = d.id2
  and d.id3 in (1,11,111,2,22)
)
and not exists (
  Select 1 from xyz3 c ,xyz5 e
  where a.id1=c.id1
  and c.abc3 > 3
  and c.abc4 <> 7
  And c.id3=e.id4
  and e.id5 in (1,11,111,2,22)
);

I thought I could rewrite the query as follows, but it returns different rows:

select distinct a.id1 
from xyz1 a, xyz2 b, xyz3 c,xyz4 d, xyz5 e
where a.abc1 = 'ABC' 
and a.id1 = b.id1 
and a.id1 = c.id1
and c.id3 = d.id2
and c.id3 = e.id4
and b.abc2 = 10
and not (
  c.abc3 > 3
  and c.abc4 <> 7
  and d.id3 in (1,11,111,2,22)
)
and not (
  c.abc3 > 3
  and c.abc4 <> 7
  and e.id5 in (1,11,111,2,22)
);

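What I also want to know is whether it is possible to pull the joins out of the subselects, add them to the joins of the main query, wrap the where conditions in a 'not', and so eliminate the 'not exists' subselects.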

5 answers:

Answer 0 (score: 0)

This might be faster:

select a.id1 
from xyz1 a, xyz2 b
where a.abc1 = 'ABC' 
and a.id1 = b.id1 
and b.abc2 = 10
minus
select c.id1
from xyz3 c, xyz4 d
where c.abc3 > 3
and c.abc4 <> 7
and c.id2 = d.id2
and d.id3 in (1,11,111,2,22)
minus
select c.id1
from xyz3 c, xyz5 e
where c.abc3 > 3
and c.abc4 <> 7
And c.id3=e.id4
and e.id5 in (1,11,111,2,22)
;
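
MINUS is a set operator, so it already returns each id1 only once, which is presumably why the DISTINCT could be dropped here. A minimal sketch on made-up literal ids (the values are hypothetical and are selected from dual purely for illustration):

```sql
-- Hypothetical ids: 1 appears twice in the first set.
select id1
from (
  select 1 as id1 from dual union all
  select 1 from dual union all
  select 2 from dual union all
  select 3 from dual
)
minus
select 2 from dual   -- stands in for the xyz4 exclusion set
minus
select 3 from dual;  -- stands in for the xyz5 exclusion set
```

With these values only id 1 is left, and it is returned once.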

Answer 1 (score: 0)

First: let's rewrite the query with explicit JOIN syntax instead of the old comma joins

SELECT DISTINCT
  a.id1
FROM
  xyz1 a
  INNER JOIN xyz2 b ON a.id1 = b.id1
  LEFT JOIN (
    SELECT c.id1 AS cid1
    FROM xyz3 c
      INNER JOIN xyz4 d ON c.id2 = d.id2
    WHERE c.abc3 > 3 AND c.abc4 <> 7 AND d.id3 IN (1,11,111,2,22)
  ) x ON a.id1 = x.cid1 
  LEFT JOIN (
    SELECT c.id1 AS cid1
    FROM xyz3 c
      INNER JOIN xyz5 e ON c.id3 = e.id4
    WHERE c.abc3 > 3 AND c.abc4 <> 7 AND e.id5 IN (1,11,111,2,22)
  ) y ON a.id1 = y.cid1
WHERE
  x.cid1 IS NULL AND y.cid1 IS NULL AND
  a.abc1 = 'ABC' AND b.abc2 = 10;

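This gets rid of the old comma-join syntax. It will not improve performance; Oracle will most likely transform your query into something very close to this one during processing anyway.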
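
The LEFT JOIN ... IS NULL pattern is an anti-join, and it is not the same thing as an inner join with a negated condition, which is why the flattened rewrite in the question returns different rows. A small self-contained sketch, using hypothetical inline data (the CTE names a and c and the single column abc3 are simplifications, not the real tables):

```sql
-- Hypothetical data: id 1 has one "bad" detail row and one harmless one,
-- id 2 has no detail rows at all.
WITH a AS (
  SELECT 1 AS id1 FROM dual UNION ALL
  SELECT 2 FROM dual
),
c AS (
  SELECT 1 AS id1, 5 AS abc3 FROM dual UNION ALL  -- bad: abc3 > 3
  SELECT 1, 0 FROM dual                           -- harmless
)
-- Anti-join: keep a-rows for which no bad c-row was found.
SELECT 'anti-join' AS variant, a.id1
FROM a
  LEFT JOIN (SELECT id1 FROM c WHERE abc3 > 3) x ON a.id1 = x.id1
WHERE x.id1 IS NULL
UNION ALL
-- Inner join with NOT: tests each joined row separately.
SELECT DISTINCT 'inner join + NOT', a.id1
FROM a
  INNER JOIN c ON a.id1 = c.id1
WHERE NOT (c.abc3 > 3);
```

The anti-join returns id 2 (no bad row exists for it). The inner join returns id 1 instead: its harmless row passes the NOT test even though a bad row also exists, and id 2 is lost because it has no detail row to join to.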


Second: an attempt at optimization

WITH excluder AS (
  SELECT
    c.id1 AS cid1
  FROM xyz3 c
    LEFT JOIN xyz4 d ON c.id2 = d.id2 AND c.abc3 > 3 AND c.abc4 <> 7 AND d.id3 IN (1,11,111,2,22)
    LEFT JOIN xyz5 e ON c.id3 = e.id4 AND c.abc3 > 3 AND c.abc4 <> 7 AND e.id5 IN (1,11,111,2,22)
  WHERE
    d.id2 IS NULL OR e.id4 IS NULL
)
SELECT DISTINCT
  a.id1
FROM
  xyz1 a
  INNER JOIN excluder z ON a.id1 = z.cid1
  INNER JOIN xyz2 b ON a.id1 = b.id1
WHERE a.abc1 = 'ABC' AND b.abc2 = 10;
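  1. We join both d and e to c (not all rows, only the ones matching the where clause) so that all unwanted results can be excluded. Let's call this new result set z.
  2. Then we join z to a, keeping only the results that were not excluded.
  3. I'm not 100% sure it produces the correct output, because I cannot test it myself, but it generally should. Give it a try.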
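
One thing to check when testing: the inner join to excluder keeps only ids that have at least one xyz3 row, whereas the original NOT EXISTS also keeps ids with no xyz3 rows at all. A possible variant, offered as an untested sketch rather than a drop-in replacement, lets the CTE collect the ids to exclude (still reading xyz3 once) and then anti-joins it:

```sql
WITH excluder AS (
  -- ids that must NOT appear in the result
  SELECT c.id1 AS cid1
  FROM xyz3 c
    LEFT JOIN xyz4 d ON c.id2 = d.id2 AND d.id3 IN (1,11,111,2,22)
    LEFT JOIN xyz5 e ON c.id3 = e.id4 AND e.id5 IN (1,11,111,2,22)
  WHERE c.abc3 > 3 AND c.abc4 <> 7
    AND (d.id2 IS NOT NULL OR e.id4 IS NOT NULL)  -- a match in d or in e
)
SELECT DISTINCT
  a.id1
FROM
  xyz1 a
  INNER JOIN xyz2 b ON a.id1 = b.id1
  LEFT JOIN excluder z ON a.id1 = z.cid1
WHERE
  z.cid1 IS NULL AND
  a.abc1 = 'ABC' AND b.abc2 = 10;
```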

Answer 2 (score: 0)

Before rewriting the query completely, first try the effect of changing the subquery type:

select distinct a.id1 
from xyz1 a, xyz2 b
where a.abc1 = 'ABC' 
and a.id1 = b.id1 
and b.abc2 = 10
and a.id1 not in (
  Select c.id1 
  from xyz3 c, xyz4 d
  where c.abc3 > 3
  and c.abc4 <> 7
  and c.id2 = d.id2
  and d.id3 in (1,11,111,2,22)
)
and a.id1 not in (
  Select c.id1 from xyz3 c ,xyz5 e
  where c.abc3 > 3
  and c.abc4 <> 7
  And c.id3=e.id4
  and e.id5 in (1,11,111,2,22)
);
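
One difference to keep in mind: NOT IN and NOT EXISTS are only interchangeable if the subquery can never return a NULL. If xyz3.id1 is nullable (an assumption, since the table definitions are not shown), a single NULL in the subquery result means the NOT IN condition is never true, so the query returns no rows. A minimal sketch of a guard, shown for the first subquery only:

```sql
select distinct a.id1
from xyz1 a, xyz2 b
where a.abc1 = 'ABC'
and a.id1 = b.id1
and b.abc2 = 10
and a.id1 not in (
  select c.id1
  from xyz3 c, xyz4 d
  where c.id1 is not null  -- guard: keep NULLs out of the NOT IN list
  and c.abc3 > 3
  and c.abc4 <> 7
  and c.id2 = d.id2
  and d.id3 in (1,11,111,2,22)
);
```

The second subquery would need the same guard.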

Now you can also try merging the two subqueries into one:

select distinct a.id1 
from xyz1 a, xyz2 b
where a.abc1 = 'ABC' 
and a.id1 = b.id1 
and b.abc2 = 10
and a.id1 not in (
  Select c.id1 
  from xyz3 c, xyz4 d
  where c.abc3 > 3
  and c.abc4 <> 7
  and c.id2 = d.id2
  and d.id3 in (1,11,111,2,22)
  union
  Select c.id1 from xyz3 c ,xyz5 e
  where c.abc3 > 3
  and c.abc4 <> 7
  And c.id3=e.id4
  and e.id5 in (1,11,111,2,22)
);

This may be faster or slower; only your own tests can tell.
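
One way to start comparing the variants is to look at the execution plan Oracle picks for each of them. A sketch using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY; the statement below is a shortened stand-in, so substitute the full query you want to examine:

```sql
explain plan for
select distinct a.id1
from xyz1 a, xyz2 b
where a.abc1 = 'ABC'
and a.id1 = b.id1
and b.abc2 = 10;

-- show the plan that was just explained
select * from table(dbms_xplan.display);
```

The plan only shows the optimizer's estimates, so it complements timing the queries on real data rather than replacing it.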

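Answer 3 (score: 0)

In this case you "read" the xyz3 table only once. Because of the GROUP BYs, the CBO will use HASH rather than SORT to get the distinct values, which can be faster than DISTINCT:
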
-- Filter xyz3 once and keep the join keys the subqueries below need.
WITH xyz3_t as (SELECT /*+ materialize */ 
                       c.id1
                      ,c.id2
                      ,c.id3
                  FROM xyz3 c
                 WHERE 1 = 1
                   AND c.abc3 > 3
                   AND c.abc4 <> 7
                 GROUP BY c.id1, c.id2, c.id3
                )

SELECT a.id1
    FROM xyz1 a
 INNER JOIN xyz2 b
    ON a.id1 = b.id1
  -- ids excluded through xyz4 (abc3/abc4 are already filtered in xyz3_t)
  LEFT OUTER JOIN (SELECT c.id1
                     FROM xyz3_t c
                         ,xyz4 d
                    WHERE 1 = 1
                      AND c.id2 = d.id2
                      AND d.id3 IN (1, 11, 111, 2, 22)
                    GROUP BY c.id1
                  )   c
    ON a.id1 = c.id1

  -- ids excluded through xyz5
  LEFT OUTER JOIN (SELECT c.id1
                     FROM xyz3_t c
                         ,xyz5 e
                    WHERE 1 = 1
                      AND c.id3 = e.id4
                      AND e.id5 IN (1, 11, 111, 2, 22)
                    GROUP BY c.id1
                  )  e
    ON a.id1 = e.id1

 WHERE a.abc1 = 'ABC'
     AND b.abc2 = 10
     AND c.id1 IS NULL
     AND e.id1 IS NULL
GROUP BY a.id1;

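Answer 4 (score: 0)

Without knowing your data, you are asking us to guess. So, here is my guess. It is not necessarily better than the others (for example, I like @Martin Schapendonk's MINUS approach).

Anyway, this approach does a few things:

1) Replaces the JOIN to xyz2 with an EXISTS, and then gets rid of the DISTINCT. I am assuming that a.id1 is unique and that the DISTINCT is only there to remove the duplicates introduced by the join to xyz2. By using EXISTS, you avoid executing the NOT EXISTS subqueries later on a bunch of duplicate rows.

2) Merges the two NOT EXISTS subqueries into one.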

SELECT a.id1
FROM   xyz1 a
WHERE  a.abc1 = 'ABC'
AND    EXISTS
         (SELECT 'b record'
          FROM   xyz2 b
          WHERE  b.id1 = a.id1
          AND    b.abc2 = 10)
AND    NOT EXISTS
         (SELECT 1
          FROM   xyz3 c
          WHERE  a.id1 = c.id1
          AND    c.abc3 > 3
          AND    c.abc4 <> 7
          AND    (EXISTS
                    (SELECT 'd record'
                     FROM   xyz4 d
                     WHERE  d.id2 = c.id2
                     AND    d.id3 IN (1,11,111,2,22))
          OR      EXISTS
                    (SELECT 'e record'
                     FROM   xyz5 e
                     WHERE  e.id4 = c.id3
                     AND    e.id5 IN (1,11,111,2,22))))