Question

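My current problem is on 11g, but I am also interested in how this could be solved more cleverly in later versions.

I want to join two tables. Table A has 10 million rows; table B is huge, with a billion records spread across about a thousand partitions. One partition holds around 10 million records. I am not joining on the partition key. For most rows of table A, one or more rows will be found in table B. Example: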

select * from table_a a
inner join table_b b on a.ref = b.ref

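The above will return about 50 million rows, and the results come from about 30 partitions of table b. I assume a hash join is the right join here, that is, hashing table a and full-table-scanning / index-scanning table b.

So 970 partitions are scanned for no reason. And I have a third query that could tell Oracle which 30 partitions to check for the join. Example of the third query: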

select partition_id from table_c

This query returns exactly the 30 partitions needed by the query above.

My question:

In PL/SQL, this can be solved by

selecting the 30 partition_ids into a variable (perhaps simply with select listagg(partition_id,',') ... into v_partitions from table_c), and

executing my query like this:

execute immediate 'select * from table_a a 
inner join table_b b on a.ref = b.ref 
where b.partition_id in ('||v_partitions||')' into ...

Let's say this completes in 10 minutes.
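For reference, the two PL/SQL steps described above could be put together roughly as follows. This is only a sketch under assumptions that are not in the question: partition_id is taken to be numeric (so the list needs no quoting), the list is assumed to fit into 4000 bytes, and a count(*) stands in for whatever the real query fetches its rows into.

```sql
declare
  v_partitions varchar2(4000);  -- comma-separated partition ids (assumed numeric)
  v_rows       number;          -- stand-in for the real fetch target
begin
  -- Step 1: collect the partition ids from table_c into one string.
  -- In 11gR2, listagg requires the WITHIN GROUP clause.
  select listagg(partition_id, ',') within group (order by partition_id)
    into v_partitions
    from table_c;

  -- Step 2: build the statement with the ids as literals.
  -- Note: if table_c is empty, v_partitions is null and "in ()" is invalid SQL.
  execute immediate
    'select count(*)
       from table_a a
      inner join table_b b on a.ref = b.ref
      where b.partition_id in (' || v_partitions || ')'
    into v_rows;

  dbms_output.put_line('joined rows: ' || v_rows);
end;
/
```

Because the partition ids arrive as literals, the optimizer already knows at parse time which partitions are involved, which is presumably why this variant prunes reliably.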

Now, how can I do this in the same amount of time with pure SQL?

Simply writing

select * from table_a a
inner join table_b b on a.ref = b.ref 
where b.partition_id in (select partition_id from table_c)

does not seem to do the trick, or I may be aiming at the wrong plan.
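One way to see whether the subquery form prunes at all is to look at the Pstart and Pstop columns of the execution plan. The sketch below uses EXPLAIN PLAN and DBMS_XPLAN.DISPLAY with the default format. It assumes table_b is partitioned on partition_id, which the question implies but does not state.

```sql
-- Store the optimizer's plan for the pure-SQL variant in the plan table.
explain plan for
select * from table_a a
inner join table_b b on a.ref = b.ref
where b.partition_id in (select partition_id from table_c);

-- Display it; for partitioned objects the output includes Pstart/Pstop.
select * from table(dbms_xplan.display);
```

If Pstart/Pstop for table_b show the full range (1 to the last partition), no pruning is planned. Values such as KEY, KEY(SQ) or KEY(I) indicate that the partitions are determined at run time, from a subquery, or from an IN-list respectively.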

The plan I think I want is

hash join
    table a
    nested loop
       table c
       partition pruning here
           table b

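However, this does not come back in 10 minutes.

So, how do I do this in SQL, and what execution plan should I aim at?

One variation I have not tried yet, which might be the solution, is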

nested loop
   table c
   hash join
       table a
       partition pruning here (pushed predicate from the join to c)
            table b

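Another feeling I have is that the solution might lie in joining table a to table c (though I am not sure on what) and then joining the result to table b.

I am not asking you to type everything out for me. I am just looking for the general concept of how to do this in SQL (getting the partition restriction from a query): what plan should I aim at?

Thank you very much! Peter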

Answer 1

I am not an expert, but I think Oracle generally does the joins first and then applies the where conditions. So you might get the plan you want by moving the partition pruning up into a join condition:

select * from table_a a
inner join table_b b on a.ref = b.ref 
  and b.partition_id in (select partition_id from table_c);

I have also seen people try to do this sort of thing with an inline view.
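The inline-view variant mentioned in this answer is not spelled out, so the following is only a guess at what it might look like: table_b is restricted to the wanted partitions inside an inline view, and the result is then joined to table_a.

```sql
-- Sketch only: restrict table_b in an inline view, then join to table_a.
select *
  from table_a a
 inner join (select *
               from table_b
              where partition_id in (select partition_id from table_c)) b
    on a.ref = b.ref;
```

The optimizer is free to merge the inline view into the outer query, so this may well end up with the same plan as the join-condition form. It is worth comparing the two plans rather than assuming a difference.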

Answer 2

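Thank you all for discussing this topic with me. In my case this was solved (not by me) by adding a join path between table_c and table_a and by overloading the join conditions as shown below. In my case this was possible by adding a partition_id column to table_a: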

select * from
  table_c c
  JOIN table_a a ON (a.partition_id = c.partition_id)
  JOIN table_b b ON (b.partition_id = c.partition_id and b.partition_id = a.partition_id and b.ref = a.ref)

And these are the hints for the plan you want:

leading(c,b,a) use_nl(c,b) swap_join_inputs(a) use_hash(a)

So you get:

hash join
    table a
    nested loop
       table c
       partition list iterator
           table b
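To try this out, the hints can be placed in the query as an optimizer hint comment and the resulting plan checked with DBMS_XPLAN. The sketch below copies the hint text from this answer verbatim. Putting it directly after the select keyword and wrapping the query in EXPLAIN PLAN are my assumptions about how it was used.

```sql
-- Sketch: the answer's query with its hints, checked via EXPLAIN PLAN.
explain plan for
select /*+ leading(c,b,a) use_nl(c,b) swap_join_inputs(a) use_hash(a) */ *
  from table_c c
  join table_a a on (a.partition_id = c.partition_id)
  join table_b b on (b.partition_id = c.partition_id
                 and b.partition_id = a.partition_id
                 and b.ref = a.ref);

-- Look for PARTITION LIST ITERATOR above table_b, with KEY in Pstart/Pstop.
select * from table(dbms_xplan.display);
```

Hints are requests rather than guarantees, so the displayed plan should be compared against the shape shown above before relying on it.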

Hash join with partition restriction from a third table