We have two tables:
create table table_x(
x_id varchar2(100) primary key
);
create table table_y(
x_id varchar2(100) references table_x(x_id),
stream varchar2(10),
val_a number,
val_b number
);
create index table_y_idx on table_y (x_id, stream);
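Assume each table has millions of rows, and table_y contains 0 to 10 rows for each x_id.
The queries in the examples below return 200 rows for the filter substr(x_id, 2, 1) = 'B'.
The following query needs to be optimized: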
QUERY 1
select
x.x_id,
y.val_a,
y.val_b
from table_x x
left join (select
x_id,
max(val_a) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_a,
max(val_b) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_b
from table_y
group by x_id
) y on x.x_id = y.x_id
where substr(x.x_id, 2, 1) = 'B'; -- intentionally not use the primary key filter
------
PLAN 1
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 2400000 | 22698 | 00:04:33 |
| * 1 | HASH JOIN OUTER | | 10000 | 2400000 | 22698 | 00:04:33 |
| * 2 | TABLE ACCESS FULL | TABLE_X | 10000 | 120000 | 669 | 00:00:09 |
| 3 | VIEW | | 10692 | 2437776 | 22029 | 00:04:25 |
| 4 | SORT GROUP BY | | 10692 | 245916 | 22029 | 00:04:25 |
| 5 | TABLE ACCESS FULL | TABLE_Y | 1069200 | 24591600 | 19359 | 00:03:53 |
----------------------------------------------------------------------------------
* 1 - access("X"."X_ID"="Y"."X_ID"(+))
* 2 - filter(SUBSTR("X"."X_ID", 2, 1)='B')
There is one important optimization, thanks to which QUERY 2 returns rows 2-3 times faster than QUERY 1. The INLINE hint is critical: without it, QUERY 2 is as slow as QUERY 1.
QUERY 2
with
table_y_total as (
select --+ INLINE
x_id,
max(val_a) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_a,
max(val_b) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_b
from table_y
group by x_id
)
select
x.x_id,
(select val_a from table_y_total y where y.x_id = x.x_id) as val_a,
(select val_b from table_y_total y where y.x_id = x.x_id) as val_b
from table_x x
where substr(x.x_id, 2, 1) = 'B';
------
PLAN 2
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 120000 | 669 | 00:00:09 |
| 1 | SORT GROUP BY NOSORT | | 1 | 19 | 103 | 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID | TABLE_Y | 100 | 1900 | 103 | 00:00:02 |
| * 3 | INDEX RANGE SCAN | TABLE_Y_IDX | 100 | | 3 | 00:00:01 |
| 4 | SORT GROUP BY NOSORT | | 1 | 20 | 103 | 00:00:02 |
| 5 | TABLE ACCESS BY INDEX ROWID | TABLE_Y | 100 | 2000 | 103 | 00:00:02 |
| * 6 | INDEX RANGE SCAN | TABLE_Y_IDX | 100 | | 3 | 00:00:01 |
| * 7 | TABLE ACCESS FULL | TABLE_X | 10000 | 120000 | 669 | 00:00:09 |
-----------------------------------------------------------------------------------------
* 3 - access("X_ID"=:B1)
* 6 - access("X_ID"=:B1)
* 7 - filter(SUBSTR("X"."X_ID", 2, 1)='B')
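Since the first query has less code duplication, I would prefer to keep it.
Is there a hint or some other trick that keeps the code of QUERY 1 but produces PLAN 2?
Answer 0 (score: 0)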
Perhaps your code is oversimplified, but doesn't this do what you want?
select y.x_id,
max(y.val_a) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_a,
max(y.val_b) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_b
from table_y y
where substr(y.x_id, 2, 1) = 'B'
group by x_id;
I don't think the join to table_x is necessary, given the way you have phrased the question.
Answer 1 (score: 0)
Use an index hint:
select /*+ index(t index_name) */ * from table_name t
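As a sketch of how that hint might look against the tables from the question: Oracle's INDEX hint takes the table alias (or the table name when no alias is used) followed by the index name, and a hint that cannot be parsed or applied is silently ignored rather than raising an error. The bind variable :x_id is an assumption made purely for illustration.

```sql
-- Hypothetical single-key lookup; :x_id is an illustrative bind variable,
-- not something that appears in the question.
select /*+ index(y table_y_idx) */
       y.x_id,
       y.stream,
       y.val_a,
       y.val_b
from table_y y
where y.x_id = :x_id;
```

Note that the hint references the alias y, not the name table_y, because the query gives the table an alias.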
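Answer 2 (score: 0)
Since the full scan on table_x is the cheapest part of the plan, one approach is to filter table_x before joining to table_y. The optimizer still chooses a full scan on table_y by default, but hinting with index(y) brings the run time down to about 110% of QUERY 2's.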
with
table_x_filtered as (
select x_id
from table_x
where substr(x_id, 2, 1) = 'B'
)
select /*+ index(y table_y_idx) */
x.x_id,
max(val_a) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_a,
max(val_b) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_b
from table_x_filtered x
left join table_y y on y.x_id = x.x_id
group by x.x_id;
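To check whether a rewrite like this gets an index-driven plan similar to PLAN 2, the plan can be displayed with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. This is a minimal sketch, assuming the session can write to the default PLAN_TABLE. It shows the optimizer's estimated plan, not measured run time.

```sql
-- Store the estimated plan for the hinted query in the default PLAN_TABLE.
explain plan for
with
table_x_filtered as (
  select x_id
  from table_x
  where substr(x_id, 2, 1) = 'B'
)
select /*+ index(y table_y_idx) */
  x.x_id,
  max(val_a) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_a,
  max(val_b) KEEP (DENSE_RANK FIRST ORDER BY stream) as val_b
from table_x_filtered x
left join table_y y on y.x_id = x.x_id
group by x.x_id;

-- Display the plan that was just explained.
select * from table(dbms_xplan.display);
```

If the output still shows TABLE ACCESS FULL on TABLE_Y instead of an INDEX RANGE SCAN on TABLE_Y_IDX, the hint was not applied.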