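I have a simple query that runs forever. It has a date condition, and as soon as I remove that condition the query returns results. The column is a date field displayed in the format '31-MAR-15'. I don't understand why this condition makes the query so slow. Thanks in advance.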
SELECT
substr(a.id, 1, 2) AS country,
count(DISTINCT a.id) AS id_count,
sum(a.amount) AS amount
FROM table1 a
JOIN table2 b ON a.id = b.id
JOIN table3 c ON b.party_id = c.party_id
WHERE a.prod_type = 'INS'
AND c.acct_type = 'LON'
AND substr(a.id, 1, 2) = 'US'
AND a.dump_dt = '31-MAR-15'
AND substr(id, 4, 8) = '20150303'
GROUP BY substr(a.id, 1, 2);
Explain plan:
PLAN_TABLE_OUTPUT
Plan hash value: 255044277
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 121 | 125K (1)| 00:25:08 |
| 1 | HASH GROUP BY | | 1 | 121 | 125K (1)| 00:25:08 |
| 2 | VIEW | VW_DAG_0 | 1 | 121 | 125K (1)| 00:25:08 |
| 3 | HASH GROUP BY | | 1 | 98 | 125K (1)| 00:25:08 |
| 4 | NESTED LOOPS | | | | | |
| 5 | NESTED LOOPS | | 1 | 98 | 125K (1)| 00:25:08 |
| 6 | MERGE JOIN CARTESIAN | | 12613 | 800K| 21133 (2)| 00:04:14 |
|* 7 | TABLE ACCESS BY INDEX ROWID| TABLE1 | 1 | 45 | 46 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | DATA_DATE__STG_BACKUP2 | 1040 | | 6 (0)| 00:00:01 |
| 9 | BUFFER SORT | | 182K| 3564K| 21087 (2)| 00:04:14 |
|* 10 | TABLE ACCESS FULL | TABLE3 | 182K| 3564K| 21087 (2)| 00:04:14 |
|* 11 | INDEX RANGE SCAN | BSB_PARTYID_IDX | 22 | | 3 (0)| 00:00:01 |
|* 12 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 1 | 33 | 10 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - filter(SUBSTR(A.ID, 4, 8) = '20150303' AND SUBSTR(A.ID, 1, 2) = 'US'
AND A.PROD_TYPE = 'INS')
8 - access(A.DUMP_DT = '31-MAR-15')
10 - filter(C.ACCT_TYPE = 'LON')
11 - access(B.PARTY_ID = C.PARTY_ID)
12 - filter(A.ID = B.ID)
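Answer 0 (score: 1)
It looks like the optimizer significantly underestimates the number of rows returned after these 4 predicates are applied to TABLE1 (the plan estimates a single row at operation 7):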
A.PROD_TYPE = 'INS'
SUBSTR(A.ID, 1, 2) = 'US'
A.DUMP_DT = '31-MAR-15'
SUBSTR(ID, 4, 8) = '20150303'
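(Slightly off topic: it would be safer to use the ANSI literal date '2015-03-31' instead of the string '31-MAR-15', which relies on implicit conversion. Also, the statement had some mistakes, such as a missing join predicate between the first two tables and a missing A. in front of the last predicate.)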
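One way to check the underestimate is to count the rows that actually satisfy the four predicates and compare the result with the estimate of 1 row at operation 7. This is only a sketch: it assumes DUMP_DT stores dates without a time component and that '31-MAR-15' means 31 March 2015.

```sql
-- Actual number of TABLE1 rows matching the four predicates.
-- Assumption: DUMP_DT has no time portion, so an equality test is enough.
select count(*)
from   table1 a
where  a.prod_type = 'INS'
and    substr(a.id, 1, 2) = 'US'
and    a.dump_dt = date '2015-03-31'
and    substr(a.id, 4, 8) = '20150303';
```

If the count is far above 1, the Cartesian join chosen in the plan is based on a bad cardinality estimate.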
First, make sure all tables have accurate statistics and see whether that changes the explain plan:
begin
dbms_stats.gather_table_stats(user, 'TABLE1');
dbms_stats.gather_table_stats(user, 'TABLE2');
dbms_stats.gather_table_stats(user, 'TABLE3');
end;
/
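The "smart column" ID makes it hard to estimate how many rows are returned after the conditions are applied. If it is too late to change the data model, you can at least give Oracle some extended statistics to help it with the predicates: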
select dbms_stats.create_extended_stats(user, 'TABLE1', '(SUBSTR(ID, 1, 2))') from dual;
select dbms_stats.create_extended_stats(user, 'TABLE1', '(SUBSTR(ID, 4, 8))') from dual;
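I'm guessing that SUBSTR(A.ID, 1, 2) = 'US' is a popular value, but without extended statistics Oracle has no way of knowing that. The extra histograms could significantly increase the cardinality estimate, and then the optimizer would not choose a Cartesian join between two unrelated tables.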
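Answer 1 (score: 1)
I simplified the conditions on the A.ID column in the WHERE clause:
A.ID LIKE 'US_20150303%'
has the same effect as
substr(a.id, 1, 2) = 'US' AND substr(id, 4, 8) = '20150303'
In addition, if column A.ID is indexed, wrapping it in SUBSTR(a.ID, ..) prevents an ordinary index on that column from being used.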
On the other hand, a.dump_dt appears to be a DATE column, so the preferred way to filter on it is probably
a.dump_dt = TO_DATE('31-MAR-15', 'DD-MON-RR')
rather than
a.dump_dt = '31-MAR-15'
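The latter depends on the NLS_DATE_FORMAT of the Oracle client running the query, and in some cases it can hurt performance by preventing the index on a.dump_dt from being used.
So the rewritten query looks like this: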
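To see which format an implicit string-to-date conversion would use, the session setting can be inspected. A short sketch follows. The ALTER SESSION line is only an illustration of how the setting changes per session, not a recommended fix.

```sql
-- Date format used for implicit conversions in the current session.
select value
from   nls_session_parameters
where  parameter = 'NLS_DATE_FORMAT';

-- Illustration only: another client may set a different format,
-- in which case the literal '31-MAR-15' may fail to convert or be misread.
alter session set nls_date_format = 'YYYY-MM-DD';
```

An explicit TO_DATE with a format mask, or an ANSI date literal, does not depend on NLS_DATE_FORMAT.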
SELECT
SUBSTR(A.ID, 1, 2) AS country,
COUNT(DISTINCT A.ID) AS id_count,
SUM(A.amount) AS amount
FROM table1 A
JOIN table2 b ON A.ID = b.ID
JOIN table3 c ON b.party_id = c.party_id
WHERE A.prod_type = 'INS'
AND c.acct_type = 'LON'
AND A.ID LIKE 'US_20150303%'
AND A.dump_dt = TO_DATE('31-MAR-15', 'DD-MON-RR')
GROUP BY SUBSTR(A.ID, 1, 2);
Answer 2 (score: -1)
Try using Oracle hints to stabilize the execution plan, or you can use this trick:
....
And A.DUMP_DT+0 = to_date('31-MAR-15','dd-mon-rr')
...
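The `+0` trick works because arithmetic on the column stops Oracle from using the ordinary index on DUMP_DT for that predicate. The answer does not say which hints it has in mind. As one possible sketch, a LEADING hint can ask for the join order table1, table2, table3, which follows the join conditions and so avoids joining TABLE1 directly to TABLE3. The choice of this hint is an assumption, and whether it actually helps has to be checked against the resulting plan.

```sql
-- Assumption: joining in the order a -> b -> c is a reasonable plan here.
SELECT /*+ LEADING(a b c) */
       substr(a.id, 1, 2)   AS country,
       count(DISTINCT a.id) AS id_count,
       sum(a.amount)        AS amount
FROM   table1 a
JOIN   table2 b ON a.id = b.id
JOIN   table3 c ON b.party_id = c.party_id
WHERE  a.prod_type = 'INS'
AND    c.acct_type = 'LON'
AND    substr(a.id, 1, 2) = 'US'
AND    a.dump_dt = to_date('31-MAR-15', 'dd-mon-rr')
AND    substr(a.id, 4, 8) = '20150303'
GROUP BY substr(a.id, 1, 2);
```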