所以我一直在抨击我的大脑,并且承认对甲骨文不太好。我们有一个表格,可容纳大约6000万条记录,其中存储的值用于建筑物。在我认为合适的地方添加了适当的索引,但性能仍然不佳。以下是查询,应该有所帮助:
SELECT count(*)
FROM viewBuildings
INNER JOIN tblValues
ON viewBuildings.bldg_id = tblValues.bldg_id
WHERE bldg_deleted = 0
AND (bldg_summary = 1
OR (bldg_root = 0 AND bldg_def = 0)
OR bldg_parent = 1)
AND field_id IN (207)
AND UPPER(dbms_lob.substr(v_value, 2000, 1)) = UPPER('2320')
所以上面只是可以构造的查询的一个例子。它在v_value CLOB字段中查找tblValues以匹配'2320'。它可以搜索数值和文本值。 tblValues有6000万条记录。它由建筑物ID和字段ID索引。
我可能需要提供更多信息,但就统计数据而言,跳出来的数字是“一致的获取”。 Consistent gets = 74069.这是一个很大的数字吗?
任何建议都会很棒,主要是在处理大型数据库表上的CLOB字段时。无法使用上下文类型索引,因为我需要完全匹配,并且查找的数据可以是数字或字符串。
编辑(更多信息): tblBuildings是viewBuildings(视图)的一部分,有 80,000条记录 tblValues具有每个建筑物的值, 68,000,000条记录 tblValues每个建筑物大约有550个字段(field_id)
期望的结果:查询以返回结果< 5秒。这不合理吗?有时它会无限期地运行,其他时间可能会持续80秒。
解释计划结果
Plan hash value: 1480138519
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------------------|
| 0 | SELECT STATEMENT | | 1 | 192 | 32 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 192 | | |
| 2 | NESTED LOOPS | | 1 | 192 | 15 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 183 | 12 (0)| 00:00:01 |
|* 4 | FILTER | | | | | |
| 5 | NESTED LOOPS OUTER | | 1 | 64 | 10 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID | TBLBUILDINGS | 1 | 60 | 9 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | SAA_4 | 17 | | 3 (0)| 00:00:01 |
| 8 | NESTED LOOPS | | 1 | 21 | 3 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID| TBLBUILDINGSTATUSES | 1 | 15 | 2 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | IDX_BUILDINGSTATUS_EXCLUDEQUERY | 1 | | 1 (0)| 00:00:01 |
|* 11 | INDEX RANGE SCAN | IDX_BUILDING_STATUS_ASID_DELETED | 1 | 6 | 1 (0)| 00:00:01 |
| 12 | TABLE ACCESS BY INDEX ROWID | TBLBUILDINGSTATUSES | 1 | 4 | 1 (0)| 00:00:01 |
|* 13 | INDEX UNIQUE SCAN | PK_TBLBUILDINGSTATUS | 1 | | 0 (0)| 00:00:01 |
|* 14 | TABLE ACCESS BY INDEX ROWID | TBLVALUES | 1 | 119 | 2 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_SAA_6 | 1 | | 1 (0)| 00:00:01 |
| 16 | INLIST ITERATOR | | | | | |
|* 17 | INDEX RANGE SCAN | SAA_7 | 1 | 9 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
4 - filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS" IS NULL OR
"TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS"=0)
6 - filter("TBLBUILDINGS"."BLDG_SUMMARY"=1 OR "TBLBUILDINGS"."BLDG_SUB_BUILDING_PARENT"=1 OR
"TBLBUILDINGS"."BLDG_BUILDING_DEF"=0 AND "TBLBUILDINGS"."BLDG_ROOT"=0)
7 - access("TBLBUILDINGS"."BLDG_DELETED"=0)
filter( NOT EXISTS (SELECT 0 FROM "TBLBUILDINGSTATUSES" "TBLBUILDINGSTATUSES","TBLBUILDINGS" "TBLBUILDINGS" WHERE
"TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID" AND
"TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1))
10 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1)
11 - access("TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
13 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"(+)="TBLBUILDINGS"."BUILDING_STATUS_ID")
14 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLVALUES"."V_VALUE",2000,1))=U'2320')
15 - access("TBLVALUES"."FE_ID"=207 AND "TBLBUILDINGS"."BLDG_ID"="TBLVALUES"."BLDG_ID")
17 - access("TBLINSPECTORBUILDINGMAP"."IN_ID"=1 AND ("TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=0 OR
"TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=1) AND "TBLBUILDINGS"."BLDG_ID"="TBLINSPECTORBUILDINGMAP"."BLDG_ID")
44 rows selected
Plan hash value: 2137789089
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR PICKLER FETCH| DISPLAY | 8168 | 16336 | 29 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
好的,我按照你的建议收集了统计信息,然后是plan_table_output。看起来像IDX_CURVAL_FE_ID这里有问题吗?这是字段id的值表的索引。
SQL_ID d4aq8nsr1p6uw, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ count(*) FROM
viewAssetsForUser1 INNER JOIN tblCurrentValues ON
viewAssetsForUser1.as_id = tblCurrentValues.as_id WHERE as_deleted =
:"SYS_B_0" AND (as_summary = :"SYS_B_1" OR (as_root =
:"SYS_B_2" AND as_asset_def = :"SYS_B_3") OR
as_sub_asset_parent = :"SYS_B_4") AND fe_id IN (:"SYS_B_5")
AND UPPER(dbms_lob.substr(cv_value, :"SYS_B_6", :"SYS_B_7")) =
UPPER(:"SYS_B_8")
Plan hash value: 4033422776
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:08:43.19 | 56589 | 56084 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:08:43.19 | 56589 | 56084 | | | |
|* 2 | FILTER | | 1 | | 0 |00:08:43.19 | 56589 | 56084 | | | |
| 3 | NESTED LOOPS | | 1 | | 0 |00:08:43.19 | 56589 | 56084 | | | |
| 4 | NESTED LOOPS | | 1 | 115 | 0 |00:08:43.19 | 56589 | 56084 | | | |
|* 5 | FILTER | | 1 | | 0 |00:08:43.19 | 56589 | 56084 | | | |
|* 6 | HASH JOIN RIGHT OUTER | | 1 | 82 | 0 |00:08:43.19 | 56589 | 56084 | 1348K| 1348K| 742K (0)|
| 7 | TABLE ACCESS FULL | TBLASSETSTATUSES | 1 | 4 | 4 |00:00:00.01 | 3 | 0 | | | |
| 8 | NESTED LOOPS | | 1 | | 0 |00:08:43.19 | 56586 | 56084 | | | |
| 9 | NESTED LOOPS | | 1 | 163 | 0 |00:08:43.19 | 56586 | 56084 | | | |
|* 10 | TABLE ACCESS BY INDEX ROWID | TBLCURRENTVALUES | 1 | 163 | 0 |00:08:43.19 | 56586 | 56084 | | | |
|* 11 | INDEX RANGE SCAN | IDX_CURVAL_FE_ID | 1 | 16283 | 61357 |00:00:05.98 | 132 | 132 | | | |
|* 12 | INDEX RANGE SCAN | SAA_1 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 13 | TABLE ACCESS BY INDEX ROWID | TBLASSETS | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 14 | INDEX UNIQUE SCAN | PK_TBLINSPECTORBRIDGEMAP2 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
|* 15 | TABLE ACCESS BY GLOBAL INDEX ROWID| TBLINSPECTORASSETMAP | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(:SYS_B_0=0)
5 - filter(("TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS" IS NULL OR "TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS"=0))
6 - access("TBLASSETSTATUSES"."ASSET_STATUS_ID"="TBLASSETS"."ASSET_STATUS_ID")
10 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLCURRENTVALUES"."CV_VALUE",:SYS_B_6,:SYS_B_7))=SYS_OP_C2C(UPPER(:SYS_B_8)))
11 - access("TBLCURRENTVALUES"."FE_ID"=:SYS_B_5)
12 - access("TBLASSETS"."AS_DELETED"=:SYS_B_0 AND "TBLASSETS"."AS_ID"="TBLCURRENTVALUES"."AS_ID")
13 - filter((("TBLASSETS"."AS_ROOT"=:SYS_B_2 AND "TBLASSETS"."AS_ASSET_DEF"=:SYS_B_3) OR "TBLASSETS"."AS_SUMMARY"=:SYS_B_1 OR
"TBLASSETS"."AS_SUB_ASSET_PARENT"=:SYS_B_4))
14 - access("TBLASSETS"."AS_ID"="TBLINSPECTORASSETMAP"."AS_ID" AND "TBLINSPECTORASSETMAP"."IN_ID"=1)
15 - filter(("TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=0 OR "TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=1))
答案 0 :(得分:2)
您可以进行任意数量的优化,但最终会产生导致问题的大量数据。当您执行查询并在OEM
上的性能图表上跟踪它时,您将花费大量时间在IO上。这就是将数据输入和输出内存。
解决方案是什么:它将对表进行分区。每当数据量很大时,您应该对表进行分区,以便只处理相关数据。 为了对表进行分区,您需要一些点来隔离数据并查看您的数据,它可以构建id。
您可以在此网址阅读更多相关信息:http://docs.oracle.com/cd/E11882_01/server.112/e25523/partition.htm#g471747
分区提供了许多其他功能,例如本地索引,有助于进一步优化查询。
如果您始终处理整个大型表数据,那么分区将不是解决方案,但这会在数据库架构上添加问号。
SO是查询优化会有所帮助,但是由于数据很大,您还应该评估表分区。
答案 1 :(得分:1)
糟糕的指数成本如果统计数据是新鲜的,并且优化程序的基数估算值相对较高,为什么会选择错误的计划呢?也许有一个参数使索引看起来人为地便宜。请看一下:select * from v$parameter where name in ('optimizer_index_cost_adj', 'optimizer_index_caching');
它们是否与默认值100和0显着不同?
此外,请查看select * from sys.aux_stats$;
也许您的系统统计信息使全表扫描看起来过于昂贵。某些版本的Oracle存在工作负载统计信息的错误,其中数字错误几个数量级。
或许您的表格非常庞大,16K索引读取是最佳访问路径。查看DBA_SEGMENTS.BYTES
以查找表和LOB段的大小。
即使表格中等,并且计划更改为全表扫描,也可能无法在5秒内获得运行时间。但结合你的分区想法,这可能已经足够了。
LOB STORAGE 根据您的示例,我假设大多数CLOB都相对较小?也许你有一个不寻常的LOB设置浪费了很多空间,比如DISABLE STORAGE IN ROW
。您可能需要查看表格DDL,或在此处发布所有内容。或者如果你可以用VARCHAR2替换CLOB,那就更好了。
FBI CLOB上基于函数的索引可能会显着加快速度。但它可能是一个非常大的索引:create index TBLCURRENTVALUES_FBI on TBLCURRENTVALUES(UPPER(dbms_lob.substr(v_value, 2000, 1)));
CURSOR_SHARING 查询正在改变一点,这使调整变得困难。看起来这个最新版本有CURSOR_SHARING=FORCE
,这很不寻常。对于昂贵的查询,使用文字可能是一件好事 - 构建查询计划所花费的额外时间可能是值得的。如果系统参数无法更改,请查看提示/*+ cursor_sharing_exact */
。