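Oracle - Optimize query, large database table, CLOB field

Time: 2013-08-08 01:14:32

Tags: oracle clob

I've been racking my brain over this, and I admit I'm not very good with Oracle. We have a table holding roughly 60 million records that stores values for buildings. I've added indexes where I thought they were appropriate, but performance is still poor. Here is the query, which should help: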

  SELECT count(*)
    FROM viewBuildings
   INNER JOIN tblValues
           ON viewBuildings.bldg_id = tblValues.bldg_id
   WHERE bldg_deleted = 0
     AND (bldg_summary = 1
         OR (bldg_root = 0 AND bldg_def = 0)
         OR bldg_parent = 1)
     AND field_id IN (207)
     AND UPPER(dbms_lob.substr(v_value, 2000, 1)) = UPPER('2320')

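The above is just one example of the queries that can be constructed. It searches the v_value CLOB column in tblValues for a match on '2320'. It can search for both numeric and text values. tblValues has 60 million records and is indexed on building ID and field ID.

I may need to provide more information, but as far as statistics go, the number that jumps out is "consistent gets". Consistent gets = 74069. Is that a large number?

Any suggestions would be great, mainly on dealing with CLOB fields in large database tables. I can't use a context-type index because I need an exact match, and the data being searched can be numeric or string.

Edit (more info): tblBuildings is part of viewBuildings (a view) and has 80,000 records. tblValues holds the values for each building and has 68,000,000 records. tblValues has about 550 fields (field_id) per building.

Desired result: the query returns in < 5 seconds. Is that unreasonable? Sometimes it runs indefinitely; other times it may take 80 seconds.

Explain plan results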

Plan hash value: 1480138519
-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                                  |     1 |   192 |    32   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                     |                                  |     1 |   192 |            |          |
|   2 |   NESTED LOOPS                      |                                  |     1 |   192 |    15   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                     |                                  |     1 |   183 |    12   (0)| 00:00:01 |
|*  4 |     FILTER                          |                                  |       |       |            |          |
|   5 |      NESTED LOOPS OUTER             |                                  |     1 |    64 |    10   (0)| 00:00:01 |
|*  6 |       TABLE ACCESS BY INDEX ROWID   | TBLBUILDINGS                     |     1 |    60 |     9   (0)| 00:00:01 |
|*  7 |        INDEX RANGE SCAN             | SAA_4                            |    17 |       |     3   (0)| 00:00:01 |
|   8 |         NESTED LOOPS                |                                  |     1 |    21 |     3   (0)| 00:00:01 |
|   9 |          TABLE ACCESS BY INDEX ROWID| TBLBUILDINGSTATUSES              |     1 |    15 |     2   (0)| 00:00:01 |
|* 10 |           INDEX RANGE SCAN          | IDX_BUILDINGSTATUS_EXCLUDEQUERY  |     1 |       |     1   (0)| 00:00:01 |
|* 11 |          INDEX RANGE SCAN           | IDX_BUILDING_STATUS_ASID_DELETED |     1 |     6 |     1   (0)| 00:00:01 |
|  12 |       TABLE ACCESS BY INDEX ROWID   | TBLBUILDINGSTATUSES              |     1 |     4 |     1   (0)| 00:00:01 |
|* 13 |        INDEX UNIQUE SCAN            | PK_TBLBUILDINGSTATUS             |     1 |       |     0   (0)| 00:00:01 |
|* 14 |     TABLE ACCESS BY INDEX ROWID     | TBLVALUES                        |     1 |   119 |     2   (0)| 00:00:01 |
|* 15 |      INDEX UNIQUE SCAN              | PK_SAA_6                         |     1 |       |     1   (0)| 00:00:01 |
|  16 |    INLIST ITERATOR                  |                                  |       |       |            |          |
|* 17 |     INDEX RANGE SCAN                | SAA_7                            |     1 |     9 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

   4 - filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS" IS NULL OR
              "TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS"=0)
   6 - filter("TBLBUILDINGS"."BLDG_SUMMARY"=1 OR "TBLBUILDINGS"."BLDG_SUB_BUILDING_PARENT"=1 OR
              "TBLBUILDINGS"."BLDG_BUILDING_DEF"=0 AND "TBLBUILDINGS"."BLDG_ROOT"=0)
   7 - access("TBLBUILDINGS"."BLDG_DELETED"=0)
       filter( NOT EXISTS (SELECT 0 FROM "TBLBUILDINGSTATUSES" "TBLBUILDINGSTATUSES","TBLBUILDINGS" "TBLBUILDINGS" WHERE
              "TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID" AND
              "TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1))
  10 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1)
  11 - access("TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
       filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
  13 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"(+)="TBLBUILDINGS"."BUILDING_STATUS_ID")
  14 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLVALUES"."V_VALUE",2000,1))=U'2320')
  15 - access("TBLVALUES"."FE_ID"=207 AND "TBLBUILDINGS"."BLDG_ID"="TBLVALUES"."BLDG_ID")
  17 - access("TBLINSPECTORBUILDINGMAP"."IN_ID"=1 AND ("TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=0 OR
              "TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=1) AND "TBLBUILDINGS"."BLDG_ID"="TBLINSPECTORBUILDINGMAP"."BLDG_ID")

 44 rows selected

Plan hash value: 2137789089

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| DISPLAY |  8168 | 16336 |    29   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

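OK, I gathered statistics as you suggested, and then the plan_table_output. It looks like IDX_CURVAL_FE_ID is the problem here? That is the index on the values table for the field id.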

SQL_ID  d4aq8nsr1p6uw, child number 0
-------------------------------------
SELECT  /*+ gather_plan_statistics */ count(*)     FROM 
viewAssetsForUser1    INNER JOIN tblCurrentValues            ON 
viewAssetsForUser1.as_id = tblCurrentValues.as_id    WHERE as_deleted = 
:"SYS_B_0"      AND (as_summary = :"SYS_B_1"          OR (as_root = 
:"SYS_B_2" AND as_asset_def = :"SYS_B_3")          OR 
as_sub_asset_parent = :"SYS_B_4")      AND fe_id IN (:"SYS_B_5")      
AND UPPER(dbms_lob.substr(cv_value, :"SYS_B_6", :"SYS_B_7")) = 
UPPER(:"SYS_B_8")

Plan hash value: 4033422776

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                           |      1 |        |      1 |00:08:43.19 |   56589 |  56084 |       |       |          |
|   1 |  SORT AGGREGATE                       |                           |      1 |      1 |      1 |00:08:43.19 |   56589 |  56084 |       |       |          |
|*  2 |   FILTER                              |                           |      1 |        |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|   3 |    NESTED LOOPS                       |                           |      1 |        |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|   4 |     NESTED LOOPS                      |                           |      1 |    115 |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|*  5 |      FILTER                           |                           |      1 |        |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|*  6 |       HASH JOIN RIGHT OUTER           |                           |      1 |     82 |      0 |00:08:43.19 |   56589 |  56084 |  1348K|  1348K|  742K (0)|
|   7 |        TABLE ACCESS FULL              | TBLASSETSTATUSES          |      1 |      4 |      4 |00:00:00.01 |       3 |      0 |       |       |          |
|   8 |        NESTED LOOPS                   |                           |      1 |        |      0 |00:08:43.19 |   56586 |  56084 |       |       |          |
|   9 |         NESTED LOOPS                  |                           |      1 |    163 |      0 |00:08:43.19 |   56586 |  56084 |       |       |          |
|* 10 |          TABLE ACCESS BY INDEX ROWID  | TBLCURRENTVALUES          |      1 |    163 |      0 |00:08:43.19 |   56586 |  56084 |       |       |          |
|* 11 |           INDEX RANGE SCAN            | IDX_CURVAL_FE_ID          |      1 |  16283 |  61357 |00:00:05.98 |     132 |    132 |       |       |          |
|* 12 |          INDEX RANGE SCAN             | SAA_1                     |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|* 13 |         TABLE ACCESS BY INDEX ROWID   | TBLASSETS                 |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|* 14 |      INDEX UNIQUE SCAN                | PK_TBLINSPECTORBRIDGEMAP2 |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|* 15 |     TABLE ACCESS BY GLOBAL INDEX ROWID| TBLINSPECTORASSETMAP      |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(:SYS_B_0=0)
   5 - filter(("TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS" IS NULL OR "TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS"=0))
   6 - access("TBLASSETSTATUSES"."ASSET_STATUS_ID"="TBLASSETS"."ASSET_STATUS_ID")
  10 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLCURRENTVALUES"."CV_VALUE",:SYS_B_6,:SYS_B_7))=SYS_OP_C2C(UPPER(:SYS_B_8)))
  11 - access("TBLCURRENTVALUES"."FE_ID"=:SYS_B_5)
  12 - access("TBLASSETS"."AS_DELETED"=:SYS_B_0 AND "TBLASSETS"."AS_ID"="TBLCURRENTVALUES"."AS_ID")
  13 - filter((("TBLASSETS"."AS_ROOT"=:SYS_B_2 AND "TBLASSETS"."AS_ASSET_DEF"=:SYS_B_3) OR "TBLASSETS"."AS_SUMMARY"=:SYS_B_1 OR 
              "TBLASSETS"."AS_SUB_ASSET_PARENT"=:SYS_B_4))
  14 - access("TBLASSETS"."AS_ID"="TBLINSPECTORASSETMAP"."AS_ID" AND "TBLINSPECTORASSETMAP"."IN_ID"=1)
  15 - filter(("TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=0 OR "TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=1))
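
For reference, run-time columns such as Starts, E-Rows, A-Rows and Buffers in the plan above are typically produced by running the statement with the gather_plan_statistics hint and then calling DBMS_XPLAN.DISPLAY_CURSOR from the same session. A minimal sketch, assuming a shortened form of the query (the literals 207 and '2320' are taken from the first example, not from this run):

```sql
-- Run the statement once with row-source statistics enabled.
-- Shortened, illustrative version of the query; the view join is omitted.
SELECT /*+ gather_plan_statistics */ COUNT(*)
  FROM tblCurrentValues
 WHERE fe_id IN (207)
   AND UPPER(dbms_lob.substr(cv_value, 2000, 1)) = UPPER('2320');

-- Then, in the same session, display the plan of the last executed cursor
-- with estimated (E-Rows) and actual (A-Rows) row counts side by side.
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Read this way, step 11 (the index range scan) returns 61,357 rowids in about 6 seconds, while step 10 (the table access that evaluates the CLOB filter) accounts for almost all of the 8 minutes 43 seconds and the 56,084 physical reads.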

2 answers:

Answer 0 (score: 2)

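You can apply any number of optimizations, but in the end it is the sheer volume of data that causes the problem. If you run the query and track it on the performance chart in OEM, you will see that most of the time is spent on I/O, that is, moving data in and out of memory.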

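So what is the solution? Partition the table. Whenever the data volume is large, you should partition the table so that only the relevant data is processed. To partition a table you need some key on which to segregate the data; looking at your data, that could be the building ID.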

You can read more about it at this URL: http://docs.oracle.com/cd/E11882_01/server.112/e25523/partition.htm#g471747

Partitioning also enables many other features, such as local indexes, which help optimize queries further.
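
As a rough illustration of the idea, here is a minimal sketch of a hash-partitioned copy of the values table with a local index. The table name tblValues_part, the index name, the column types and the partition count are all assumptions made for the example, not details from the question.

```sql
-- Hypothetical partitioned copy of tblValues; column types are assumed.
CREATE TABLE tblValues_part (
  bldg_id   NUMBER NOT NULL,
  field_id  NUMBER NOT NULL,
  v_value   CLOB
)
PARTITION BY HASH (bldg_id) PARTITIONS 16;

-- A LOCAL index is split along the same partition boundaries as the table,
-- so each partition gets its own smaller index segment.
CREATE INDEX idx_values_part_bldg_field
    ON tblValues_part (bldg_id, field_id) LOCAL;
```

Partition pruning only applies when a query restricts the partitioning key, so the choice of key should follow the predicates the application actually uses. The sample query filters on the field ID across all buildings, so it is worth checking whether the building ID or the field ID is the better key for this workload.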

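If you always process the data of the entire table, partitioning will not be the solution, but that would put a question mark over the database architecture.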

So yes, query optimization will help, but because the data is large you should also evaluate table partitioning.

Answer 1 (score: 1)

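BAD INDEX COST: If the statistics are fresh and the optimizer's cardinality estimates are reasonably good, why would it choose a bad plan? Perhaps a parameter is making indexes look artificially cheap. Take a look at: select * from v$parameter where name in ('optimizer_index_cost_adj', 'optimizer_index_caching'); Are they significantly different from the defaults of 100 and 0?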
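
The same check with only the relevant columns, as a small sketch (it assumes the session has access to v$parameter):

```sql
-- ISDEFAULT shows at a glance whether either parameter was changed.
-- Defaults: optimizer_index_cost_adj = 100, optimizer_index_caching = 0.
SELECT name, value, isdefault
  FROM v$parameter
 WHERE name IN ('optimizer_index_cost_adj', 'optimizer_index_caching');
```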

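Also, look at select * from sys.aux_stats$; perhaps your system statistics make full table scans look too expensive. Some versions of Oracle have a bug with workload statistics where the numbers are off by several orders of magnitude.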
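
A slightly narrower form of that query, as a sketch; it assumes the account can read SYS-owned tables:

```sql
-- System statistics used by the cost model. Workload values such as
-- SREADTIM, MREADTIM and MBRC (when populated) influence how expensive
-- multiblock reads, and therefore full table scans, appear.
SELECT sname, pname, pval1, pval2
  FROM sys.aux_stats$
 ORDER BY sname, pname;
```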

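Or perhaps your table is so large that 16K index reads really is the best access path. Check DBA_SEGMENTS.BYTES to find the size of the table and of the LOB segment.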
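
One way to look up both sizes, as a sketch. It assumes the table is TBLCURRENTVALUES (the name shown in the run-time plan) and that the name is unique across schemas; otherwise add a filter on owner.

```sql
-- Size of the table segment plus any LOB segments that belong to it.
SELECT s.segment_name,
       s.segment_type,
       ROUND(SUM(s.bytes) / 1024 / 1024) AS size_mb
  FROM dba_segments s
 WHERE s.segment_name = 'TBLCURRENTVALUES'
    OR s.segment_name IN (SELECT l.segment_name
                            FROM dba_lobs l
                           WHERE l.table_name = 'TBLCURRENTVALUES')
 GROUP BY s.segment_name, s.segment_type;
```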

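Even if the table is only medium-sized and the plan changes to a full table scan, you may not get the run time under 5 seconds. But combined with your partitioning idea, that might be enough.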

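LOB STORAGE: Based on your example, I assume most of the CLOBs are relatively small? Perhaps you have an unusual LOB setting that wastes a lot of space, such as DISABLE STORAGE IN ROW. You may want to look at the table DDL, or post all of it here. Or, if you can replace the CLOB with a VARCHAR2, even better.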
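
A few queries that could answer those questions, as a sketch. The table name TBLCURRENTVALUES and the column cv_value are taken from the run-time plan; the 1 percent sample is an arbitrary choice to avoid reading every LOB.

```sql
-- IN_ROW = 'NO' means DISABLE STORAGE IN ROW is in effect for that column.
SELECT column_name, segment_name, in_row, chunk, cache
  FROM dba_lobs
 WHERE table_name = 'TBLCURRENTVALUES';

-- Full DDL of the table, including its LOB storage clause
-- (looks in the current schema when no schema argument is given).
SELECT DBMS_METADATA.GET_DDL('TABLE', 'TBLCURRENTVALUES') FROM dual;

-- Rough idea of how long the stored values are, from a 1% row sample.
SELECT MAX(DBMS_LOB.GETLENGTH(cv_value))        AS max_chars,
       ROUND(AVG(DBMS_LOB.GETLENGTH(cv_value))) AS avg_chars
  FROM tblCurrentValues SAMPLE (1);
```

If the maximum length turns out to be well under the VARCHAR2 limit, that would support replacing the CLOB with a VARCHAR2.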

FBI: A function-based index on the CLOB may speed things up significantly, but it could be a very large index: create index TBLCURRENTVALUES_FBI on TBLCURRENTVALUES(UPPER(dbms_lob.substr(cv_value, 2000, 1)));
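
A possible variation, offered as an assumption rather than as part of the original suggestion: because the query always filters on the field ID as well, the field ID could be the leading column of the function-based index. The index name below is made up, and the column names fe_id and cv_value are taken from the run-time plan. The indexed expression has to match the expression in the query exactly for the optimizer to consider it.

```sql
-- Hypothetical composite function-based index: field id first, then the
-- upper-cased first 2000 characters of the CLOB.
CREATE INDEX tblcurrentvalues_fe_fbi
    ON tblCurrentValues (fe_id, UPPER(dbms_lob.substr(cv_value, 2000, 1)));

-- Check whether the optimizer picks the new index for the filter.
EXPLAIN PLAN FOR
SELECT COUNT(*)
  FROM tblCurrentValues
 WHERE fe_id = 207
   AND UPPER(dbms_lob.substr(cv_value, 2000, 1)) = UPPER('2320');

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```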

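CURSOR_SHARING: The query keeps changing slightly, which makes tuning difficult. It looks like this latest version runs with CURSOR_SHARING=FORCE, which is unusual. For an expensive query, using literals may be a good thing; the extra time spent building the query plan is probably worth it. If the system parameter can't be changed, look at the hint /*+ cursor_sharing_exact */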
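
A sketch of how the hint could be applied to the second query. The literal values are copied from the first example in the question and are assumed to correspond to the :SYS_B_n binds.

```sql
-- Confirm the current setting first.
SELECT value FROM v$parameter WHERE name = 'cursor_sharing';

-- With the hint, the literals in this one statement are left as written
-- instead of being replaced by system-generated bind variables.
SELECT /*+ cursor_sharing_exact */ COUNT(*)
  FROM viewAssetsForUser1
 INNER JOIN tblCurrentValues
         ON viewAssetsForUser1.as_id = tblCurrentValues.as_id
 WHERE as_deleted = 0
   AND (as_summary = 1
        OR (as_root = 0 AND as_asset_def = 0)
        OR as_sub_asset_parent = 1)
   AND fe_id IN (207)
   AND UPPER(dbms_lob.substr(cv_value, 2000, 1)) = UPPER('2320');
```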