Query with 'count(*)' runs in a second. Query without 'count(*)' takes more than an hour. How can it be optimized?

Time: 2015-07-16 14:57:34

Tags: oracle oracle12c

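My application executes two queries whose only difference is the 'count(*)' column (one query has it, the other does not).

All queries are dynamically generated, and we ship the software to our customers, who run it against their own databases (we have no access to those databases). One of the queries runs very slowly (it did not finish even after I waited several hours). SQL Tuning Advisor recommended accepting a SQL profile, which helped, but that means I would have to tell our customers to run the advisor and accept the plan as well. It would be much better if we could create an index to speed up the query.

Here is the query: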

select 
a.company_id
, count(*)
from 
 b
INNER JOIN  a
ON
b.company_id = a.company_id AND
b.sequence_num = a.sequence_num
INNER JOIN  c
ON
b.company_id = c.company_id AND
b.sequence_num = c.sequence_num
INNER JOIN  d 
ON
c.cash_receipt_num = d.cash_receipt_num
INNER JOIN  e
ON
e.code_list_id = 'CONSTANT'
where 
(a.company_id='123')
GROUP BY 
a.company_id
order by
a.company_id ASC

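When the query includes 'count(*)', it runs in about a second. The query without 'count(*)' runs for hours before I kill it, so I have never seen it finish.

Here are the record counts for each table: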
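For reference, going by the description above, the slow variant would be the same statement with only the count(*) column removed. This is a reconstruction from that description, not text captured from the application:

```sql
-- Slow variant (reconstructed): identical joins, filter, GROUP BY and ORDER BY;
-- the only change is that count(*) is no longer selected.
select
a.company_id
from
 b
INNER JOIN  a
ON
b.company_id = a.company_id AND
b.sequence_num = a.sequence_num
INNER JOIN  c
ON
b.company_id = c.company_id AND
b.sequence_num = c.sequence_num
INNER JOIN  d
ON
c.cash_receipt_num = d.cash_receipt_num
INNER JOIN  e
ON
e.code_list_id = 'CONSTANT'
where
(a.company_id='123')
GROUP BY
a.company_id
order by
a.company_id ASC
```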

select count(*) from a -- 1,007,948
select count(*) from b -- 148,378
select count(*) from c -- 138,901
select count(*) from d -- 136,424
select count(*) from e -- 1

When the query has the count(*) column, the result returned should be '123' with a count of 908,683.

Here is what the execution plans look like:

--With count (fast):
--------------------------------------------------------------------------------------------
| Id  | Operation              | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT  |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   2 |   NESTED LOOPS         |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   3 |    NESTED LOOPS        |                   |     1 |    39 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS       |                   |     1 |    33 |     4   (0)| 00:00:01 |
|   5 |      NESTED LOOPS      |                   |     1 |    23 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN | e_KEY00           |     1 |     7 |     1   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL| c                 |     2 |    32 |     2   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN  | b_KEY00           |     1 |    10 |     1   (0)| 00:00:01 |
|*  9 |     INDEX UNIQUE SCAN  | d_KEY00           |     1 |     6 |     0   (0)| 00:00:01 |
|* 10 |    INDEX RANGE SCAN    | a_KEY00           |     1 |    10 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$3FA9081A
   6 - SEL$3FA9081A / e@SEL$4
   7 - SEL$3FA9081A / c@SEL$2
   8 - SEL$3FA9081A / b@SEL$1
   9 - SEL$3FA9081A / d@SEL$3
  10 - SEL$3FA9081A / a@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("e"."CODE_LIST_ID"='CONSTANT')
   7 - filter("c"."COMPANY_ID"='123')
   8 - access("b"."COMPANY_ID"='123' AND 
              "b"."SEQUENCE_NUM"="c"."SEQUENCE_NUM")
   9 - access("c"."CASH_RECEIPT_NUM"="d"."CASH_RECEIPT_NUM")
  10 - access("a"."COMPANY_ID"='123' AND 
              "b"."SEQUENCE_NUM"="a"."SEQUENCE_NUM")

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=1) '123'[3], COUNT(*)[22]
   2 - (#keys=0) 
   3 - (#keys=0) "b"."SEQUENCE_NUM"[NUMBER,22]
   4 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "b"."SEQUENCE_NUM"[NUMBER,22]
   5 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   7 - "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   8 - "b"."SEQUENCE_NUM"[NUMBER,22]

-- without count (slow)
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT    |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   2 |   NESTED LOOPS SEMI      |                   |     1 |    49 |     6   (0)| 00:00:01 |
|   3 |    NESTED LOOPS SEMI     |                   |     1 |    43 |     6   (0)| 00:00:01 |
|   4 |     NESTED LOOPS         |                   |     1 |    33 |     5   (0)| 00:00:01 |
|   5 |      NESTED LOOPS        |                   |     1 |    23 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN   | e_KEY00           |     1 |     7 |     1   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL  | c                 |     2 |    32 |     2   (0)| 00:00:01 |
|*  8 |      INDEX FAST FULL SCAN| a_KEY00           |     2 |    20 |     1   (0)| 00:00:01 |
|*  9 |     INDEX RANGE SCAN     | b_KEY00           |   139K|  1366K|     1   (0)| 00:00:01 |
|* 10 |    INDEX UNIQUE SCAN     | d_KEY00           |   136K|   799K|     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$3FA9081A
   6 - SEL$3FA9081A / e@SEL$4
   7 - SEL$3FA9081A / c@SEL$2
   8 - SEL$3FA9081A / a@SEL$1
   9 - SEL$3FA9081A / b@SEL$1
  10 - SEL$3FA9081A / d@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("e"."CODE_LIST_ID"='CONSTANT')
   7 - filter("d"."COMPANY_ID"='123')
   8 - filter("a"."COMPANY_ID"='123')
   9 - access("b"."COMPANY_ID"='123' AND 
              "b"."SEQUENCE_NUM"="a"."SEQUENCE_NUM")
       filter("b"."SEQUENCE_NUM"="c"."SEQUENCE_NUM")
  10 - access("c"."CASH_RECEIPT_NUM"="d"."CASH_RECEIPT_NUM")

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=1) '123'[3]
   2 - (#keys=0) 
   3 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22]
   4 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22], 
       "a"."SEQUENCE_NUM"[NUMBER,22]
   5 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   7 - "c"."CASH_RECEIPT_NUM"[NUMBER,22], 
       "c"."SEQUENCE_NUM"[NUMBER,22]
   8 - "a"."SEQUENCE_NUM"[NUMBER,22]
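Note that the Rows column in both plans is the optimizer's estimate, not a measured value. A minimal sketch for comparing estimated and actual row counts on the fast variant follows. It assumes a SQL*Plus session whose user is allowed to read the V$ views that DBMS_XPLAN.DISPLAY_CURSOR depends on; the statement is the query from above with a gather_plan_statistics hint added.

```sql
-- SQL*Plus: keep serveroutput off, otherwise DISPLAY_CURSOR may report
-- on the internal DBMS_OUTPUT call instead of the query just executed.
set serveroutput off

select /*+ gather_plan_statistics */
       a.company_id, count(*)
from   b
inner join a on b.company_id = a.company_id and b.sequence_num = a.sequence_num
inner join c on b.company_id = c.company_id and b.sequence_num = c.sequence_num
inner join d on c.cash_receipt_num = d.cash_receipt_num
inner join e on e.code_list_id = 'CONSTANT'
where  a.company_id = '123'
group  by a.company_id
order  by a.company_id asc;

-- Show the plan of the last statement executed in this session.
-- E-Rows is the estimated and A-Rows the actual row count per operation.
select * from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```

Large gaps between E-Rows and A-Rows on a given operation would point at the statistics the optimizer is misjudging.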

I suspect the problem is related to statistics. I tried running the following:

begin 
DBMS_STATS.GATHER_SCHEMA_STATS (
ownname => 'owner_of_tables_here',
estimate_percent => 100
);
end;

EXEC dbms_stats.gather_database_stats;
EXEC dbms_stats.gather_database_stats(estimate_percent => 100, block_sample => FALSE, method_opt => 'FOR ALL COLUMNS', granularity => 'ALL', cascade => TRUE, options => 'GATHER');

-- for each index mentioned in explain plan:
EXEC DBMS_STATS.GATHER_INDEX_STATS(ownname => 'owner_of_tables_here', indname => 'index name here', estimate_percent => 100)

-- for each of the five tables:
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'owner_of_tables_here', tabname => 'table name here', estimate_percent => 100, block_sample => FALSE, method_opt => 'FOR ALL COLUMNS', granularity => 'ALL', cascade => TRUE)

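Am I missing something? Will the customers have to run SQL Tuning Advisor and accept the recommended SQL profile?

Oracle version: 12.1.0.2.0

Why the query looks the way it does: the application that issues it lets users pick columns from the UI. For example, if a customer only wants to see all the companies they have access to, the query above is run without the count(*) column; if they want to see all companies together with the number of records for each, the count(*) query is executed. The "where company_id = '123'" filter is there because this particular user only has permission to see one company, but other users may have permission to see all or several companies, in which case the dynamically generated filter would be different. (I understand the query looks odd, but normally the query has many more columns and no 'group by' clause, and then it actually runs fast.)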

1 answer:

Answer 0 (score: 1)

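This is partly a guess, since I don't have the exact data: the column COMPANY_ID in table A is probably skewed. The table contains about 1M rows, and more than 900K of them have company_id = '123'.

First, check the execution plan of a simplified query:

select * from a  where company_id = '123'

If it shows an unrealistically low row estimate, such as 1 or 2, check whether the column COMPANY_ID has a histogram:

select HISTOGRAM from user_tab_columns where table_name = 'A' and COLUMN_NAME = 'COMPANY_ID';

I expect there is none.

Gather a histogram for this column, e.g. with:

exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'a',granularity=>'all',method_opt=>'FOR COLUMNS COMPANY_ID',estimate_percent => 100,cascade=>TRUE);

Then check the execution plan of the simplified query again. It should now show approximately 900K rows. I hope this correct cardinality will prevent table A from being used in the wrong place (as in the slow plan).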
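The suspected skew can also be confirmed directly by comparing the real value distribution with the column statistics. A minimal sketch, assuming the tables are in the current schema and statistics have been gathered. The aliases row_count and est_rows_without_histogram are made up for illustration, and num_rows / num_distinct is only the textbook uniform-distribution estimate, which may differ from the exact figure the optimizer computes:

```sql
-- Actual distribution: how many rows each company_id really has.
select company_id, count(*) as row_count
from   a
group  by company_id
order  by row_count desc;

-- Column statistics the optimizer works from. HISTOGRAM = 'NONE' means
-- the optimizer has to assume the values are evenly distributed.
select c.num_distinct,
       c.histogram,
       c.num_buckets,
       t.num_rows,
       round(t.num_rows / nullif(c.num_distinct, 0)) as est_rows_without_histogram
from   user_tab_col_statistics c
join   user_tables t on t.table_name = c.table_name
where  c.table_name  = 'A'
and    c.column_name = 'COMPANY_ID';

-- Estimated cardinality for the simplified query (see the Rows column).
explain plan for select * from a where company_id = '123';
select * from table(dbms_xplan.display);
```

If the first query reports roughly 900K rows for '123' while the plan estimates only a handful, the missing histogram is a plausible cause of the bad join order.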