Question

I need to improve the performance of my view. The SQL that currently builds the view is:

select tr.account_number , tr.actual_collection_trx_date ,s.customer_key
from   fct_collections_trx tr,
       stg_scd_customers_key s
where  tr.account_number = s.account_number
and    trunc(tr.actual_collection_trx_date) between s.start_date and s.end_date;

Table fct_collections_trx has roughly 170k records (the count changes daily).

Table stg_scd_customers_key has 430 records.

Table fct_collections_trx has two indexes: a single unique composite index on (ACCOUNT_NUMBER, SUB_ACCOUNT_NUMBER, ACTUAL_COLLECTION_TRX_DATE, COLLECTION_TRX_DATE, COLLECTION_ACTION_CODE), and a normal index on ENTRY_SCHEMA_DATE. DDL:

alter table stg_admin.FCT_COLLECTIONS_TRX
add primary key (ACCOUNT_NUMBER, SUB_ACCOUNT_NUMBER, ACTUAL_COLLECTION_TRX_DATE, COLLECTION_TRX_DATE, COLLECTION_ACTION_CODE)
  using index 
  tablespace STG_COLLECTION_DATA
  pctfree 10
  initrans 2
  maxtrans 255
  storage
   (
    initial 80K
    next 1M
    minextents 1
    maxextents unlimited
  );

Table structure:

create table stg_admin.FCT_COLLECTIONS_TRX
(
  account_number              NUMBER(10) not null,
  sub_account_number          NUMBER(5) not null,
  actual_collection_trx_date  DATE not null,
  customer_key                NUMBER(10),
  sub_account_key             NUMBER(10),
  schema_key                  VARCHAR2(10) not null,
  collection_group_code       CHAR(3),
  collection_action_code      CHAR(3) not null,
  action_order                NUMBER,
  bucket                      NUMBER(5),
  collection_trx_date         DATE not null,
  days_into_cycle             NUMBER(5),
  logical_delete_date         DATE,
  balance                     NUMBER(10,2),
  abbrev                      CHAR(8),
  customer_status             CHAR(2),
  sub_account_status          CHAR(2),
  entry_schema_date           DATE,
  next_collection_action_code CHAR(3),
  next_collectin_trx_date     DATE,
  reject_key                  NUMBER(10) not null,
  dwh_update_date             DATE,
  delta_type                  VARCHAR2(1)
)

Table stg_scd_customers_key has one index, a single composite index on (ACCOUNT_NUMBER, START_DATE, END_DATE). DDL:

create unique index stg_admin.STG_SCD_CUST_KEY_PKP on stg_admin.STG_SCD_CUSTOMERS_KEY (ACCOUNT_NUMBER, START_DATE, END_DATE);

This table is also partitioned:

partition by range (END_DATE)
(
  partition SCD_CUSTOMERS_20081103 values less than (TO_DATE(' 2008-11-04 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
    tablespace FCT_CUSTOMER_SERVICES_DATA
    pctfree 10
    initrans 1
    maxtrans 255
    storage
    (
      initial 8M
      next 1M
      minextents 1
      maxextents unlimited
    )

Table structure:

create table stg_admin.STG_SCD_CUSTOMERS_KEY
(
  customer_key   NUMBER(18) not null,
  account_number NUMBER(10) not null,
  start_date     DATE not null,
  end_date       DATE not null,
  curr_ind       NUMBER(1) not null
)

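I can't add a filter on the big table (I need all date ranges), and I can't use a materialized view. This query runs for about 20-40 minutes and I have to make it faster. I have already tried dropping the trunc(), and it made no difference.

Any suggestions?

Explain plan: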

Answer 1

First, write the query using explicit join syntax:

select tr.account_number , tr.actual_collection_trx_date ,s.customer_key
from fct_collections_trx tr join
     stg_scd_customers_key s
     on tr.account_number = s.account_number and
        trunc(tr.actual_collection_trx_date) between s.start_date and s.end_date;

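You already have an appropriate index on the customers table. You can try an index on fct_collections_trx(account_number, trunc(actual_collection_trx_date), actual_collection_trx_date). Oracle might find it useful for the join.

However, if you are looking for a single match per transaction, I wonder whether another approach would work. How does the following query perform?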
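As a minimal sketch, the suggested function-based index could be created like this. The index name is hypothetical, and storage and tablespace clauses are left out; whether the optimizer actually uses it should be checked against the execution plan.

```sql
-- Hypothetical index name; columns are the ones suggested above.
-- trunc(actual_collection_trx_date) makes this a function-based index,
-- so it can serve the trunc(...) between ... predicate in the join.
create index stg_admin.fct_coll_trx_acct_day_ix
  on stg_admin.fct_collections_trx
     (account_number,
      trunc(actual_collection_trx_date),
      actual_collection_trx_date);
```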

select tr.account_number , tr.actual_collection_trx_date,
       (select min(s.customer_key) keep (dense_rank first order by s.start_date desc)
        from stg_scd_customers_key s
        where tr.account_number = s.account_number and
              tr.actual_collection_trx_date >= s.start_date 
       ) as customer_key
from fct_collections_trx tr ;

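This query is not exactly the same as the original: it does no filtering (every transaction row is returned, with a NULL customer_key when nothing matches), and it does not check the end date. But sometimes this formulation is more efficient.

Also, I think trunc() is unnecessary in this case, so an index on stg_scd_customers_key(account_number, start_date, customer_key) would be optimal.

The expression min(x) keep (dense_rank first order by ...) is essentially first(): it takes the first element of the ordered list. Note that min() is not important here; max() would work just as well, since it only matters when several rows tie for first place. So this expression returns the customer key from the row with the latest start_date among the rows that satisfy the where clause. I have observed that this function is quite fast in Oracle, and often faster than other approaches.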
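A sketch of that index, assuming a hypothetical index name. Including customer_key as the last column lets the correlated subquery be answered from the index alone.

```sql
-- Hypothetical index name; column order follows the suggestion above.
create index stg_admin.stg_scd_cust_key_cov_ix
  on stg_admin.stg_scd_customers_key
     (account_number, start_date, customer_key);
```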
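To illustrate how keep (dense_rank first ...) behaves, here is a small self-contained query over made-up rows (the values are invented for illustration and are not from the real tables). Two rows start on or before the lookup date; the one with the latest start_date wins, so the query returns 200.

```sql
-- Made-up sample rows standing in for stg_scd_customers_key.
with s as (
  select 1 as account_number, date '2020-01-01' as start_date, 100 as customer_key from dual
  union all
  select 1, date '2021-01-01', 200 from dual
  union all
  select 1, date '2022-01-01', 300 from dual
)
select min(s.customer_key)
         keep (dense_rank first order by s.start_date desc) as customer_key
from   s
where  s.account_number = 1
and    s.start_date <= date '2021-06-15';
```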

Answer 2

If the start and end dates have no time component (i.e. they all default to midnight), then you can do this:

select tr.account_number , tr.actual_collection_trx_date ,s.customer_key
from   fct_collections_trx tr,
       stg_scd_customers_key s
where  tr.account_number = s.account_number
and    tr.actual_collection_trx_date >= s.start_date
and    tr.actual_collection_trx_date < s.end_date + 1;
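
A quick way to convince yourself that the two predicates agree when start_date and end_date sit at midnight is to evaluate both on a boundary value. The dates below are made up for illustration; both columns come back as 'Y' for a transaction one second before the end of the last valid day.

```sql
-- Made-up boundary case: transaction at 23:59:59 on the end date.
with t as (
  select to_date('2008-11-03 23:59:59', 'YYYY-MM-DD HH24:MI:SS') as trx_date,
         date '2008-11-01' as start_date,
         date '2008-11-03' as end_date
  from dual
)
select case when trunc(trx_date) between start_date and end_date
            then 'Y' else 'N' end as with_trunc,
       case when trx_date >= start_date and trx_date < end_date + 1
            then 'Y' else 'N' end as without_trunc
from   t;
```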

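On top of that, you can add an index to each table containing the following columns:

for fct_collections_trx: (account_number, actual_collection_trx_date)
for stg_scd_customers_key: (account_number, start_date, end_date, customer_key)

That way, the query should be able to use the indexes alone without having to visit the tables.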
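
A sketch of the two covering indexes, with hypothetical index names and default storage settings. Each index holds every column the query reads from its table, which is what allows an index-only access path.

```sql
-- Hypothetical index names; column lists are the ones given above.
create index stg_admin.fct_coll_trx_acct_date_ix
  on stg_admin.fct_collections_trx
     (account_number, actual_collection_trx_date);

create index stg_admin.stg_scd_cust_key_full_ix
  on stg_admin.stg_scd_customers_key
     (account_number, start_date, end_date, customer_key);
```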

Answer 3

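I suggest adding an index led by the most selective fields in your case:

 START_DATE, END_DATE

Try reversing the column order of the existing index (or adding a new one) so that it reads

 START_DATE, END_DATE, ACCOUNT_NUMBER

on table stg_scd_customers_key.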
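
A sketch of that reordered index, assuming a hypothetical index name and that it is added alongside the existing STG_SCD_CUST_KEY_PKP index instead of replacing it. Compare execution plans before and after to see whether the optimizer picks it up.

```sql
-- Hypothetical index name; date columns lead, account_number comes last.
create index stg_admin.stg_scd_cust_key_dates_ix
  on stg_admin.stg_scd_customers_key
     (start_date, end_date, account_number);
```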
