Efficient way to check whether data exists in a partitioned table

Asked: 2016-12-08 22:51:55

Tags: sql performance plsql oracle11g partitioning

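I am on Oracle 11g Enterprise Edition 11.2.0.4.0.

I have a table with about 12M rows per partition. It is partitioned by SnapshotDate.

I need to evaluate whether each snapshot from the last 15 days has any data.

Most of the answers I found online tell me to use `Row_Number() Over (Partition By SnapshotDate Order By SnapshotDate)`. This is the code I came up with (it only returns the dates that have values so far, so I still need a left join against my calendar table):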

With AllDates As
(
  /* 
      partition by snapshot date so that numbering starts over again
      i have to use an order by - it gave me an error without one
  */
  Select SnapshotDate, 1 HasData, Row_Number() Over (Partition By SnapshotDate Order By SnapshotDate) RowNumber
  From FactTable
  Where SnapshotDate IN
  (
    /* any mechanism that gives me the last 15 calendar days will do */
    Select CalendarTimeId
    From DimCalendar
    Where CalendarDate Between To_Date ('20161208', 'yyyymmdd') - 15 And To_Date ('20161208', 'yyyymmdd')
  )
)
Select *
From AllDates
Where RowNumber = 1;

However, ordering 12M rows for each of 15 days is very expensive - I am sorting 180 million rows to get 15 rows. This is the output I want:

Date          HasData
===========   =======
12/08/2016    1
12/07/2016    1
12/06/2016    0
12/05/2016    0
12/04/2016    1
12/03/2016    0
12/02/2016    1
12/01/2016    0    
etc etc

Is there a more efficient way to write a query like this?

1 answer:

Answer 0: (score: 1)

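I don't think there is a clean way to combine partition pruning and top-N reporting. The code below is ugly and repetitive, but it gets the job done quickly.

It reads each of the last 15 daily partitions, but `rownum = 1` lets each read stop after the first row, so it is fast. The dates can be replaced with bind variables, but the numbers 0 to 15 must be hard-coded. If you need a variable number of days, you can hard-code dozens or hundreds of subqueries and then use another bind variable to filter out the ones you don't need. Running hundreds of subqueries is not ideal, but it is still much faster than reading 180 million rows.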
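As a minimal sketch of the bind-variable variant described above, here are the first three branches of the inner union from the query that follows. The bind names `:end_date` and `:num_days` are hypothetical, and the date is passed as a 'yyyymmdd' string so the fragment can be tried from a client that cannot bind a DATE directly. It has not been tested against the real table, so check the plan before relying on it.

```sql
-- Sketch only. Hypothetical bind variables:
--   :end_date  the most recent day to check, as a 'yyyymmdd' string
--   :num_days  how many days back to look
-- The offsets 0, 1, 2 are still hard-coded; :num_days only switches off
-- the branches that fall outside the requested range.
select to_date(:end_date, 'yyyymmdd') - 0 the_date, 1 has_data
from FactTable
where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = to_date(:end_date, 'yyyymmdd') - 0)
  and :num_days >= 0
  and rownum = 1
union all
select to_date(:end_date, 'yyyymmdd') - 1 the_date, 1 has_data
from FactTable
where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = to_date(:end_date, 'yyyymmdd') - 1)
  and :num_days >= 1
  and rownum = 1
union all
select to_date(:end_date, 'yyyymmdd') - 2 the_date, 1 has_data
from FactTable
where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = to_date(:end_date, 'yyyymmdd') - 2)
  and :num_days >= 2
  and rownum = 1;
```

The remaining branches would follow the same pattern, and the outer DimCalendar range would use the same two binds in place of the literal date and the literal 15.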

Query

select CalendarDate, nvl(has_data, 0) has_data
from
(
    --The last 15 days.
    Select CalendarDate
    From DimCalendar
    Where CalendarDate Between To_Date ('20161208', 'yyyymmdd') - 15 And To_Date ('20161208', 'yyyymmdd')
) last_15_days
left join
(
    --The last 15 days of data, if any.
    select date '2016-12-08' - 0 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 0) and rownum = 1 union all
    select date '2016-12-08' - 1 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 1) and rownum = 1 union all
    select date '2016-12-08' - 2 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 2) and rownum = 1 union all
    select date '2016-12-08' - 3 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 3) and rownum = 1 union all
    select date '2016-12-08' - 4 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 4) and rownum = 1 union all
    select date '2016-12-08' - 5 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 5) and rownum = 1 union all
    select date '2016-12-08' - 6 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 6) and rownum = 1 union all
    select date '2016-12-08' - 7 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 7) and rownum = 1 union all
    select date '2016-12-08' - 8 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 8) and rownum = 1 union all
    select date '2016-12-08' - 9 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 9) and rownum = 1 union all
    select date '2016-12-08' - 10 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 10) and rownum = 1 union all
    select date '2016-12-08' - 11 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 11) and rownum = 1 union all
    select date '2016-12-08' - 12 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 12) and rownum = 1 union all
    select date '2016-12-08' - 13 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 13) and rownum = 1 union all
    select date '2016-12-08' - 14 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 14) and rownum = 1 union all
    select date '2016-12-08' - 15 the_date, 1 has_data from FactTable where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 15) and rownum = 1
) data_from_last_15_days
    on last_15_days.CalendarDate = data_from_last_15_days.the_date
order by CalendarDate desc;
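To check that a single branch behaves as described, one option is to look at its execution plan. The sketch below assumes the test schema from this answer and a session that can write to the default plan table. The things to look for are a COUNT STOPKEY step, which corresponds to `rownum = 1`, and the Pstart/Pstop columns on the FactTable access, which show which partitions can be touched. The exact plan depends on statistics and version, so treat this as a diagnostic rather than a guaranteed result.

```sql
-- Sketch only: explain one branch of the union and inspect its plan.
explain plan for
select date '2016-12-08' - 0 the_date, 1 has_data
from FactTable
where SnapshotDate in (Select CalendarDate From DimCalendar Where CalendarDate = date '2016-12-08' - 0)
  and rownum = 1;

-- Display the plan that was just explained.
select * from table(dbms_xplan.display);
```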

Test schema

create table FactTable
(
    id number,
    SnapshotDate date
) nologging
partition by range (SnapshotDate)
interval (interval '1' day)
(
    partition p1 values less than (date '2000-01-01')
);

create table DimCalendar
(
    CalendarDate date
);

--Add last year into calendar.
insert into DimCalendar
select date '2016-01-01' + (level - 1)
from dual
connect by level <= 365;


--Insert 1.2 million rows per day.
begin
    for i in 1 .. 15 loop
        insert /*+ append */ into facttable select level, date '2016-12-01' + i from dual connect by level <= 1200000;
        commit;
    end loop;
end;
/
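After loading, it can be worth confirming that the interval partitioning created one partition per day. A minimal sketch, assuming the table was created in the current schema without quoted identifiers (so it is stored as FACTTABLE):

```sql
-- Sketch only: list the partitions of the test table in order.
-- Expect the initial partition p1 followed by one system-named
-- partition for each day that the loop inserted.
select partition_name, high_value
from user_tab_partitions
where table_name = 'FACTTABLE'
order by partition_position;
```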