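Oracle: Consolidating rows

Time: 2015-10-07 12:34:24

Tags: oracle

I've stumbled across a problem that, unfortunately, I can't solve.

Situation: I have a database table that holds the complete data history of, for example, products. Every time an attribute (field) of a product changes, a new record is created and inserted into this table. Now I want to see only the status-change events.

Sample: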

PRODUCT_ID VALID_FROM          VALID_TO            STATUS
========   =================== =================== ======
08154711   09.07.2004 08:12:00 27.10.2005 08:00:00 STAT1
08154711   27.10.2005 08:01:00 24.05.2007 10:56:00 STAT1
08154711   24.05.2007 10:57:00 25.05.2007 12:20:00 STAT2
08154711   25.05.2007 12:21:00 30.05.2007 11:11:00 STAT2
08154711   30.05.2007 11:12:00 25.06.2007 09:49:00 STAT2
08154711   25.06.2007 09:50:00 25.06.2007 11:02:00 STAT1
08154711   25.06.2007 11:03:00 17.07.2007 09:28:00 STAT1
08154711   17.07.2007 09:29:00 02.09.2008 10:49:00 STAT1
08154711   02.09.2008 10:50:00 01.04.2010 07:56:00 STAT1
08154711   01.04.2010 07:57:00 06.04.2010 13:43:00 STAT2

The result should look like this:

PRODUCT_ID VALID_FROM          VALID_TO            STATUS
========   =================== =================== ======
08154711   09.07.2004 08:12:00 24.05.2007 10:56:00 STAT1
08154711   24.05.2007 10:57:00 25.06.2007 09:49:00 STAT2
08154711   25.06.2007 09:50:00 01.04.2010 07:56:00 STAT1
08154711   01.04.2010 07:57:00 06.04.2010 13:43:00 STAT2

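In other words, just merge each "block" of consecutive rows with the same status and take MIN(VALID_FROM) and MAX(VALID_TO) for each block.

Is this possible in plain SQL, or is doing it inside a function the only way?

Thanks in advance! Chris

3 answers:

Answer 0: (score: 2)

Assuming that, for a given product_id, there are no gaps between one row's valid_to and the next row's valid_from, you can use the Tabibitosan method to generate the groups: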

with sample_data as (select '08154711' product_id, to_date('09.07.2004 08:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('27.10.2005 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('27.10.2005 08:01:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.05.2007 10:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('24.05.2007 10:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.05.2007 12:20:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '08154711' product_id, to_date('25.05.2007 12:21:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('30.05.2007 11:11:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '08154711' product_id, to_date('30.05.2007 11:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 09:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '08154711' product_id, to_date('25.06.2007 09:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 11:02:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('25.06.2007 11:03:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('17.07.2007 09:28:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('17.07.2007 09:29:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.09.2008 10:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('02.09.2008 10:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('01.04.2010 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('06.04.2010 13:43:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual),
     tabibitosan as (select product_id,
                            valid_from,
                            valid_to,
                            status,
                            row_number() over (partition by product_id order by valid_from)
                              - row_number() over (partition by product_id, status order by valid_from) grp
                     from   sample_data)
select   product_id,
         min(valid_from) valid_from,
         max(valid_to) valid_to,
         status
from     tabibitosan
group by product_id,
         status,
         grp
order by product_id,
         valid_from;


PRODUCT_ID VALID_FROM            VALID_TO              STATUS
---------- --------------------- --------------------- ------
08154711   09/07/2004 08:12:00   24/05/2007 10:56:00   STAT1 
08154711   24/05/2007 10:57:00   25/06/2007 09:49:00   STAT2 
08154711   25/06/2007 09:50:00   01/04/2010 07:56:00   STAT1 
08154711   01/04/2010 07:57:00   06/04/2010 13:43:00   STAT2 
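
To see why the difference between the two row numbers identifies the blocks, it can help to look at the intermediate values. The following is a minimal sketch on made-up data; the demo CTE, its five rows, and the rn_all / rn_status column names are assumptions for illustration only, not part of the OP's table:

```sql
-- Hypothetical five-row sample: statuses run A, A, B, B, A in valid_from order.
with demo as (select 'P1' product_id, date '2020-01-01' valid_from, 'A' status from dual union all
              select 'P1' product_id, date '2020-01-02' valid_from, 'A' status from dual union all
              select 'P1' product_id, date '2020-01-03' valid_from, 'B' status from dual union all
              select 'P1' product_id, date '2020-01-04' valid_from, 'B' status from dual union all
              select 'P1' product_id, date '2020-01-05' valid_from, 'A' status from dual)
select   product_id,
         valid_from,
         status,
         -- position of the row within the whole product history
         row_number() over (partition by product_id order by valid_from) rn_all,
         -- position of the row among rows of the same status only
         row_number() over (partition by product_id, status order by valid_from) rn_status,
         -- stays constant within a run of consecutive rows that share a status
         row_number() over (partition by product_id order by valid_from)
           - row_number() over (partition by product_id, status order by valid_from) grp
from     demo
order by product_id,
         valid_from;
```

Here grp comes out as 0, 0, 2, 2, 2: the B run and the second A run happen to share grp = 2, which is why status has to stay in the GROUP BY alongside grp.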

Here is a modified version of sstan's answer that, I think, both meets the OP's requirements and takes breaks in the data into account:

with sample_data as (select '08154711' product_id, to_date('09.07.2004 08:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('27.10.2005 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('27.10.2005 08:01:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.05.2007 10:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('24.05.2007 10:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.05.2007 12:20:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '08154711' product_id, to_date('25.05.2007 12:21:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('30.05.2007 11:11:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '08154711' product_id, to_date('30.05.2007 11:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 09:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '08154711' product_id, to_date('25.06.2007 09:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 11:02:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('25.06.2007 11:03:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('17.07.2007 09:28:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('17.07.2007 09:29:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.09.2008 10:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('02.09.2008 10:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '08154711' product_id, to_date('01.04.2010 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('06.04.2010 13:43:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '11111111' product_id, to_date('10.07.2004 10:42:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('21.10.2005 14:35:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '11111111' product_id, to_date('21.10.2005 14:36:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.11.2005 16:18:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '11111111' product_id, to_date('01.01.2006 06:45:14','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('03.01.2006 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
                     select '11111111' product_id, to_date('03.01.2006 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:59:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '11111111' product_id, to_date('01.04.2010 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('04.07.2010 13:05:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
                     select '11111111' product_id, to_date('04.07.2010 13:06:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.09.2011 07:50:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual),
             res as (select product_id,
                            valid_from,
                            valid_to,
                            status,
                            lag(valid_to) over (partition by product_id order by valid_from) prev_valid_to,
                            lag(status) over (partition by product_id order by valid_from) as prev_status
                     from   sample_data),
       final_res as (select product_id,
                            valid_from,
                            valid_to,
                            status,
                            sum(case when valid_from - prev_valid_to = 1/(24*60)
                                            and status = prev_status
                                          then 0 
                                     else 1
                                end) over (partition by product_id order by valid_from) as grouping_id
                     from   res)
select   product_id,
         min(valid_from) as valid_from,
         max(valid_to) as valid_to,
         status
from     final_res
group by grouping_id,
         product_id,
         status
order by product_id,
         grouping_id;

PRODUCT_ID VALID_FROM            VALID_TO              STATUS
---------- --------------------- --------------------- ------
08154711   09/07/2004 08:12:00   24/05/2007 10:56:00   STAT1 
08154711   24/05/2007 10:57:00   25/06/2007 09:49:00   STAT2 
08154711   25/06/2007 09:50:00   01/04/2010 07:56:00   STAT1 
08154711   01/04/2010 07:57:00   06/04/2010 13:43:00   STAT2 
11111111   10/07/2004 10:42:00   24/11/2005 16:18:00   STAT1 
11111111   01/01/2006 06:45:14   03/01/2006 07:56:00   STAT1 
11111111   03/01/2006 07:57:00   04/07/2010 13:05:00   STAT2 
11111111   04/07/2010 13:06:00   01/09/2011 07:50:00   STAT1 

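If there will never be any gaps between the rows for a given product ID, then I'd suggest that the original Tabibitosan answer will be more efficient, since it needs only one set of analytic queries rather than the two required in the modified version of sstan's answer.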
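
The gap test in the modified query, valid_from - prev_valid_to = 1/(24*60), checks a date difference against a fractional day count for exact equality. If you would rather keep the comparison in date arithmetic, one possible variant (an assumption on my part, not something taken from either answer) is to add an INTERVAL literal to the previous valid_to. The demo CTE and its three rows are made up for illustration:

```sql
-- Hypothetical sample: row 2 starts one minute after row 1 ends, row 3 follows a gap.
with demo as (select 'P1' product_id, to_date('01.01.2020 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.01.2020 08:59:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'A' status from dual union all
              select 'P1' product_id, to_date('01.01.2020 09:00:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.01.2020 09:59:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'A' status from dual union all
              select 'P1' product_id, to_date('02.01.2020 09:00:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.01.2020 09:59:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'A' status from dual),
      res as (select product_id,
                     valid_from,
                     valid_to,
                     status,
                     lag(valid_to) over (partition by product_id order by valid_from) prev_valid_to,
                     lag(status) over (partition by product_id order by valid_from) prev_status
              from   demo)
select   product_id,
         valid_from,
         valid_to,
         status,
         -- a row continues the current block only if it starts exactly one minute
         -- after the previous row ended and carries the same status
         sum(case when valid_from = prev_valid_to + interval '1' minute
                         and status = prev_status
                       then 0
                  else 1
             end) over (partition by product_id order by valid_from) as grouping_id
from     res
order by product_id,
         valid_from;
```

With this sample, grouping_id is 1, 1, 2: the first two rows form one block and the row after the gap starts a new one.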

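Answer 1: (score: 0)

I'm going to make a few assumptions that aren't immediately clear from your post:

  • The order of the rows is determined by the valid_from value.
  • "Blocks" are defined as consecutive rows with the same product_id/status combination.

If so, the following query uses a combination of lag and a cumulative sum to create the "blocks", which you can then group by to get the result.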

select product_id,
       min(valid_from) as valid_from,
       max(valid_to) as valid_to,
       status
  from (select product_id,
               valid_from,
               valid_to,
               status,
               sum(case when product_id = prev_product_id and status = prev_status
                        then 0 else 1 end) over (order by valid_from) as grouping_id
          from (select product_id,
                       lag(product_id) over (order by valid_from) as prev_product_id,
                       valid_from,
                       valid_to,
                       status,
                       lag(status) over (order by valid_from) as prev_status
                  from table_name))
 group by grouping_id, product_id, status
 order by grouping_id

Answer 2: (score: 0)

  WITH w_data AS (
           select '08154711' product_id, to_date('09.07.2004 08:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('27.10.2005 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
           select '08154711' product_id, to_date('27.10.2005 08:01:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.05.2007 10:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
           select '08154711' product_id, to_date('24.05.2007 10:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.05.2007 12:20:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
           select '08154711' product_id, to_date('25.05.2007 12:21:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('30.05.2007 11:11:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
           select '08154711' product_id, to_date('30.05.2007 11:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 09:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
           select '08154711' product_id, to_date('25.06.2007 09:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 11:02:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
           select '08154711' product_id, to_date('25.06.2007 11:03:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('17.07.2007 09:28:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
           select '08154711' product_id, to_date('17.07.2007 09:29:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.09.2008 10:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
           select '08154711' product_id, to_date('02.09.2008 10:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
           select '08154711' product_id, to_date('01.04.2010 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('06.04.2010 13:43:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual
           ),
        w_grp1 as (
           select product_id, valid_from, valid_to, status,
                  case when status = lag(status) over (partition by product_id order by valid_from, valid_to)
                       then 0 else 1 end grps
             from w_data
           ),
        w_groups as (
           select product_id, valid_from, valid_to, status,
                   sum(grps) over (partition by product_id order by valid_from, valid_to) grp_id
             from w_grp1
           ),
        w_sub as (
           select product_id,
                    first_value(valid_from) over 
                       (partition by product_id, grp_id order by valid_from,      valid_to     ) valid_from,
                    first_value(valid_to)   over 
                       (partition by product_id, grp_id order by valid_from desc, valid_to desc) valid_to,
                    first_value(status)     over 
                       (partition by product_id, grp_id order by valid_from,      valid_to     ) status,
                    row_number() over (partition by product_id, grp_id order by valid_from, valid_to) rn
             from w_groups
           )
  select product_id, valid_from, valid_to, status
    from w_sub
   where rn = 1
  /

  PRODUCT_ VALID_FROM           VALID_TO             STATU
  -------- -------------------- -------------------- -----
  08154711 09-jul-2004 08:12:00 24-may-2007 10:56:00 STAT1
  08154711 24-may-2007 10:57:00 25-jun-2007 09:49:00 STAT2
  08154711 25-jun-2007 09:50:00 01-apr-2010 07:56:00 STAT1
  08154711 01-apr-2010 07:57:00 06-apr-2010 13:43:00 STAT2

  4 rows selected.

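Explanation:

Run the query one "layer" at a time to see what each one does; it should become fairly obvious what's going on.

w_data is just the setup of the sample data.

w_grp1 pinpoints the first member of each "group". I'm assuming a group starts whenever a record's status differs from the status of the previous record (LAG).

w_groups then uses a windowed SUM analytic to create the actual "grp_id", which we can use as the group later on.

w_sub is just a kind of "subtotal" logic, in which I pick the values you want from the rows (i.e. the first valid_from date, the last valid_to date, etc.). Note the use of FIRST_VALUE with a DESC order to pick the last date ;) (try using LAST_VALUE and see what happens... it won't work the way you'd like ;)). Also, at this point we number the rows of each group with the "rn" column so that we can pick the first one later.

The final query just picks out the first row of each "group", giving you the final result.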
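
On the LAST_VALUE remark above: when the analytic clause has an ORDER BY but no explicit windowing clause, Oracle's default window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE only looks as far as the current row. A small sketch of the difference follows; the demo CTE and the last_default / last_full column names are made up for illustration:

```sql
-- Hypothetical single group of three rows.
with demo as (select 1 grp_id, date '2020-01-01' valid_from, date '2020-01-02' valid_to from dual union all
              select 1 grp_id, date '2020-01-03' valid_from, date '2020-01-04' valid_to from dual union all
              select 1 grp_id, date '2020-01-05' valid_from, date '2020-01-06' valid_to from dual)
select   grp_id,
         valid_from,
         -- default window ends at the current row, so each row gets its own valid_to back
         last_value(valid_to) over (partition by grp_id order by valid_from) last_default,
         -- explicit window spanning the whole partition gives the group's final valid_to
         last_value(valid_to) over (partition by grp_id order by valid_from
                                    rows between unbounded preceding and unbounded following) last_full
from     demo
order by grp_id,
         valid_from;
```

So LAST_VALUE can stand in for the FIRST_VALUE ... DESC trick, but only once the window is widened explicitly; which form reads better is largely a matter of taste.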