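I've stumbled upon a problem that, unfortunately, I can't solve.
Situation: I have a database table that holds the complete data history of, for example, products. Every time an attribute (field) of a product changes, a new record is created and inserted into this table. Now I only want to see the status-change events.
Sample: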
PRODUCT_ID  VALID_FROM           VALID_TO             STATUS
==========  ===================  ===================  ======
08154711    09.07.2004 08:12:00  27.10.2005 08:00:00  STAT1
08154711    27.10.2005 08:01:00  24.05.2007 10:56:00  STAT1
08154711    24.05.2007 10:57:00  25.05.2007 12:20:00  STAT2
08154711    25.05.2007 12:21:00  30.05.2007 11:11:00  STAT2
08154711    30.05.2007 11:12:00  25.06.2007 09:49:00  STAT2
08154711    25.06.2007 09:50:00  25.06.2007 11:02:00  STAT1
08154711    25.06.2007 11:03:00  17.07.2007 09:28:00  STAT1
08154711    17.07.2007 09:29:00  02.09.2008 10:49:00  STAT1
08154711    02.09.2008 10:50:00  01.04.2010 07:56:00  STAT1
08154711    01.04.2010 07:57:00  06.04.2010 13:43:00  STAT2
The result should look like this:
PRODUCT_ID  VALID_FROM           VALID_TO             STATUS
==========  ===================  ===================  ======
08154711    09.07.2004 08:12:00  24.05.2007 10:56:00  STAT1
08154711    24.05.2007 10:57:00  25.06.2007 09:49:00  STAT2
08154711    25.06.2007 09:50:00  01.04.2010 07:56:00  STAT1
08154711    01.04.2010 07:57:00  06.04.2010 13:43:00  STAT2
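In other words: merge each "block" of consecutive rows that share a status, and take MIN(VALID_FROM) and MAX(VALID_TO) for each block.
Is this possible in plain SQL, or is doing it inside a function the only way?
Thanks in advance! Chris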
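Answer 0: (score: 2)
Assuming that, for a given product_id, there are no gaps between the valid_to of one row and the valid_from of the next, you can use the Tabibitosan method to generate the groups: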
with sample_data as (select '08154711' product_id, to_date('09.07.2004 08:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('27.10.2005 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('27.10.2005 08:01:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.05.2007 10:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('24.05.2007 10:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.05.2007 12:20:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('25.05.2007 12:21:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('30.05.2007 11:11:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('30.05.2007 11:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 09:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('25.06.2007 09:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 11:02:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('25.06.2007 11:03:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('17.07.2007 09:28:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('17.07.2007 09:29:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.09.2008 10:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('02.09.2008 10:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('01.04.2010 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('06.04.2010 13:43:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual),
tabibitosan as (select product_id,
                       valid_from,
                       valid_to,
                       status,
                       row_number() over (partition by product_id order by valid_from)
                         - row_number() over (partition by product_id, status order by valid_from) grp
                from   sample_data)
select product_id,
       min(valid_from) valid_from,
       max(valid_to) valid_to,
       status
from   tabibitosan
group by product_id,
         status,
         grp
order by product_id,
         valid_from;
PRODUCT_ID VALID_FROM VALID_TO STATUS
---------- --------------------- --------------------- ------
08154711 09/07/2004 08:12:00 24/05/2007 10:56:00 STAT1
08154711 24/05/2007 10:57:00 25/06/2007 09:49:00 STAT2
08154711 25/06/2007 09:50:00 01/04/2010 07:56:00 STAT1
08154711 01/04/2010 07:57:00 06/04/2010 13:43:00 STAT2
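To see why the difference of the two row numbers works as a group key, it can help to look at the intermediate values. The following is only a sketch on a hypothetical four-row sample (the product id 'P1', the dates, and the column names rn_all and rn_status are made up for illustration):

```sql
-- Hypothetical sample: STAT1, STAT1, STAT2, STAT1 for a single product.
with sample_data as (
  select 'P1' product_id, date '2020-01-01' valid_from, 'STAT1' status from dual union all
  select 'P1', date '2020-01-02', 'STAT1' from dual union all
  select 'P1', date '2020-01-03', 'STAT2' from dual union all
  select 'P1', date '2020-01-04', 'STAT1' from dual)
select product_id,
       valid_from,
       status,
       -- position of the row within the product
       row_number() over (partition by product_id order by valid_from) rn_all,
       -- position of the row within the product AND status
       row_number() over (partition by product_id, status order by valid_from) rn_status,
       -- the difference stays constant while the status does not change
       row_number() over (partition by product_id order by valid_from)
         - row_number() over (partition by product_id, status order by valid_from) grp
from   sample_data
order by product_id,
         valid_from;
```

For this sample rn_all is 1, 2, 3, 4 and rn_status is 1, 2, 1, 3, so grp comes out as 0, 0, 2, 1. The grp value on its own is not guaranteed to be unique across statuses, which is why the main query groups by product_id, status and grp together.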
Here is a modified version of sstan's answer that, I think, both meets the OP's requirements and accounts for gaps in the data:
with sample_data as (select '08154711' product_id, to_date('09.07.2004 08:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('27.10.2005 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('27.10.2005 08:01:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.05.2007 10:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('24.05.2007 10:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.05.2007 12:20:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('25.05.2007 12:21:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('30.05.2007 11:11:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('30.05.2007 11:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 09:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('25.06.2007 09:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 11:02:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('25.06.2007 11:03:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('17.07.2007 09:28:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('17.07.2007 09:29:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.09.2008 10:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('02.09.2008 10:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('01.04.2010 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('06.04.2010 13:43:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '11111111' product_id, to_date('10.07.2004 10:42:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('21.10.2005 14:35:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '11111111' product_id, to_date('21.10.2005 14:36:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.11.2005 16:18:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '11111111' product_id, to_date('01.01.2006 06:45:14','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('03.01.2006 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '11111111' product_id, to_date('03.01.2006 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:59:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '11111111' product_id, to_date('01.04.2010 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('04.07.2010 13:05:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '11111111' product_id, to_date('04.07.2010 13:06:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.09.2011 07:50:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual),
res as (select product_id,
               valid_from,
               valid_to,
               status,
               lag(valid_to) over (partition by product_id order by valid_from) prev_valid_to,
               lag(status) over (partition by product_id order by valid_from) as prev_status
        from   sample_data),
final_res as (select product_id,
                     valid_from,
                     valid_to,
                     status,
                     sum(case when valid_from - prev_valid_to = 1/(24*60)
                                   and status = prev_status
                              then 0
                              else 1
                         end) over (partition by product_id order by valid_from) as grouping_id
              from   res)
select product_id,
       min(valid_from) as valid_from,
       max(valid_to) as valid_to,
       status
from   final_res
group by grouping_id,
         product_id,
         status
order by product_id,
         grouping_id;
PRODUCT_ID VALID_FROM VALID_TO STATUS
---------- --------------------- --------------------- ------
08154711 09/07/2004 08:12:00 24/05/2007 10:56:00 STAT1
08154711 24/05/2007 10:57:00 25/06/2007 09:49:00 STAT2
08154711 25/06/2007 09:50:00 01/04/2010 07:56:00 STAT1
08154711 01/04/2010 07:57:00 06/04/2010 13:43:00 STAT2
11111111 10/07/2004 10:42:00 24/11/2005 16:18:00 STAT1
11111111 01/01/2006 06:45:14 03/01/2006 07:56:00 STAT1
11111111 03/01/2006 07:57:00 04/07/2010 13:05:00 STAT2
11111111 04/07/2010 13:06:00 01/09/2011 07:50:00 STAT1
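If there will never be any gaps between the rows for a given product ID, then I'd suggest the original Tabibitosan answer is the more efficient one, since it needs only one set of analytic functions rather than the two required by the modified version of sstan's answer.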
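The gap test in the modified query compares a date difference with the fraction 1/(24*60), i.e. one minute expressed in days. As a possible alternative (my assumption, not something the OP asked for), the same check can be written with an interval literal, which some people find easier to read. A minimal sketch on two hypothetical pairs of timestamps (the pairs CTE and the continuity column are made up for illustration):

```sql
-- Hypothetical pairs: the first is exactly one minute apart, the second has a gap.
with pairs as (
  select to_date('24.11.2005 16:18:00','dd.mm.yyyy hh24:mi:ss') prev_valid_to,
         to_date('24.11.2005 16:19:00','dd.mm.yyyy hh24:mi:ss') valid_from
  from   dual union all
  select to_date('24.11.2005 16:18:00','dd.mm.yyyy hh24:mi:ss'),
         to_date('01.01.2006 06:45:14','dd.mm.yyyy hh24:mi:ss')
  from   dual)
select prev_valid_to,
       valid_from,
       -- adding an interval to a DATE yields a DATE, so this is a plain date comparison
       case when valid_from = prev_valid_to + interval '1' minute
            then 'contiguous'
            else 'gap'
       end as continuity
from   pairs;
```

Either form assumes that consecutive rows are always exactly one minute apart, as they are in the sample data; if your real data uses a different step, the condition needs adjusting.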
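Answer 1: (score: 0)
I'll make a few assumptions that aren't immediately clear from your post:
- The order of the rows is determined by the valid_from value.
- You want to group consecutive rows that share the same product_id/status combination.
If so, the following query uses a combination of lag and a cumulative sum to create the "blocks", which you can then group by to get your result.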
select product_id,
       min(valid_from) as valid_from,
       max(valid_to) as valid_to,
       status
from (select product_id,
             valid_from,
             valid_to,
             status,
             sum(case when product_id = prev_product_id and status = prev_status
                      then 0 else 1 end) over (order by valid_from) as grouping_id
      from (select product_id,
                   lag(product_id) over (order by valid_from) as prev_product_id,
                   valid_from,
                   valid_to,
                   status,
                   lag(status) over (order by valid_from) as prev_status
            from table_name))
group by grouping_id, product_id, status
order by grouping_id
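To make the lag-plus-running-sum idea more concrete, here is a small sketch that shows the change flag and the resulting grouping_id row by row. The four-row sample (product 'P1', the dates) and the column name is_new_block are hypothetical; the logic is the same as in the query above.

```sql
-- Hypothetical sample: STAT1, STAT1, STAT2, STAT1 for a single product.
with table_name as (
  select 'P1' product_id, date '2020-01-01' valid_from, 'STAT1' status from dual union all
  select 'P1', date '2020-01-02', 'STAT1' from dual union all
  select 'P1', date '2020-01-03', 'STAT2' from dual union all
  select 'P1', date '2020-01-04', 'STAT1' from dual)
select product_id,
       valid_from,
       status,
       is_new_block,
       -- the running total of the flags numbers the blocks 1, 2, 3, ...
       sum(is_new_block) over (order by valid_from) as grouping_id
from (select product_id,
             valid_from,
             status,
             -- 1 when the row starts a new block; on the first row lag() is NULL,
             -- the comparison is not true, and the ELSE branch yields 1
             case when product_id = lag(product_id) over (order by valid_from)
                   and status = lag(status) over (order by valid_from)
                  then 0 else 1 end as is_new_block
      from   table_name)
order by valid_from;
```

For this sample the flags are 1, 0, 1, 1 and grouping_id is 1, 1, 2, 3. The inline view is needed because Oracle does not allow one analytic function to be nested directly inside another.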
Answer 2: (score: 0)
WITH w_data AS (
select '08154711' product_id, to_date('09.07.2004 08:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('27.10.2005 08:00:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('27.10.2005 08:01:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('24.05.2007 10:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('24.05.2007 10:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.05.2007 12:20:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('25.05.2007 12:21:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('30.05.2007 11:11:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('30.05.2007 11:12:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 09:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual union all
select '08154711' product_id, to_date('25.06.2007 09:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('25.06.2007 11:02:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('25.06.2007 11:03:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('17.07.2007 09:28:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('17.07.2007 09:29:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('02.09.2008 10:49:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('02.09.2008 10:50:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('01.04.2010 07:56:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT1' status from dual union all
select '08154711' product_id, to_date('01.04.2010 07:57:00','dd.mm.yyyy hh24:mi:ss') valid_from, to_date('06.04.2010 13:43:00','dd.mm.yyyy hh24:mi:ss') valid_to, 'STAT2' status from dual
),
w_grp1 as (
  select product_id, valid_from, valid_to, status,
         case when status = lag(status) over (partition by product_id order by valid_from, valid_to)
              then 0 else 1 end grps
  from   w_data
),
w_groups as (
  select product_id, valid_from, valid_to, status,
         sum(grps) over (partition by product_id order by valid_from, valid_to) grp_id
  from   w_grp1
),
w_sub as (
  select product_id,
         first_value(valid_from) over
           (partition by product_id, grp_id order by valid_from, valid_to) valid_from,
         first_value(valid_to) over
           (partition by product_id, grp_id order by valid_from desc, valid_to desc) valid_to,
         first_value(status) over
           (partition by product_id, grp_id order by valid_from, valid_to) status,
         row_number() over (partition by product_id, grp_id order by valid_from, valid_to) rn
  from   w_groups
)
select product_id, valid_from, valid_to, status
from   w_sub
where  rn = 1
/
PRODUCT_ VALID_FROM VALID_TO STATU
-------- -------------------- -------------------- -----
08154711 09-jul-2004 08:12:00 24-may-2007 10:56:00 STAT1
08154711 24-may-2007 10:57:00 25-jun-2007 09:49:00 STAT2
08154711 25-jun-2007 09:50:00 01-apr-2010 07:56:00 STAT1
08154711 01-apr-2010 07:57:00 06-apr-2010 13:43:00 STAT2
4 rows selected.
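Explanation:
Run the query one "layer" at a time to see what each one does, and it should become fairly obvious what's happening.
w_data is just the setup of the sample data.
w_grp1 pinpoints the first member of each "group". I'm assuming that is any record whose status differs from the status of the previous record (LAG).
w_groups then uses a windowed SUM analytic to create the actual "grp_id" that we can use as the group later.
w_sub is just a kind of "subtotal" logic, in that I pick out the values you want from the rows (i.e. the first valid_from date, the last valid_to date, etc.). Note the use of FIRST_VALUE with a DESC order to pick the last date ;) (try LAST_VALUE and see what happens ... it won't work the way you'd want ;)). Also at this point, we number the rows of each group with the "rn" column so that we can pick out the first one later.
The final query just picks out the first row of each "group", giving you the final result.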
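On the LAST_VALUE remark: when an analytic function has an ORDER BY but no explicit windowing clause, the default window runs from the start of the partition up to the current row, so LAST_VALUE simply returns the current row's value. A minimal sketch on a hypothetical three-row group (the dates and the column names lv_default and lv_full are made up for illustration):

```sql
-- Hypothetical single group with three consecutive periods.
with w_groups as (
  select 1 grp_id, date '2020-01-01' valid_from, date '2020-01-02' valid_to from dual union all
  select 1, date '2020-01-03', date '2020-01-04' from dual union all
  select 1, date '2020-01-05', date '2020-01-06' from dual)
select valid_from,
       valid_to,
       -- default window ends at the current row, so this just echoes valid_to
       last_value(valid_to) over (partition by grp_id order by valid_from) lv_default,
       -- an explicit window over the whole partition returns the true last value
       last_value(valid_to) over (partition by grp_id order by valid_from
                                  rows between unbounded preceding
                                           and unbounded following) lv_full
from   w_groups
order by valid_from;
```

Here lv_default differs on every row, while lv_full is 2020-01-06 throughout. So spelling out the window is another way to get the last valid_to, if you prefer that to FIRST_VALUE with a descending order.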