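I'm currently trying to work out a join between history tables in which I want to synchronize two timelines. As an example, I have the following two tables: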
A
ID Value FROM TO
1 5 01.01.2018 31.03.2018
1 6 31.03.2018 08.04.2018
B
ID A_FK Value FROM TO
1 1 50 01.02.2018 01.04.2018
2 1 51 04.04.2018 10.04.2018
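As a baseline I want to take the timeline of table A and join table B onto it, including NULL values, so that I know for which periods there is no matching value. The desired result should look like this: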
C
Value_A Value_B FROM TO
5 NULL 01.01.2018 01.02.2018
5 50 01.02.2018 31.03.2018
6 50 31.03.2018 01.04.2018
6 NULL 01.04.2018 04.04.2018
6 51 04.04.2018 08.04.2018
Can you help me solve this? I've made a start, but I can't get the histories to line up correctly. Here is my attempt:
with a as (SELECT *
FROM (VALUES (1,5,'01.01.2018','31.03.2018')
, (1,6,'31.03.2018','08.04.2018')
) A (ID, VALUE, FROM, TO)),
b as (
SELECT *
FROM (VALUES (1,1,50,'01.02.2018','01.04.2018')
, (2,1,51,'04.04.2018','10.04.2018')
) A (ID,A_FK, VALUE, FROM, TO)
)
select
a.value as value_a,
b.value as value_b,
max(a.from,b.from) as from,
min(a.to,b.to) as to
from a
left outer join b on
a.id = b.a_fk and
a.from < b.to and
a.to > b.from;
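As you can see, it is aligned, but not in the way I expected.
Thanks for your help.
Answer 0 (score: 1)
As I mentioned in the comments, you can solve this with the technique from my own answer to another question.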
Here is a solution.
Test data:
create table a (
id integer,
value integer,
dtfrom date,
dtto date
);
create table b(
id integer,
a_fk integer,
value integer,
dtfrom date,
dtto date
);
insert into a values
(1, 5, '2018-01-01', '2018-03-31'),
(1, 6, '2018-03-31', '2018-04-08');
insert into b values
(1, 1, 50, '2018-02-01', '2018-04-01'),
(2, 1, 51, '2018-04-04', '2018-04-10');
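The tricky part of this solution is generating the date intervals that occur across both tables, such as 01.01.2018-01.02.2018 and 01.02.2018-31.03.2018. To do that you need every available boundary date in one place, as a single table, so I created a VIEW named timmings to make this easier: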
create or replace view timmings as
select a.dtfrom dt from a inner join b on a.id=b.a_fk
union
select a.dtto from a inner join b on a.id=b.a_fk
union
select b.dtfrom from a inner join b on a.id=b.a_fk
union
select b.dtto from a inner join b on a.id=b.a_fk;
After that you need a query that generates all available periods (start and end), so it will be:
select t1.dt as start,
(select min(t2.dt)
from timmings t2
where t2.dt>t1.dt) as dend
from timmings t1
order by start;
This results in (with your sample data):
start dend
01/01/2018 01/02/2018
01/02/2018 31/03/2018
31/03/2018 01/04/2018
01/04/2018 04/04/2018
04/04/2018 08/04/2018
08/04/2018 10/04/2018
10/04/2018 null
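To make the period-building step easier to follow outside the database, here is a minimal Python sketch of the same idea: collect every boundary date from both tables, sort them, and pair each date with the next larger one. The helper name elementary_periods and the tuple layouts are assumptions made for illustration, and the sketch skips the inner join on a.id = b.a_fk that the view uses, which makes no difference for this sample data.

```python
from datetime import date

# Sample rows from the test data above.
# a: (id, value, dtfrom, dtto)   b: (id, a_fk, value, dtfrom, dtto)
a_rows = [
    (1, 5, date(2018, 1, 1), date(2018, 3, 31)),
    (1, 6, date(2018, 3, 31), date(2018, 4, 8)),
]
b_rows = [
    (1, 1, 50, date(2018, 2, 1), date(2018, 4, 1)),
    (2, 1, 51, date(2018, 4, 4), date(2018, 4, 10)),
]


def elementary_periods(a_rows, b_rows):
    """Hypothetical helper: pair each boundary date with the next larger one."""
    boundaries = set()
    for _id, _value, dtfrom, dtto in a_rows:
        boundaries.update((dtfrom, dtto))
    for _id, _a_fk, _value, dtfrom, dtto in b_rows:
        boundaries.update((dtfrom, dtto))
    ordered = sorted(boundaries)
    # The last boundary has no successor, which mirrors the NULL dend row.
    return list(zip(ordered, ordered[1:] + [None]))


periods = elementary_periods(a_rows, b_rows)
for start, dend in periods:
    print(start, dend)
```

The set plays the role of the UNION in the view (duplicates such as 31.03.2018 collapse to one entry), and the zip plays the role of the correlated min(t2.dt) subquery.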
Using that, you can get all the available values from table a that intersect those periods:
select a.id, a.value, tm.start, tm.dend
from (select t1.dt as start,
(select min(t2.dt)
from timmings t2
where t2.dt>t1.dt) as dend
from timmings t1) tm
left join a on tm.start >= a.dtfrom and tm.dend <= a.dtto
where a.id is not null
order by tm.start;
The result is:
id value start dend
1 5 01/01/2018 01/02/2018
1 5 01/02/2018 31/03/2018
1 6 31/03/2018 01/04/2018
1 6 01/04/2018 04/04/2018
1 6 04/04/2018 08/04/2018
Finally, you LEFT JOIN that with table b:
select x.value as valueA,
b.value as valueB,
x.start as "from",
x.dend as "to"
from (select a.id, a.value, tm.start, tm.dend
from (select t1.dt as start,
(select min(t2.dt)
from timmings t2
where t2.dt>t1.dt) as dend
from timmings t1) tm
left join a on tm.start >= a.dtfrom and tm.dend <= a.dtto
where a.id is not null
) x
left join b on b.a_fk = x.id
and b.dtfrom <= x.start
and b.dtto >= x.dend
order by x.start;
This gives you the desired result:
valueA valueB from to
5 null 01/01/2018 01/02/2018
5 50 01/02/2018 31/03/2018
6 50 31/03/2018 01/04/2018
6 null 01/04/2018 04/04/2018
6 51 04/04/2018 08/04/2018
See the final solution here: http://sqlfiddle.com/#!9/36418e/1 The fiddle runs on MySQL, but since it is all ANSI SQL it should work fine in DB2 as well.
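If it helps to check the logic without a database, the whole pipeline can be sketched in plain Python. This is only an illustration of the same three steps (boundary dates, periods covered by a row of a, left join to b), not a replacement for the SQL; the function name synchronize and the tuple layouts are assumptions.

```python
from datetime import date

# a: (id, value, dtfrom, dtto)   b: (id, a_fk, value, dtfrom, dtto)
a_rows = [
    (1, 5, date(2018, 1, 1), date(2018, 3, 31)),
    (1, 6, date(2018, 3, 31), date(2018, 4, 8)),
]
b_rows = [
    (1, 1, 50, date(2018, 2, 1), date(2018, 4, 1)),
    (2, 1, 51, date(2018, 4, 4), date(2018, 4, 10)),
]


def synchronize(a_rows, b_rows):
    """Hypothetical helper mirroring the final query: A's timeline, B left-joined."""
    boundaries = sorted(
        {d for row in a_rows for d in row[2:4]}
        | {d for row in b_rows for d in row[3:5]}
    )
    result = []
    # Consecutive boundary dates form the elementary periods.
    for start, dend in zip(boundaries, boundaries[1:]):
        for a_id, a_value, a_from, a_to in a_rows:
            # Keep the period only if this row of a fully contains it.
            if not (a_from <= start and dend <= a_to):
                continue
            # LEFT JOIN: rows of b for the same key that contain the period.
            matches = [
                b_value
                for _b_id, a_fk, b_value, b_from, b_to in b_rows
                if a_fk == a_id and b_from <= start and dend <= b_to
            ]
            # No match still yields one row, with None standing in for NULL.
            for b_value in matches or [None]:
                result.append((a_value, b_value, start, dend))
    return result


rows = synchronize(a_rows, b_rows)
for row in rows:
    print(row)
```

With the sample data this yields the same five rows as the table above; the period 08.04.2018-10.04.2018 is dropped because no row of a covers it.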
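Answer 1 (score: 0)
There is a great blog post, "Fun with Date Ranges" by John Maenpaa.
Second, if you have a chance to influence the DDL, I recommend taking a close look at Db2 temporal tables; they provide full SQL support (Time Travel SQL). Find the details here.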
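Answer 2 (score: 0)
This is actually quite simple if you have what is called a calendar table, a table containing every date, although you can build one on the fly if you need to. (You want one anyway, since they are among the most useful analytical dimension tables.) You can use it to turn this more obviously into a gaps-and-islands problem: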
SELECT valueA, valueB,
MIN(calendarDate) AS startDate,
MAX(calendarDate) + 1 DAY AS endDate
FROM (SELECT A.val AS valueA, B.val AS valueB, Calendar.calendarDate,
ROW_NUMBER() OVER(ORDER BY Calendar.calendarDate) -
ROW_NUMBER() OVER(PARTITION BY A.val, B.val ORDER BY Calendar.calendarDate) AS grouping
FROM Calendar
LEFT JOIN A
ON A.startDate <= Calendar.calendarDate
AND A.endDate > Calendar.calendarDate
LEFT JOIN B
ON B.startDate <= Calendar.calendarDate
AND B.endDate > Calendar.calendarDate
WHERE A.val IS NOT NULL
OR B.val IS NOT NULL) Groups
GROUP BY valueA, valueB, grouping
ORDER BY grouping
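SQL Fiddle Example (with minor tweaks for SQL Server in the example)
...which produces the following result. Note that table B has a date range covering a few days that do not appear in table A!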
valueA valueB startDate endDate
5 (null) 2018-01-01 2018-02-01
5 50 2018-02-01 2018-03-31
6 50 2018-03-31 2018-04-01
6 (null) 2018-04-01 2018-04-04
6 51 2018-04-04 2018-04-08
(null) 51 2018-04-08 2018-04-10
(This can of course be changed trivially by switching the joins to regular INNER JOINs, but I think this and the other cases are important.)
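For readers who want to trace the calendar idea by hand, here is a rough Python sketch of it: label every calendar day with the values in effect, then merge consecutive days that carry the same pair of values. The names value_on and merge_by_calendar, the (val, startDate, endDate) tuple layout, and the chosen calendar bounds are assumptions for illustration. It treats ranges as half-open, like the query, and, also like the query, it does not correlate B to A through the foreign key.

```python
from datetime import date, timedelta

# (val, startDate, endDate), with endDate exclusive as in the query above.
a_rows = [
    (5, date(2018, 1, 1), date(2018, 3, 31)),
    (6, date(2018, 3, 31), date(2018, 4, 8)),
]
b_rows = [
    (50, date(2018, 2, 1), date(2018, 4, 1)),
    (51, date(2018, 4, 4), date(2018, 4, 10)),
]


def value_on(rows, day):
    """Return the value whose half-open range [start, end) contains day."""
    for val, start, end in rows:
        if start <= day < end:
            return val
    return None


def merge_by_calendar(a_rows, b_rows, first_day, last_day):
    """Hypothetical helper: one output row per run of identical (A, B) values."""
    result = []
    day = first_day
    while day <= last_day:
        key = (value_on(a_rows, day), value_on(b_rows, day))
        next_day = day + timedelta(days=1)
        if key != (None, None):  # same filter as the WHERE clause
            if result and result[-1][:2] == key and result[-1][3] == day:
                # Same values as the previous day: extend the current island.
                result[-1] = key + (result[-1][2], next_day)
            else:
                result.append(key + (day, next_day))
        day = next_day
    return result


islands = merge_by_calendar(a_rows, b_rows, date(2018, 1, 1), date(2018, 4, 10))
for island in islands:
    print(island)
```

Unlike the ROW_NUMBER difference, the explicit adjacency check (the previous island must end on the current day) also splits a run when fully uncovered days sit between two stretches with the same values.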