I'm trying to build an SCD Type-2 employee-manager relationship table. I have set up the base table:
| emp_id | manager_id | is_emp_self_managed | date_effective | date_expired |
|--------|------------|---------------------|----------------|--------------|
| 2 | | TRUE | 2004-04-01 | 2013-02-01 |
| 2 | 10 | FALSE | 2013-02-01 | 2019-04-01 |
| 5 | 2 | FALSE | 2005-12-01 | 2013-04-11 |
| 10 | | TRUE | 2013-02-01 | 2019-04-01 |
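From this data, I want to add an additional self-referencing column, `is_manager_self_managed`. When I self-join the table, I get the following (the date columns are shown as `daterange` values for illustration):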
| emp_id | is_emp_self_managed | manager_id | is_manager_self_managed | emp_range | man_range |
|--------|---------------------|------------|-------------------------|---------------------------|-------------------------|
| 2 | TRUE | | TRUE | [2004-04-01,2013-02-01) | [2004-04-01,2013-02-01) |
| 2 | FALSE | 10 | TRUE | [2013-02-01,2019-04-01) | [2013-02-01,2019-04-01) |
| 5 | FALSE | 2 | TRUE | *[2005-12-01,2013-04-11)* | [2004-04-01,2013-02-01) |
| 5 | FALSE | 2 | FALSE | *[2005-12-01,2013-04-11)* | [2013-02-01,2019-04-01) |
| 10 | TRUE | | TRUE | [2013-02-01,2019-04-01) | [2013-02-01,2019-04-01) |
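The self-join across date ranges gives `emp_id = 5` an extra row, because `manager_id = 2` switched from self-managed to not self-managed during that employee's effective period. However, I now have to resolve the conflicting date ranges that come back. Ultimately, `emp_id = 5` should still start and end on its own effective date range, but the manager's change has to be merged in so that each output row carries a new, adjusted date range.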
Query that produces the joined output:
with emp_data as (
select *
from (
values(2,'2004-04-01'::date,'2013-02-01'::date,true,null)
,(2,'2013-02-01'::date,'2019-04-01'::date,false,10)
,(5,'2005-12-01'::date,'2013-04-11'::date,false,2)
,(10,'2013-02-01'::date,'2019-04-01'::date,true,null)
)t(emp_id, date_effective, date_expired, is_emp_self_managed, manager_id)
)
select t1.emp_id
,t1.is_emp_self_managed
,t1.manager_id
,t2.is_emp_self_managed as is_manager_self_managed
,daterange(t1.date_effective, t1.date_expired) as emp_range
,daterange(t2.date_effective, t2.date_expired) as man_range
from emp_data t1
left join emp_data t2 on coalesce(t1.manager_id, t1.emp_id) = t2.emp_id
and ((t1.date_effective >= t2.date_effective and t1.date_effective < t2.date_expired)
or (t2.date_effective >= t1.date_effective and t2.date_effective < t1.date_expired))
order by t1.emp_id, t1.date_effective, t2.date_effective
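As a side note, the two-sided comparison in the join condition is an overlap test on half-open ranges, so it could probably be written with PostgreSQL's range overlap operator `&&` instead. The sketch below only checks the operator against the ranges from the sample data. It assumes non-empty ranges with non-NULL bounds; a NULL bound makes `daterange` unbounded, which differs from the plain comparisons.

```sql
-- daterange(a, b) defaults to '[)' bounds (start inclusive, end exclusive),
-- which matches the >= / < comparisons used in the join condition.
select daterange('2005-12-01'::date, '2013-04-11'::date)
         && daterange('2004-04-01'::date, '2013-02-01'::date) as overlaps_first   -- true
      ,daterange('2005-12-01'::date, '2013-04-11'::date)
         && daterange('2013-02-01'::date, '2019-04-01'::date) as overlaps_second  -- true
      ,daterange('2004-04-01'::date, '2013-02-01'::date)
         && daterange('2013-02-01'::date, '2019-04-01'::date) as adjacent_only;   -- false
```

The last column shows why adjacent versions of the same employee do not match each other: the end date is exclusive, so ranges that merely touch do not overlap.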
The ideal output would look like this:
| emp_id | is_emp_self_managed | manager_id | is_manager_self_managed | date_effective | date_expired |
|--------|---------------------|------------|-------------------------|----------------|--------------|
| 2 | TRUE | | TRUE | 2004-04-01 | 2013-02-01 |
| 2 | FALSE | 10 | TRUE | 2013-02-01 | 2019-04-01 |
| 5 | FALSE | 2 | TRUE | *2005-12-01* | *2013-02-01* |
| 5 | FALSE | 2 | FALSE | *2013-02-01* | *2013-04-11* |
| 10 | TRUE | | TRUE | 2013-02-01 | 2019-04-01 |
Answer 0 (score: 0)
I just realized this might work:
with emp_data as (
select *
from (
values(2,'2004-04-01'::date,'2013-02-01'::date,true,null)
,(2,'2013-02-01'::date,'2019-04-01'::date,false,10)
,(5,'2005-12-01'::date,'2013-04-11'::date,false,2)
,(10,'2013-02-01'::date,'2019-04-01'::date,true,null)
)t(emp_id, date_effective, date_expired, is_emp_self_managed, manager_id)
)
select t1.emp_id
,t1.is_emp_self_managed
,t1.manager_id
,t2.is_emp_self_managed as is_manager_self_managed
--Added case statements
,case
when t1.date_effective <@ daterange(t2.date_effective, t2.date_expired)
and not t2.date_effective <@ daterange(t1.date_effective, t1.date_expired)
then t1.date_effective
else t2.date_effective
end as date_effective
,case
when t1.date_expired <@ daterange(t2.date_effective, t2.date_expired)
and not t2.date_expired <@ daterange(t1.date_effective, t1.date_expired)
then t1.date_expired
else t2.date_expired
end as date_expired
,daterange(t1.date_effective, t1.date_expired) as emp_range
,daterange(t2.date_effective, t2.date_expired) as man_range
from emp_data t1
left join emp_data t2 on coalesce(t1.manager_id, t1.emp_id) = t2.emp_id
and ((t1.date_effective >= t2.date_effective and t1.date_effective < t2.date_expired)
or (t2.date_effective >= t1.date_effective and t2.date_effective < t1.date_expired))
order by t1.emp_id, t1.date_effective, t2.date_effective
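A possibly simpler way to get the same result is to take the intersection of the two ranges directly: the later of the two start dates and the earlier of the two end dates. The following is a minimal sketch of that idea, not the approach used above. It swaps the `case` expressions for `greatest`/`least` and the join comparison for the `&&` overlap operator. It relies on PostgreSQL's behavior that `greatest` and `least` ignore NULL arguments, so an employee row with no matching manager row would keep its own dates, whereas the `case` version would return NULL dates there.

```sql
with emp_data as (
    select *
    from (
        values (2,  '2004-04-01'::date, '2013-02-01'::date, true,  null)
              ,(2,  '2013-02-01'::date, '2019-04-01'::date, false, 10)
              ,(5,  '2005-12-01'::date, '2013-04-11'::date, false, 2)
              ,(10, '2013-02-01'::date, '2019-04-01'::date, true,  null)
    ) t(emp_id, date_effective, date_expired, is_emp_self_managed, manager_id)
)
select t1.emp_id
      ,t1.is_emp_self_managed
      ,t1.manager_id
      ,t2.is_emp_self_managed as is_manager_self_managed
      -- start of the intersection: the later of the two start dates
      ,greatest(t1.date_effective, t2.date_effective) as date_effective
      -- end of the intersection: the earlier of the two end dates
      ,least(t1.date_expired, t2.date_expired) as date_expired
from emp_data t1
left join emp_data t2
       on coalesce(t1.manager_id, t1.emp_id) = t2.emp_id
      -- half-open '[)' ranges: versions that only touch do not overlap
      and daterange(t1.date_effective, t1.date_expired)
          && daterange(t2.date_effective, t2.date_expired)
order by t1.emp_id, t1.date_effective, t2.date_effective;
```

On the sample data this should return the five rows of the ideal output, with `emp_id = 5` split at 2013-02-01. Other databases may return NULL from `greatest`/`least` when any argument is NULL, so the unmatched-row behavior is specific to PostgreSQL.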