INPUT:
+--------+---------+------+------+------+
| EMP_ID | DT_1 | ID_1 | ID_2 | ID_3 |
+--------+---------+------+------+------+
| 408 | 4/7/17 | 906 | | 906 |
| 408 | 4/8/17 | 906 | 799 | 906 |
| 408 | 4/9/17 | 906 | 799 | 906 |
| 408 | 4/10/17 | 906 | 600 | 906 |
| 408 | 4/11/17 | 906 | | 906 |
| 408 | 4/12/17 | 906 | | 906 |
| 408 | 4/13/17 | 906 | 799 | 906 |
| 408 | 4/14/17 | 906 | 799 | 906 |
+--------+---------+------+------+------+
我需要输出结果为:
+--------+---------+---------+------+------+------+
| EMP_ID | FROM_DT | TO_DT | ID_1 | ID_2 | ID_3 |
+--------+---------+---------+------+------+------+
| 408 | 4/8/17 | 4/9/17 | 906 | 799 | 906 |
| 408 | 4/10/17 | 4/10/17 | 906 | 600 | 906 |
| 408 | 4/13/17 | 4/14/17 | 906 | 799 | 906 |
+--------+---------+---------+------+------+------+
不匹配(ID1<> ID2或ID1<> ID3)应该具有确切的from和to日期,留下没有不匹配的间隙。
答案 0 :(得分:0)
这似乎是间隙和岛屿的变种:
select emp_id, id_1, id_2, id_3, min(dt_1), max(dt_1)
from (select t.*,
row_number() over (partition by emp_id, id_1, id_2, id_3 order by dt_1) as seqnum
from t
where id_2 is not null
) t
group by emp_id, id_1, id_2, id_3, (dt_1 - seqnum);