I have the following tables:
CREATE TABLE PERSONS (
PERSON_UID NUMBER PRIMARY KEY,
PERSON_NAME VARCHAR2(100)
);
CREATE TABLE SKILLS (
SKILL_UID NUMBER PRIMARY KEY,
SKILL_NAME VARCHAR2(100)
);
CREATE TABLE PERSON_SKILLS (
PERSON_SKILLS_UID NUMBER,
PERSON_FK NUMBER,
SKILL_FK NUMBER,
VALID_START DATE,
VAID_END DATE
);
Table data:

PERSON_UID | PERSON_NAME
---------: | :----------
         1 | P1
         2 | P2
         3 | P3

SKILL_UID | SKILL_NAME
--------: | :---------
        1 | SKILL1
        2 | SKILL2
        3 | SKILL3
        4 | SKILL4
        5 | SKILL5
        6 | SKILL6
        7 | SKILL7
        8 | SKILL8
        9 | SKILL9
       10 | SKILL10

PERSON_SKILLS_UID | PERSON_FK | SKILL_FK | VALID_START | VAID_END
----------------: | --------: | -------: | :---------- | :----------
                1 |         1 |        1 | 01-JAN-1990 | null
                2 |         1 |        2 | 01-JAN-1990 | 25-SEP-2001
                4 |         1 |        6 | 01-JAN-1990 | 01-JAN-2010
                5 |         1 |        7 | 01-JAN-1990 | null
                3 |         1 |        3 | 01-JUL-1990 | null
                6 |         1 |        9 | 31-DEC-2018 | null
                7 |         2 |        2 | 01-JAN-1990 | null
                9 |         2 |        8 | 01-JAN-1990 | 01-JAN-2001
                8 |         2 |        3 | 01-JAN-1995 | 20-OCT-1998
               10 |         3 |        9 | 01-JAN-1990 | null
               11 |         3 |        4 | 01-JAN-1990 | null
               12 |         3 |        5 | 01-JAN-1991 | null
               13 |         3 |        7 | 01-JAN-2005 | null

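The PERSON_SKILLS table holds each person's skills together with a valid start date and a valid end date. (A NULL end date means the skill is currently active.)
I want to build date intervals from the start/end dates and, for each interval, list all of the person's skills that apply to it, separated by commas.
Take the second person as an example (I need the output for all persons in a single query):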

PERSON_NAME | VALID_START | VALID_END   | SKILLS_OF_EMP
:---------- | :---------- | :---------- | :---------------------
P2          | 01-JAN-1990 | 31-DEC-1994 | SKILL2, SKILL8
P2          | 01-JAN-1995 | 20-OCT-1998 | SKILL2, SKILL3, SKILL8
P2          | 21-OCT-1998 | 01-JAN-2001 | SKILL2, SKILL8
P2          | 02-JAN-2001 | 31-DEC-4712 | SKILL2

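I have created a db<>fiddle with all the table DDL, the data, and the expected output.
I'm looking for a query that performs well, because I have about 18,000 persons with 15-16 skills each on average.
Note: 31-DEC-4712 is the end of time.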
Answer 0: (score: 1)
with ranges as (
  -- every distinct boundary date per person, paired with the day before the next one
  select per,
         dt d1,
         nvl(lead(dt) over (partition by per order by dt) - 1, date '4712-12-31') d2
    from (select person_fk per, valid_start dt from person_skills
          union  -- not union all: duplicate dates must be removed
          select person_fk, vaid_end from person_skills)
   where dt is not null)
select per, d1, d2,
       listagg(skill_name, ', ') within group (order by d1) list
  from person_skills ps
  -- keep the skills whose validity overlaps the range d1..d2
  join ranges r on (d1 < vaid_end or vaid_end is null)
               and valid_start <= d2
               and ps.person_fk = per
  join persons p on per = p.person_uid
  join skills s on s.skill_uid = ps.skill_fk
 where d1 is not null
 group by per, d1, d2
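The main problem is building the time ranges for each person. I combined the start and end dates of each person with union (not union all, because we need distinct values), then ordered those dates with lead() to build the periods.
A table prepared this way can be joined with your data in the usual manner and aggregated, and listagg() finishes the job.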
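The steps of this query can be traced outside the database. The following is only a minimal Python sketch of the same logic, run on the sample rows for person 2; `skill_ranges`, `END_OF_TIME`, and `rows` are hypothetical names introduced for illustration, and the skills are sorted by name here rather than by `d1` as in the query.

```python
from datetime import date, timedelta

END_OF_TIME = date(4712, 12, 31)  # sentinel date taken from the question

# (person_fk, skill_name, valid_start, vaid_end) for person 2, from the sample data
rows = [
    (2, "SKILL2", date(1990, 1, 1), None),
    (2, "SKILL8", date(1990, 1, 1), date(2001, 1, 1)),
    (2, "SKILL3", date(1995, 1, 1), date(1998, 10, 20)),
]

def skill_ranges(rows):
    """Hypothetical helper mirroring the query: build ranges, join, aggregate."""
    # UNION of start and end dates per person: NULLs dropped, duplicates removed
    boundaries = {}
    for per, _, start, end in rows:
        boundaries.setdefault(per, set()).update(
            d for d in (start, end) if d is not None
        )

    result = []
    for per, dates in sorted(boundaries.items()):
        ordered = sorted(dates)
        for i, d1 in enumerate(ordered):
            # lead(dt) - 1, falling back to the end-of-time sentinel
            if i + 1 < len(ordered):
                d2 = ordered[i + 1] - timedelta(days=1)
            else:
                d2 = END_OF_TIME
            # same overlap test as the join condition in the query
            skills = sorted(
                name
                for p, name, start, end in rows
                if p == per and (end is None or d1 < end) and start <= d2
            )
            if skills:  # an inner join drops ranges without any matching skill
                result.append((per, d1, d2, ", ".join(skills)))
    return result

for row in skill_ranges(rows):
    print(row)
```

Each range ends on the day before the next boundary date, so for person 2 the second range ends on 19-OCT-1998 rather than on the 20-OCT-1998 shown in the question's expected output.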
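Answer 1: (score: 1)
Use UNPIVOT INCLUDE NULLS to put the start and the end of each date range on separate rows, then use the LEAD analytic function to find consecutive boundary dates for each person. After that you can join back to the main table and aggregate.
Query: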
SELECT p.person_name,
       r.range_start AS valid_start,
       r.range_end   AS valid_end,
       LISTAGG( s.skill_name, ',' ) WITHIN GROUP ( ORDER BY s.skill_name ) AS skills_of_emp
FROM   (
         -- pair each boundary date with the next one for the same person
         SELECT person_fk,
                date_time AS range_start,
                LEAD( date_time ) OVER ( PARTITION BY person_fk ORDER BY date_time )
                  AS range_end
         FROM   (
                  -- one row per start date and per end date; a NULL end becomes the end of time
                  -- (the end-date column is spelled VAID_END in the question's DDL)
                  SELECT DISTINCT
                         person_fk,
                         COALESCE( date_time, DATE '4712-12-31' ) AS date_time
                  FROM   person_skills
                  UNPIVOT INCLUDE NULLS ( date_time FOR value IN ( valid_start AS 1, vaid_end AS -1 ) )
                )
       ) r
       INNER JOIN person_skills ps
       ON (    ps.valid_start <= r.range_start
           AND r.range_end <= COALESCE( ps.vaid_end, DATE '4712-12-31' )
           AND ps.person_fk = r.person_fk )
       INNER JOIN skills s
       ON ( ps.skill_fk = s.skill_uid )
       -- the question's table is named PERSONS
       INNER JOIN persons p
       ON ( ps.person_fk = p.person_uid )
GROUP BY r.person_fk,
         p.person_name,
         r.range_start,
         r.range_end;
Output:

PERSON_NAME | VALID_START | VALID_END  | SKILLS_OF_EMP
:---------- | :---------- | :--------- | :---------------------------------
P1          | 1990-01-01  | 1990-07-01 | SKILL1,SKILL2,SKILL6,SKILL7
P1          | 1990-07-01  | 2001-09-25 | SKILL1,SKILL2,SKILL3,SKILL6,SKILL7
P1          | 2001-09-25  | 2010-01-01 | SKILL1,SKILL3,SKILL6,SKILL7
P1          | 2010-01-01  | 2018-12-31 | SKILL1,SKILL3,SKILL7
P1          | 2018-12-31  | 4712-12-31 | SKILL1,SKILL3,SKILL7,SKILL9
P2          | 1990-01-01  | 1995-01-01 | SKILL2,SKILL8
P2          | 1995-01-01  | 1998-10-20 | SKILL2,SKILL3,SKILL8
P2          | 1998-10-20  | 2001-01-01 | SKILL2,SKILL8
P2          | 2001-01-01  | 4712-12-31 | SKILL2
P3          | 1990-01-01  | 1991-01-01 | SKILL4,SKILL9
P3          | 1991-01-01  | 2005-01-01 | SKILL4,SKILL5,SKILL9
P3          | 2005-01-01  | 4712-12-31 | SKILL4,SKILL5,SKILL7,SKILL9

db<>fiddle here
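To see what the innermost subqueries of this answer produce before the join, here is a rough Python sketch of the unpivot, DISTINCT, and LEAD steps on the sample rows for person 2. It is an illustration under assumptions, not the answer's code: `boundary_pairs`, `END_OF_TIME`, and `rows` are hypothetical names.

```python
from datetime import date

END_OF_TIME = date(4712, 12, 31)  # sentinel date taken from the question

# (person_fk, valid_start, vaid_end) for person 2, from the sample data
rows = [
    (2, date(1990, 1, 1), None),
    (2, date(1990, 1, 1), date(2001, 1, 1)),
    (2, date(1995, 1, 1), date(1998, 10, 20)),
]

def boundary_pairs(rows):
    """Hypothetical helper: unpivot, de-duplicate, pair each date with the next."""
    # UNPIVOT INCLUDE NULLS gives one (person, date) row per start and per end;
    # COALESCE maps a NULL end to the sentinel; the set plays the role of DISTINCT
    unpivoted = {
        (per, d if d is not None else END_OF_TIME)
        for per, start, end in rows
        for d in (start, end)
    }
    ordered = sorted(unpivoted)  # by person, then by date
    pairs = []
    for i, (per, d) in enumerate(ordered):
        # LEAD over (partition by person order by date): next date of the same person
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        range_end = nxt[1] if nxt is not None and nxt[0] == per else None
        pairs.append((per, d, range_end))
    return pairs

for pair in boundary_pairs(rows):
    print(pair)
```

The last pair for each person has no following date, so its `range_end` is empty. In the query that row disappears at the join, because a comparison against NULL is never true.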