Normalizing data from a denormalized table

Time: 2016-08-05 16:59:52

Tags: sql sql-server

I have the following data in my table:

RepID|Role|Status|StartDate |EndDate   |
-----|----|------|----------|----------|
10001|R1  |Active|01/01/2015|01/31/2015|
10001|R1  |Leave |02/01/2015|02/12/2015|
10001|R1  |Active|02/13/2015|02/28/2015|
10001|R2  |Active|03/01/2015|03/18/2015|
10001|R2  |Leave |03/19/2015|04/10/2015|
10001|R2  |Active|04/11/2015|05/10/2015|
10001|R1  |Active|05/11/2015|06/13/2015|
10001|R1  |Leave |06/14/2015|12/31/9998|

I am looking for output like this:

RepID|Role|StartDate |EndDate   |
-----|----|----------|----------|
10001|R1  |01/01/2015|02/28/2015|
10001|R2  |03/01/2015|05/10/2015|
10001|R1  |05/11/2015|12/31/9998|

I need to capture the StartDate and EndDate only when the role changes, ignoring changes in Status. I tried different approaches but could not get this output.

Any help is appreciated.

I also tried some SQL, but it did not help.

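2 answers:

Answer 0 (score: 2)

The key here is to identify consecutive rows up to the point where the role changes. This can be done by using the lead function to compare each row's role with the next row's role, plus some extra logic that assigns all of the preceding rows to the same group.

Once the rows are grouped, you only need min and max to get the start and end dates of each group.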

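with groups as (
    select x.*
          -- running count of group-end rows; a group-end row has already counted
          -- itself, so add 1 only on the other rows to keep the id equal within a group
          ,case when grp = 1 then 0 else 1 end
             + sum(grp) over(partition by repid order by startdate) grps
    from (select t.*
                -- grp = 1 on the last row before the role changes (or the last row overall)
                ,case when lead(role) over(partition by repid order by startdate) = role
                      then 0 else 1 end grp
          from t) x
)
select distinct repid, role
      ,min(startdate) over(partition by repid, grps) startdt
      ,max(enddate)   over(partition by repid, grps) enddt
from groups
order by 1, 3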
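To see what the two helper columns look like, here is a minimal sketch that runs the inner part of the query above against the sample rows from the question. It uses Python's sqlite3 module purely as a scratch harness (an assumption; the question is about SQL Server, and SQLite 3.25 or later is needed for window functions). The dates are rewritten in ISO format, also an assumption, so that text ordering matches date ordering.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO text (an assumption)
# so that plain text ordering in SQLite matches chronological order.
ROWS = [
    (10001, "R1", "Active", "2015-01-01", "2015-01-31"),
    (10001, "R1", "Leave",  "2015-02-01", "2015-02-12"),
    (10001, "R1", "Active", "2015-02-13", "2015-02-28"),
    (10001, "R2", "Active", "2015-03-01", "2015-03-18"),
    (10001, "R2", "Leave",  "2015-03-19", "2015-04-10"),
    (10001, "R2", "Active", "2015-04-11", "2015-05-10"),
    (10001, "R1", "Active", "2015-05-11", "2015-06-13"),
    (10001, "R1", "Leave",  "2015-06-14", "9998-12-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (repid integer, role text, status text,"
    " startdate text, enddate text)"
)
conn.executemany("insert into t values (?, ?, ?, ?, ?)", ROWS)

# Same logic as the CTE in the answer, exposing grp and grps for each row.
MARKERS_SQL = """
select x.startdate, x.role, x.grp,
       case when x.grp = 1 then 0 else 1 end
         + sum(x.grp) over (partition by x.repid order by x.startdate) as grps
from (select t.*,
             case when lead(role) over (partition by repid order by startdate) = role
                  then 0 else 1 end as grp
      from t) x
order by x.startdate
"""

markers = conn.execute(MARKERS_SQL).fetchall()
for startdate, role, grp, grps in markers:
    print(startdate, role, grp, grps)
```

On the sample data, grp is 1 only on the third, sixth and eighth rows (the last row of each run of identical roles), and grps comes out as 1, 1, 1, 2, 2, 2, 3, 3, so the second R1 period gets its own group id instead of being merged with the first.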


Answer 1 (score: 0)

Do you just want the minimum (start) and maximum (end) date for each repID and role? If so, try:

Select
  repID, role,
  min(startDate),
  max(endDate)
from
  tbl
group by 
  repID, role
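For comparison with the desired output in the question, here is a minimal sketch of what a plain GROUP BY on repID and role returns for the sample rows. As before, the Python sqlite3 harness and the ISO date strings are assumptions made only to keep the example self-contained; the table name tbl follows the query above.

```python
import sqlite3

# Sample rows from the question (status omitted; dates as ISO text, an assumption).
ROWS = [
    (10001, "R1", "2015-01-01", "2015-01-31"),
    (10001, "R1", "2015-02-01", "2015-02-12"),
    (10001, "R1", "2015-02-13", "2015-02-28"),
    (10001, "R2", "2015-03-01", "2015-03-18"),
    (10001, "R2", "2015-03-19", "2015-04-10"),
    (10001, "R2", "2015-04-11", "2015-05-10"),
    (10001, "R1", "2015-05-11", "2015-06-13"),
    (10001, "R1", "2015-06-14", "9998-12-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tbl (repID integer, role text, startDate text, endDate text)"
)
conn.executemany("insert into tbl values (?, ?, ?, ?)", ROWS)

# One row per (repID, role), regardless of whether the periods are contiguous.
plain_groups = conn.execute(
    """
    select repID, role, min(startDate), max(endDate)
    from tbl
    group by repID, role
    order by role
    """
).fetchall()

for row in plain_groups:
    print(row)
```

This yields two rows rather than the three asked for: the two separate R1 periods collapse into a single row spanning 2015-01-01 to 9998-12-31, which overlaps the R2 period. That is why the role runs have to be identified first.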

A more verbose solution, equivalent to VKP's:

SELECT
    repid, ROLE, grpID,
    MIN(startdate) AS min_startDateOverRole,
    MAX(endDate) AS max_endDateOverRole
FROM
    (SELECT
        *, CASE WHEN isGrpEnd = 1 THEN 0 ELSE 1 END +
        -- when on group end row, don't increment grpID.
        -- Wait until start of next group
        SUM(isGrpEnd) OVER(PARTITION BY repid ORDER BY startdate) grpID
        -- sum(all group end rows up to this one)
     FROM
            (SELECT
                *,
               CASE WHEN LEAD(ROLE) OVER(PARTITION BY repid ORDER BY startdate) = ROLE
                         THEN 0 ELSE 1 END isGrpEnd
             FROM t) x  ) y  -- SQL Server requires an alias on every derived table
GROUP BY
  repid, ROLE, grpid
ORDER BY
  1,3
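Both answers mark the end of each run with lead. A different formulation that is often used for this kind of "gaps and islands" problem, and is not the one given in either answer, subtracts two row_number() sequences: one over all rows of a rep and one restricted to each role. Within a run of the same role the difference stays constant. Below is a minimal sketch under the same assumptions as before (Python sqlite3 as a scratch harness, ISO date strings); the column name island is made up for illustration.

```python
import sqlite3

# Sample rows from the question (status omitted; dates as ISO text, an assumption).
ROWS = [
    (10001, "R1", "2015-01-01", "2015-01-31"),
    (10001, "R1", "2015-02-01", "2015-02-12"),
    (10001, "R1", "2015-02-13", "2015-02-28"),
    (10001, "R2", "2015-03-01", "2015-03-18"),
    (10001, "R2", "2015-03-19", "2015-04-10"),
    (10001, "R2", "2015-04-11", "2015-05-10"),
    (10001, "R1", "2015-05-11", "2015-06-13"),
    (10001, "R1", "2015-06-14", "9998-12-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (repid integer, role text, startdate text, enddate text)"
)
conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)

# island is constant inside each run of consecutive rows with the same role;
# (role, island) together identify one run.
ISLANDS_SQL = """
select repid, role, min(startdate) as start_dt, max(enddate) as end_dt
from (select t.*,
             row_number() over (partition by repid order by startdate)
           - row_number() over (partition by repid, role order by startdate) as island
      from t) x
group by repid, role, island
order by start_dt
"""

periods = conn.execute(ISLANDS_SQL).fetchall()
for row in periods:
    print(row)
```

On the sample rows this produces the three periods from the question: R1 from 2015-01-01 to 2015-02-28, R2 from 2015-03-01 to 2015-05-10, and R1 from 2015-05-11 to 9998-12-31. Note that island alone is not unique (the R2 run and the second R1 run both get 3 here), so role has to stay in the GROUP BY.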