Pair transactions into date range rows

Time: 2018-12-19 15:47:26

Tags: sql sql-server tsql sql-server-2014

I have a table like the one below, which shows when an employee was added to (Operation = I) or removed from (Operation = D) an account in a particular role:

Account | Employee | Role | Operation | OperationTimestamp
ABC     | 1        | Rep  | I         | 1/1/2018
DEF     | 1        | Mgr  | I         | 1/1/2018
ABC     | 1        | Rep  | D         | 3/31/2018
ABC     | 1        | Rep  | I         | 7/1/2018
ABC     | 1        | Rep  | D         | 12/31/2018
ABC     | 2        | Mgr  | I         | 1/1/2018
DEF     | 2        | Exc  | I         | 1/1/2018
ABC     | 2        | Mgr  | D         | 3/31/2018
ABC     | 2        | Mgr  | I         | 6/1/2018
ABC     | 2        | Mgr  | D         | 10/31/2018

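(I = insert, D = delete)

I need to develop a query that returns the account, employee, role, and the date range during which the employee was on that account, like this: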

Account | Employee | Role | StartingDate | EndingDate
ABC     | 1        | Rep  | 1/1/2018     | 3/31/2018
DEF     | 1        | Mgr  | 1/1/2018     | NULL
ABC     | 1        | Rep  | 7/1/2018     | 12/31/2018
ABC     | 2        | Mgr  | 1/1/2018     | 3/31/2018
DEF     | 2        | Exc  | 1/1/2018     | NULL
ABC     | 2        | Mgr  | 6/1/2018     | 10/31/2018

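As the result set shows, if an employee has been added to an account but not yet removed, EndingDate should be NULL.

What I'm struggling with is that the same employee can be added to and/or removed from the same account multiple times, and/or can hold multiple roles. My gut says I need to sort the transactions by account > employee > role > date and somehow combine every 2 rows (since it should always be an I operation followed by a D operation), but I'm not sure how to handle the "missing" delete when they are still on the account.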

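3 answers:

Answer 0: (score: 1)

Assumption: for the same combination (account, employee, role), an I operation is never followed by another I; if a next row exists for that combination (there may not be one), it is always a D.

Data: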

create table my_table (
  Account varchar(3), 
  Employee int, 
  role varchar(3),
  Operation varchar(1),
  OperationTimestamp datetime
);

insert into my_table values
 ('ABC',1,'Rep','I','20180101')
,('DEF',1,'Mgr','I','20180101')
,('ABC',1,'Rep','D','20180331')
,('ABC',1,'Rep','I','20180701')
,('ABC',1,'Rep','D','20181231')
,('ABC',2,'Mgr','I','20180101')
,('DEF',2,'Exc','I','20180101')
,('ABC',2,'Mgr','D','20180331')
,('ABC',2,'Mgr','I','20180601')
,('ABC',2,'Mgr','D','20181031');
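
The assumption above can be checked before relying on it. The following is a minimal sketch, not part of the original answer: it uses lag() against the my_table created above to list any row whose operation repeats the previous one in its (account, employee, role) group, or any group that starts with a D. The names ordered and prev_op are made up for illustration.

```sql
-- Sketch: list rows that break the assumed I/D alternation.
-- Assumes my_table as created and populated above.
with ordered as (
  select
    account, employee, role, operation, operationtimestamp,
    -- operation of the previous row in the same (account, employee, role) group
    lag(operation)
      over(partition by account, employee, role
           order by operationtimestamp)
      as prev_op
  from my_table
)
select account, employee, role, operation, operationtimestamp, prev_op
from ordered
where operation = prev_op                     -- I after I, or D after D
   or (prev_op is null and operation = 'D')   -- group starts with a D
order by account, employee, role, operationtimestamp;
```

For the sample data this returns no rows, so the assumption holds there. If an I and a D in the same group share a timestamp, their relative order is not deterministic, and a tie-breaking column would be needed.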

If the above is correct, then I would use the following query:

with
x as (
  -- for each row, look ahead to the next row of the same
  -- (account, employee, role) group, ordered by timestamp
  select
    account, employee, role, operationtimestamp, operation,
    lead(operation)
      over(partition by account, employee, role
           order by operationtimestamp)
      as next_op,
    lead(operationtimestamp)
      over(partition by account, employee, role
           order by operationtimestamp)
      as next_ts
  from my_table
),
y as (
  -- keep only the inserts; under the assumption above, the next row
  -- (if any) is the matching delete, so its timestamp ends the range
  select
    account, employee, role,
    operationtimestamp as startingdate,
    next_ts as endingdate
  from x
  where operation = 'I'
)
select *
from y
order by employee, startingdate;

Result:

account  employee  role  startingdate           endingdate           
-------  --------  ----  ---------------------  ---------------------
ABC      1         Rep   2018-01-01 00:00:00.0  2018-03-31 00:00:00.0
DEF      1         Mgr   2018-01-01 00:00:00.0  <null>               
ABC      1         Rep   2018-07-01 00:00:00.0  2018-12-31 00:00:00.0
ABC      2         Mgr   2018-01-01 00:00:00.0  2018-03-31 00:00:00.0
DEF      2         Exc   2018-01-01 00:00:00.0  <null>               
ABC      2         Mgr   2018-06-01 00:00:00.0  2018-10-31 00:00:00.0
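
If the assumption cannot be guaranteed, the next_op column gives a way to be more defensive. The variant below is a sketch under that scenario, not the answerer's query: it only takes the next row's timestamp as the end of the range when that row is actually a D, and otherwise leaves endingdate NULL.

```sql
-- Sketch: same idea as the query above, but an insert is only closed
-- by the next row when that row is a delete. Assumes my_table from above.
with x as (
  select
    account, employee, role, operation, operationtimestamp,
    lead(operation)
      over(partition by account, employee, role
           order by operationtimestamp) as next_op,
    lead(operationtimestamp)
      over(partition by account, employee, role
           order by operationtimestamp) as next_ts
  from my_table
)
select
  account, employee, role,
  operationtimestamp as startingdate,
  -- NULL when there is no next row, or when the next row is another I
  case when next_op = 'D' then next_ts end as endingdate
from x
where operation = 'I'
order by employee, startingdate;
```

Whether an I followed directly by another I should be reported as an open range is a design choice; this sketch simply avoids using a non-delete row as an end date.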

Answer 1: (score: 1)

With a row_number and a self join, this is fairly simple:

declare @t table(Account varchar(3), Employee int, EmpRole varchar(3), Operation varchar(1), OperationTimestamp datetime);
insert into @t values
 ('ABC',1,'Rep','I','20180101')
,('DEF',1,'Mgr','I','20180101')
,('ABC',1,'Rep','D','20180331')
,('ABC',1,'Rep','I','20180701')
,('ABC',1,'Rep','D','20181231')
,('ABC',2,'Mgr','I','20180101')
,('DEF',2,'Exc','I','20180101')
,('ABC',2,'Mgr','D','20180331')
,('ABC',2,'Mgr','I','20180601')
,('ABC',2,'Mgr','D','20181031');

with d as
(
    select Account
            ,Employee
            ,EmpRole
            ,Operation
            ,OperationTimestamp
            ,row_number() over (partition by Account, Employee, EmpRole order by OperationTimestamp) as ord
    from @t
)
select s.Account
    ,s.Employee
    ,s.EmpRole
    ,s.OperationTimestamp as OperationTimestampStart
    ,e.OperationTimestamp as OperationTimestampEnd
from d as s
    left join d as e
        on s.Account = e.Account
            and s.Employee = e.Employee
            and s.EmpRole = e.EmpRole
            and s.ord = e.ord-1
where s.Operation = 'I';

Output

+---------+----------+---------+-------------------------+-----------------------+
| Account | Employee | EmpRole | OperationTimestampStart | OperationTimestampEnd |
+---------+----------+---------+-------------------------+-----------------------+
| ABC     |        1 | Rep     | 2018-01-01              | 2018-03-31            |
| ABC     |        1 | Rep     | 2018-07-01              | 2018-12-31            |
| ABC     |        2 | Mgr     | 2018-01-01              | 2018-03-31            |
| ABC     |        2 | Mgr     | 2018-06-01              | 2018-10-31            |
| DEF     |        1 | Mgr     | 2018-01-01              | NULL                  |
| DEF     |        2 | Exc     | 2018-01-01              | NULL                  |
+---------+----------+---------+-------------------------+-----------------------+

Answer 2: (score: 0)

I think you just need lead() or a cumulative min(). Here is what I mean:

select account, employee, role, OperationTimestamp, EndingDate
from (select t.*,
             min(case when operation = 'D' then OperationTimestamp end) over
                 (partition by account, employee, role
                  order by OperationTimestamp desc
                 ) as EndingDate
      from t
     ) t
where operation = 'I';
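
The query above reads from a table named t that the answer does not define. The following sketch makes it runnable on its own by building t inline from the question's sample rows. The inline CTE, the StartingDate alias, and the final order by are additions for illustration. Because the window is ordered by OperationTimestamp desc, each row's running min() covers only rows at or after its own timestamp, so for an I row it yields the earliest later D, or NULL if there is none.

```sql
-- Sketch: cumulative min() over the question's sample data, built inline.
with t (account, employee, role, operation, OperationTimestamp) as (
  select v.account, v.employee, v.role, v.operation,
         cast(v.ts as datetime)
  from (values
     ('ABC',1,'Rep','I','20180101')
    ,('DEF',1,'Mgr','I','20180101')
    ,('ABC',1,'Rep','D','20180331')
    ,('ABC',1,'Rep','I','20180701')
    ,('ABC',1,'Rep','D','20181231')
    ,('ABC',2,'Mgr','I','20180101')
    ,('DEF',2,'Exc','I','20180101')
    ,('ABC',2,'Mgr','D','20180331')
    ,('ABC',2,'Mgr','I','20180601')
    ,('ABC',2,'Mgr','D','20181031')
  ) as v (account, employee, role, operation, ts)
)
select account, employee, role,
       OperationTimestamp as StartingDate,
       EndingDate
from (select t.*,
             -- earliest D timestamp among this row and all later rows
             min(case when operation = 'D' then OperationTimestamp end) over
                 (partition by account, employee, role
                  order by OperationTimestamp desc
                 ) as EndingDate
      from t
     ) t
where operation = 'I'
order by employee, StartingDate;
```

On the sample data this should give the same six ranges as the expected result in the question. Unlike the lead() approach, two consecutive I rows with no D between them would both receive the same later D as their EndingDate.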