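I have a table of transactions, each with a start date and an end date. The problem is that these date ranges often overlap one another, and I want to group the overlapping transactions together.
For example, in the scenario below, transaction #1 is the "root" transaction, and #2-#4 overlap with #1 and/or with each other. Transaction #5, however, does not overlap with anything, so it is a new "root" transaction.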
+----------------+-----------+-----------+----------------------------------+
| Transaction ID | StartDate | EndDate   | Notes                            |
+----------------+-----------+-----------+----------------------------------+
|              1 | 1/1/2017  | 1/3/2017  | root transaction                 |
|              2 | 1/2/2017  | 1/6/2017  | overlaps with #1                 |
|              3 | 1/5/2017  | 1/10/2017 | overlaps with #2                 |
|              4 | 1/3/2017  | 1/13/2017 | overlaps with #2 and #3          |
|              5 | 1/15/2017 | 1/20/2017 | no overlap, new root transaction |
+----------------+-----------+-----------+----------------------------------+
Here is what I would like the output to look like:
+----------------+-----------+-----------+------------------+------+
| Transaction ID | Start     | End       | Root Transaction | Rank |
+----------------+-----------+-----------+------------------+------+
|              1 | 1/1/2017  | 1/3/2017  |                1 |    1 |
|              2 | 1/2/2017  | 1/6/2017  |                1 |    2 |
|              3 | 1/5/2017  | 1/10/2017 |                1 |    3 |
|              4 | 1/3/2017  | 1/13/2017 |                1 |    4 |
|              5 | 1/15/2017 | 1/20/2017 |                5 |    1 |
+----------------+-----------+-----------+------------------+------+
How can I solve this in SQL?
Answer 0 (score: 3)
Here is one approach using OUTER APPLY:
-- Sample data from the question
Declare @YourTable table ([Transaction ID] int, StartDate date, EndDate date)
Insert Into @YourTable values
 (1,'1/1/2017','1/3/2017'),
 (2,'1/2/2017','1/6/2017'),
 (3,'1/5/2017','1/10/2017'),
 (4,'1/3/2017','1/13/2017'),
 (5,'1/15/2017','1/20/2017')

Select [Transaction ID]
      ,[Start] = StartDate
      ,[End]   = EndDate
      ,[Root Transaction] = Grp
      ,[Rank]  = Row_Number() over (Partition By Grp Order by [Transaction ID])
 From (
        -- Running max of Flag*ID carries the most recent root ID forward
        Select A.*
              ,Grp = max(Flag*[Transaction ID]) over (Order By [Transaction ID])
         From (
                -- Flag = 1 starts a new group; 0 means an earlier transaction overlaps
                Select A.*, Flag = IsNull(B.Flg,1)
                 From @YourTable A
                 Outer Apply (
                    Select Top 1 Flg=0
                     From @YourTable
                     -- Standard overlap test; unlike two BETWEEN checks, it also
                     -- catches an earlier range that fully contains this one
                     Where StartDate <= A.EndDate
                       and EndDate   >= A.StartDate
                       and [Transaction ID] < A.[Transaction ID]
                 ) B
              ) A
      ) A
For the sample data, this returns the rows shown in the desired output in the question.
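EDIT - Some commentary
In the OUTER APPLY, Flag is set to 1 or 0. 1 indicates a new group; 0 indicates that the record overlaps an existing range.
Then, in the next query "up", we use a window function to apply the Grp code (Flag * Trans ID). Remember that a new group is 1 and an existing one is 0, so the product is the transaction ID for a new group and 0 otherwise. The window function takes the running maximum of this product as it walks through the transactions.
The final query simply applies the Rank using a window function partitioned by Grp and ordered by Trans ID.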
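If it helps with the visualization:
The first subquery (the OUTER APPLY) generates Flag = 1, 0, 0, 0, 1 for transactions 1 through 5 of the sample data.
The second subquery generates Grp = 1, 1, 1, 1, 5 for the same rows.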
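To trace the three steps outside the database, here is a minimal plain-Python sketch of the same logic. It is not part of the answer: the function name `trace_groups` is hypothetical, and the overlap test is the standard interval check (an assumption; it gives the same flags as the answer's query on the sample data).

```python
from datetime import date

# Sample rows from the question: (transaction_id, start_date, end_date).
rows = [
    (1, date(2017, 1, 1), date(2017, 1, 3)),
    (2, date(2017, 1, 2), date(2017, 1, 6)),
    (3, date(2017, 1, 5), date(2017, 1, 10)),
    (4, date(2017, 1, 3), date(2017, 1, 13)),
    (5, date(2017, 1, 15), date(2017, 1, 20)),
]


def trace_groups(rows):
    """Return (id, flag, grp, rank) tuples, walking rows in transaction-ID order."""
    result = []
    running_max = 0
    rank_in_group = {}
    for tid, start, end in sorted(rows):
        # Step 1 (the OUTER APPLY): Flag is 0 when any lower-ID row overlaps
        # this range, otherwise 1. Assumed standard interval-overlap test.
        overlaps_earlier = any(
            other_id < tid and o_start <= end and o_end >= start
            for other_id, o_start, o_end in rows
        )
        flag = 0 if overlaps_earlier else 1
        # Step 2: Grp is the running maximum of Flag * id.
        running_max = max(running_max, flag * tid)
        # Step 3: Rank counts rows within each Grp, in ID order.
        rank_in_group[running_max] = rank_in_group.get(running_max, 0) + 1
        result.append((tid, flag, running_max, rank_in_group[running_max]))
    return result


for row in trace_groups(rows):
    print(row)
```

Each printed tuple is (Transaction ID, Flag, Grp, Rank), so the Flag and Grp columns can be compared against the two subqueries directly.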
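Answer 1 (score: 1)
This is an example of "gaps and islands". For your data, you can identify the "islands" by determining where each one starts, that is, where a record does not overlap with any previous record. Then you can use row_number() to get the rank.
So, here is one method: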
select t.*,
       min(transactionId) over (partition by island) as start,  -- root transaction of the island
       row_number() over (partition by island order by endDate) as rnk
from (select t.*,
             -- running count of island starts, in start-date order
             sum(startIslandFlag) over (order by startDate) as island
      from (select t.*,
                   -- 1 when no earlier-starting record reaches this start date
                   (case when not exists (select 1
                                          from t t2
                                          where t2.startdate < t.startdate and
                                                t2.enddate >= t.startdate
                                         )
                         then 1 else 0
                    end) as startIslandFlag
            from t
           ) t
     ) t;
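The flag-then-running-sum idea in this query can be traced with a small plain-Python sketch. This is an illustration under assumptions, not the answer's code: `label_islands` is a hypothetical name, and the running sum is written so that rows sharing a start date get the same total, which is how a window with ORDER BY and no explicit frame behaves by default (a RANGE frame that includes peer rows).

```python
from datetime import date

# Sample rows from the question: (transaction_id, start_date, end_date).
rows = [
    (1, date(2017, 1, 1), date(2017, 1, 3)),
    (2, date(2017, 1, 2), date(2017, 1, 6)),
    (3, date(2017, 1, 5), date(2017, 1, 10)),
    (4, date(2017, 1, 3), date(2017, 1, 13)),
    (5, date(2017, 1, 15), date(2017, 1, 20)),
]


def label_islands(rows):
    """Return {transaction_id: (root_id, rank)} using the islands approach."""
    # startIslandFlag: 1 when no row starts earlier and ends on or after this start.
    flags = {}
    for tid, start, _ in rows:
        reached = any(o_start < start and o_end >= start for _, o_start, o_end in rows)
        flags[tid] = 0 if reached else 1

    # island: sum of flags over all rows whose start date is <= this row's start date.
    island_of = {
        tid: sum(flags[o_id] for o_id, o_start, _ in rows if o_start <= start)
        for tid, start, _ in rows
    }

    # Collect members per island, then take min(id) as root and rank by end date.
    members = {}
    for tid, _, end in rows:
        members.setdefault(island_of[tid], []).append((end, tid))

    labels = {}
    for items in members.values():
        root = min(tid for _, tid in items)
        for rank, (_, tid) in enumerate(sorted(items), start=1):
            labels[tid] = (root, rank)
    return labels


print(label_islands(rows))
```

On the sample data, transactions 1 through 4 land in one island rooted at 1 and transaction 5 forms its own island.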
Note: [...] range window).
Answer 2 (score: 1)
Identify the root transactions:
with roots as (
    select *
    from [tran] as t1   -- TRAN is a reserved keyword in T-SQL, so the table name is bracketed
    where not exists (
        select 1
        from [tran] as t2
        where t2.Transaction_ID < t1.Transaction_ID
          -- standard overlap test; also catches t1 fully containing t2
          and t1.StartDate <= t2.EndDate
          and t1.EndDate >= t2.StartDate
    )
)
Then pair each root with the next root and assign the transactions that fall between them (this select continues the statement started by the CTE above):
select t.Transaction_ID,
       t.StartDate as [Start],
       t.EndDate as [End],
       r1.Transaction_ID as Root_Transaction,
       row_number() over (partition by r1.Transaction_ID order by t.EndDate) as [Rank]
from roots as r1
left join roots as r2   -- left join so the last root, which has no successor, is kept
    on r2.Transaction_ID > r1.Transaction_ID
inner join [tran] as t
    on t.Transaction_ID >= r1.Transaction_ID
   and (r2.Transaction_ID is null or t.Transaction_ID < r2.Transaction_ID)
where not exists ( -- this "not exists" makes sure r1 and r2 are consecutive roots
    select 1
    from roots as r3
    where r3.Transaction_ID > r1.Transaction_ID
      and r3.Transaction_ID < r2.Transaction_ID
)
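The "between consecutive roots" assignment can be sketched in plain Python as follows. This is only an illustration: `assign_to_roots` is a hypothetical name, the overlap test is the standard interval check (an assumption), and the last root's bucket is treated as open-ended so that its transactions are not dropped.

```python
from bisect import bisect_right
from datetime import date

# Sample rows from the question: (transaction_id, start_date, end_date).
rows = [
    (1, date(2017, 1, 1), date(2017, 1, 3)),
    (2, date(2017, 1, 2), date(2017, 1, 6)),
    (3, date(2017, 1, 5), date(2017, 1, 10)),
    (4, date(2017, 1, 3), date(2017, 1, 13)),
    (5, date(2017, 1, 15), date(2017, 1, 20)),
]


def assign_to_roots(rows):
    """Return (sorted root ids, {transaction_id: root_id})."""
    ordered = sorted(rows)
    # A root is a transaction that no lower-ID transaction overlaps.
    roots = [
        tid
        for tid, start, end in ordered
        if not any(
            o_id < tid and o_start <= end and o_end >= start
            for o_id, o_start, o_end in ordered
        )
    ]
    # Each transaction belongs to the nearest root at or below its own ID.
    # The lowest ID is always a root, so the index below is never negative.
    assignment = {
        tid: roots[bisect_right(roots, tid) - 1] for tid, _, _ in ordered
    }
    return roots, assignment


roots, assignment = assign_to_roots(rows)
print(roots)
print(assignment)
```

Like the query, this relies on every member of a group having an ID between its root and the next root.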