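I'm trying to create a dataset that pairs every Start with its nearest Stop. The problem is that the number of Starts between the first Start and the next Stop varies.
The source dataset looks like this: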
RowNum Timestamp Action
==============================
1 01/01/18 12:00 Start
2 01/01/18 01:00 Start
3 01/01/18 02:00 Stop
4 01/01/18 03:00 Start
5 01/01/18 05:00 Stop
6 01/01/18 13:00 Start
7 01/01/18 15:00 Start
8 01/01/18 17:00 Start
9 01/01/18 21:00 Stop
I'd like my final result to look like this:
Start Stop
================================
01/01/18 12:00 01/01/18 02:00
01/01/18 03:00 01/01/18 05:00
01/01/18 13:00 01/01/18 21:00
Even one record per Start, each paired with its nearest Stop, would be fine.
Any guidance would be greatly appreciated.
Answer 0 (score: 2)
You can use a cumulative min over the current row and all following rows to find the next "Stop" event:
select *,
min(case when Action = 'Stop' then Timestamp end) -- next Stop
over (--partition by ???
order by Timestamp
rows between current row and unbounded following) as Stop
from tab
Based on this, it's a simple aggregation:
with cte as
( select *,
min(case when Action = 'Stop' then Timestamp end) -- next Stop
over (--partition by ???
order by Timestamp
rows between current row and unbounded following) as Stop
from tab
)
select
min(Timestamp) as start,
Stop
from cte
group by Stop
order by 1
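To make the logic of the windowed min easier to follow, here is a minimal Python sketch of the same idea, not the answer's own code. It walks the rows backwards so that each row is tagged with the earliest Stop at or after it, then keeps the earliest timestamp per Stop. The function name `pair_first_start_with_stop` is hypothetical, and the sample data assumes the question's first "12:00" means midnight (written as 00:00).

```python
from datetime import datetime

# Sample rows from the question. Assumption: the first "12:00" is midnight,
# written here as 00:00 so that the rows sort in RowNum order.
rows = [
    ("2018-01-01 00:00", "Start"),
    ("2018-01-01 01:00", "Start"),
    ("2018-01-01 02:00", "Stop"),
    ("2018-01-01 03:00", "Start"),
    ("2018-01-01 05:00", "Stop"),
    ("2018-01-01 13:00", "Start"),
    ("2018-01-01 15:00", "Start"),
    ("2018-01-01 17:00", "Start"),
    ("2018-01-01 21:00", "Stop"),
]
FMT = "%Y-%m-%d %H:%M"
events = sorted((datetime.strptime(ts, FMT), action) for ts, action in rows)


def pair_first_start_with_stop(events):
    """Mimic min(...) over (current row .. unbounded following) plus group by Stop."""
    next_stop = None
    first_ts = {}  # Stop timestamp -> earliest timestamp sharing that Stop
    for ts, action in reversed(events):
        if action == "Stop":
            next_stop = ts
        # Walking backwards, each later assignment holds an earlier timestamp,
        # so the final value per Stop is the minimum, like min(Timestamp).
        first_ts[next_stop] = ts
    return sorted(((start, stop) for stop, start in first_ts.items()),
                  key=lambda pair: pair[0])


pairs = pair_first_start_with_stop(events)
for start, stop in pairs:
    print(start.strftime("%H:%M"), "->", stop.strftime("%H:%M") if stop else None)
```

As in the SQL, trailing Starts with no later Stop end up in a group whose Stop is empty (None here, NULL in SQL).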
Answer 1 (score: 1)
First, get the next Stop for each Start in the table:
select
A.Timestamp as Start,
min(B.Timestamp) as Stop
from
mytable A
left join
mytable B
on B.Action = 'Stop'
and A.Timestamp < B.Timestamp
-- filter A in WHERE: inside the ON clause, a left join would still keep A's Stop rows
where A.Action = 'Start'
group by A.Timestamp
You can then use that result (aliased as R1 in the query below) to get the final table:
select
min(Start) as Start,
Stop
from
(
select
A.Timestamp as Start,
min(B.Timestamp) as Stop
from
mytable A
left join
mytable B
on B.Action = 'Stop'
and A.Timestamp < B.Timestamp
-- filter A in WHERE: inside the ON clause, a left join would still keep A's Stop rows
where A.Action = 'Start'
group by A.Timestamp
) as R1
group by Stop
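As a quick sanity check of the self-join approach, the sketch below runs a version of the query against an in-memory SQLite database through Python's sqlite3 module. It is an illustration under assumptions: timestamps are stored as ISO-style text so that string comparison matches chronological order, the question's first "12:00" is taken to be midnight (00:00), and the Start filter sits in a WHERE clause so that only Start rows reach the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (Timestamp text, Action text)")
conn.executemany(
    "insert into mytable values (?, ?)",
    [
        ("2018-01-01 00:00", "Start"),  # assumption: the question's 12:00 is midnight
        ("2018-01-01 01:00", "Start"),
        ("2018-01-01 02:00", "Stop"),
        ("2018-01-01 03:00", "Start"),
        ("2018-01-01 05:00", "Stop"),
        ("2018-01-01 13:00", "Start"),
        ("2018-01-01 15:00", "Start"),
        ("2018-01-01 17:00", "Start"),
        ("2018-01-01 21:00", "Stop"),
    ],
)

query = """
select min(Start) as Start, Stop
from (
    -- inner step: the earliest Stop after each Start
    select A.Timestamp as Start, min(B.Timestamp) as Stop
    from mytable A
    left join mytable B
      on B.Action = 'Stop'
     and A.Timestamp < B.Timestamp
    where A.Action = 'Start'
    group by A.Timestamp
) as R1
group by Stop
order by 1
"""
result = conn.execute(query).fetchall()
for start, stop in result:
    print(start, "->", stop)
```

Because the join compares every Start with every later Stop, it can get slow on large tables; the window-function answer avoids that comparison.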
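Answer 2 (score: 1)
Solution:
Scenario A is the case where you want to see the Stop for every Start: find the earliest Stop that comes after the Start.
Scenario B is the case where you only want to see the Stop for the earliest/latest Start: first take the result set from Scenario A, then find the earliest/latest Start that precedes each Stop.
This solution does not handle duplicate timestamps that you want to keep in the result; that would involve a third field, such as RowNum.
One possible implementation: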
DECLARE @Table TABLE (
Timestamp DATETIME,
Action VARCHAR(5)
)
INSERT @Table
VALUES
('01/01/18 12:00', 'Start'), -- parsed as 12:00 noon; use '01/01/18 00:00' if midnight (12:00 AM) is intended
('01/01/18 01:00', 'Start'),
('01/01/18 02:00', 'Stop'),
('01/01/18 03:00', 'Start'),
('01/01/18 05:00', 'Stop'),
('01/01/18 13:00', 'Start'),
('01/01/18 15:00', 'Start'),
('01/01/18 17:00', 'Start'),
('01/01/18 21:00', 'Stop'),
('01/01/18 22:00', 'Start')
SELECT * FROM @Table WHERE Action = 'Start' ORDER BY Timestamp
SELECT * FROM @Table WHERE Action = 'Stop' ORDER BY Timestamp
-- Scenario A:
SELECT Starts.Timestamp as Start, MIN (Stops.Timestamp) as Stop
FROM
(SELECT * FROM @Table WHERE Action = 'Start') as Starts
LEFT OUTER JOIN
(SELECT * FROM @Table WHERE Action = 'Stop') as Stops
on Stops.Timestamp >= Starts.Timestamp
GROUP BY Starts.Timestamp
ORDER BY Starts.Timestamp
-- Scenario B:
-- same block as above with a temp table to hold the results
SELECT Starts.Timestamp as Start, MIN (Stops.Timestamp) as Stop
INTO #allstops
FROM
(SELECT * FROM @Table WHERE Action = 'Start') as Starts
LEFT OUTER JOIN
(SELECT * FROM @Table WHERE Action = 'Stop') as Stops
on Stops.Timestamp >= Starts.Timestamp
GROUP BY Starts.Timestamp
SELECT allstops.Start, LatestStart.Stop
FROM #allstops as allstops
LEFT OUTER JOIN (
SELECT MIN (Start) as Start, Stop -- this returns the earliest Start, switch to MAX to get the latest
FROM #allstops
GROUP BY Stop
) as LatestStart
on allstops.Start = LatestStart.Start
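The difference between the two scenarios can be seen on plain lists. The following is a rough Python sketch, not a translation of the T-SQL above: it assumes zero-padded "HH:MM" strings on a single day (so string order equals time order), treats the first row as 00:00, and uses a hypothetical helper `nearest_stop`. It keeps the extra 22:00 Start that has no later Stop.

```python
from bisect import bisect_left

# Assumption: zero-padded HH:MM strings on one day, already sorted.
starts = ["00:00", "01:00", "03:00", "13:00", "15:00", "17:00", "22:00"]
stops = ["02:00", "05:00", "21:00"]


def nearest_stop(start, stops):
    """Earliest Stop at or after start, or None if there is none."""
    i = bisect_left(stops, start)
    return stops[i] if i < len(stops) else None


# Scenario A: every Start paired with its nearest following Stop.
scenario_a = [(start, nearest_stop(start, stops)) for start in starts]

# Scenario B: show the Stop only on the earliest Start of each group.
earliest_start = {}
for start, stop in scenario_a:
    if stop is not None:
        # starts are sorted, so the first Start seen for a Stop is the earliest
        earliest_start.setdefault(stop, start)
scenario_b = [
    (start, stop if earliest_start.get(stop) == start else None)
    for start, stop in scenario_a
]

for row in scenario_b:
    print(row)
```

Switching to the latest Start per Stop only requires overwriting the dictionary entry instead of using `setdefault`, which mirrors swapping MIN for MAX in the query.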
Answer 3 (score: 1)
You can use the lag and ceiling functions as follows:
select max(Start) as Start, max(Stop) as Stop
from
(
select row_number() over ( order by rownum ) as rn,
( case when action = 'Start' then q.timestamp end ) as Start,
( case when action = 'Stop' then q.timestamp end ) as Stop
from
(
select t.*,
lag(action) over (order by rownum) as lg
from tab t
) q
where q.action != coalesce(lg,'Stop')
) r
group by ceiling(rn*.5)
order by ceiling(rn*.5);
Start Stop
01.01.2018 12:00:00 01.01.2018 02:00:00
01.01.2018 03:00:00 01.01.2018 05:00:00
01.01.2018 13:00:00 01.01.2018 21:00:00
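P.S. For each consecutive pair of rows, we determine the START and STOP values: in each row one of them is non-null, while the other is null. Because of this logic I need to pair up the row numbers, something like mod(...,2), and ceiling(rn*.5) meets that need: it yields 1 for rn = 1 and 2, 2 for rn = 3 and 4, and 3 for rn = 5 and 6, i.e. the nearest integer at or above rn/2.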
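The two steps (drop rows whose action repeats the previous one, then pair the survivors by the ceiling of half the row number) can be traced in a few lines of Python. This is only an illustrative sketch of the pairing arithmetic, with the sample times written as HH:MM strings and the first row assumed to be 00:00.

```python
import math

rows = [
    ("00:00", "Start"), ("01:00", "Start"), ("02:00", "Stop"),
    ("03:00", "Start"), ("05:00", "Stop"),
    ("13:00", "Start"), ("15:00", "Start"), ("17:00", "Start"),
    ("21:00", "Stop"),
]

# Step 1: keep a row only when its action differs from the previous row's,
# like "action != coalesce(lag(action), 'Stop')".
kept = []
previous = "Stop"  # stands in for coalesce(lg, 'Stop') on the first row
for ts, action in rows:
    if action != previous:
        kept.append((ts, action))
    previous = action

# Step 2: number the surviving rows from 1 and group them by ceil(rn * .5),
# so rows 1-2 form group 1, rows 3-4 group 2, and rows 5-6 group 3.
groups = {}
for rn, (ts, action) in enumerate(kept, start=1):
    groups.setdefault(math.ceil(rn * 0.5), {})[action] = ts

pairs = [(g.get("Start"), g.get("Stop")) for _, g in sorted(groups.items())]
print(pairs)
```

The pairing relies on the filtered rows strictly alternating Start, Stop, Start, Stop, which the lag filter guarantees.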