How do I group consecutive rows?

Time: 2016-05-26 18:10:42

Tags: sql sql-server sql-server-2008 sqlfiddle

So, I have a table with rows like this:

Ev_Message       Ev_Comment             EV_Custom1           Ev_Time_Ms     
-------------------------------------------------------------------------------------
Machine 1 Alarm  5/23/2016 11:02:00 AM  Alarms Scanned       25              
Machine 1 Alarm  5/23/2016 11:00:00 AM  Alarms Scanned       686 
Machine 1 Alarm  5/23/2016 11:00:00 AM  Light curtain        537
Machine 1 Alarm  5/23/2016 11:00:00 AM  Guard door open      346 
Machine 1 Alarm  5/23/2016 11:00:00 AM  No control voltage   135 
Machine 1 Alarm  5/23/2016 10:38:34 AM  Alarms Scanned       269
Machine 1 Alarm  5/23/2016 10:38:29 AM  Alarms Scanned       378
Machine 1 Alarm  5/23/2016 10:38:29 AM  Guard door open      156
Machine 1 Alarm  5/23/2016 10:38:25 AM  Alarms Scanned       654
Not an Alarm     5/23/2016 10:38:25 AM  Not an Alarm         467     
Machine 1 Alarm  5/23/2016 10:38:25 AM  Guard door open      234
Machine 1 Alarm  5/23/2016 10:38:25 AM  No control voltage   67
Machine 1 Alarm  5/23/2016 10:38:23 AM  Alarms Scanned       124
Machine 1 Alarm  5/23/2016 10:38:23 AM  No control voltage   100   

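An "Alarms Scanned" row is added every time the alarms are scanned, which happens whenever an alarm is triggered or cleared. Each active alarm also adds a row with its own specific Ev_Custom1. The first column, Ev_Message, contains a machine ID, which lets me separate alarms coming from different machines. (Don't you love arbitrary column names?) There are over nine hundred unique alarm messages.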
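For anyone who wants to reproduce this locally, here is a rough setup sketch for the sample rows above. The table name A is taken from the queries further down; the column types are my assumptions, since the post does not state them. In particular, Ev_Comment is typed DATETIME here. With a DATETIME column, comparing with `=` is a safer choice than `LIKE`, because `LIKE` forces a conversion to text.

```sql
-- Assumed schema: the post does not give column types, so these are guesses.
CREATE TABLE A (
    Ev_Message  VARCHAR(50) NOT NULL,  -- machine ID
    Ev_Comment  DATETIME    NOT NULL,  -- scan timestamp (assumed DATETIME)
    Ev_Custom1  VARCHAR(50) NOT NULL,  -- alarm text, or 'Alarms Scanned'
    Ev_Time_Ms  INT         NOT NULL
);

-- Sample rows copied from the listing above (ISO 8601 literals).
INSERT INTO A (Ev_Message, Ev_Comment, Ev_Custom1, Ev_Time_Ms) VALUES
    ('Machine 1 Alarm', '2016-05-23T11:02:00', 'Alarms Scanned',     25),
    ('Machine 1 Alarm', '2016-05-23T11:00:00', 'Alarms Scanned',     686),
    ('Machine 1 Alarm', '2016-05-23T11:00:00', 'Light curtain',      537),
    ('Machine 1 Alarm', '2016-05-23T11:00:00', 'Guard door open',    346),
    ('Machine 1 Alarm', '2016-05-23T11:00:00', 'No control voltage', 135),
    ('Machine 1 Alarm', '2016-05-23T10:38:34', 'Alarms Scanned',     269),
    ('Machine 1 Alarm', '2016-05-23T10:38:29', 'Alarms Scanned',     378),
    ('Machine 1 Alarm', '2016-05-23T10:38:29', 'Guard door open',    156),
    ('Machine 1 Alarm', '2016-05-23T10:38:25', 'Alarms Scanned',     654),
    ('Not an Alarm',    '2016-05-23T10:38:25', 'Not an Alarm',       467),
    ('Machine 1 Alarm', '2016-05-23T10:38:25', 'Guard door open',    234),
    ('Machine 1 Alarm', '2016-05-23T10:38:25', 'No control voltage', 67),
    ('Machine 1 Alarm', '2016-05-23T10:38:23', 'Alarms Scanned',     124),
    ('Machine 1 Alarm', '2016-05-23T10:38:23', 'No control voltage', 100);
```

If the real column stores the timestamp as text, the `<` comparisons in the queries would compare strings rather than times, so the results may differ from this sketch.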

What I would like my query to return is something like this:

Alarm Message       Alarm Start Time       Alarm Stop Time  
----------------------------------------------------------------  
No control voltage  5/23/2016 10:38:23 AM  5/23/2016 10:38:29 AM  
Guard door open     5/23/2016 10:38:25 AM  5/23/2016 10:38:34 AM  
No control voltage  5/23/2016 11:00:00 AM  5/23/2016 11:02:00 AM  
Guard door open     5/23/2016 11:00:00 AM  5/23/2016 11:02:00 AM  
Light curtain       5/23/2016 11:00:00 AM  5/23/2016 11:02:00 AM  

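This is for a query that filters between two dates. I have some ability to change the data going into the table, but with 900 alarms my freedom is limited.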

With some help, my current query is this:

WITH T AS (
    SELECT     s.Ev_Comment AS start_time,
               MIN(COALESCE (e.Ev_Comment, s.Ev_Comment)) AS end_time
    FROM       A AS s
    INNER JOIN A AS e
            ON s.Ev_Comment < e.Ev_Comment
           AND s.Ev_Custom1 = 'Alarms Scanned'
           AND e.Ev_Custom1 = 'Alarms Scanned'
    GROUP BY   s.Ev_Comment)
SELECT     T_1.start_time,
           T_1.end_time,
           A.Ev_Custom1
FROM       A
INNER JOIN T AS T_1
        ON A.Ev_Comment LIKE T_1.start_time
WHERE      (A.Ev_Custom1 <> 'Alarms Scanned')

I still have one problem. If an alarm lasts longer than one scan period, such as "Guard door open" from 10:38:25 to 10:38:34, it shows up in two separate rows, like this:

start_time             end_time               EV_Custom1   
---------------------  ---------------------  -------------
5/23/2016 10:38:25 AM  5/23/2016 10:38:29 AM  Guard door open
5/23/2016 10:38:29 AM  5/23/2016 10:38:34 AM  Guard door open

Ideally, what I want is:

start_time             end_time               EV_Custom1   
---------------------  ---------------------  -------------
5/23/2016 10:38:25 AM  5/23/2016 10:38:34 AM  Guard door open

I think I need to group by ((Ev_custom1) and (when end_time = start_time)) (forgive my pseudocode), but I know very little about the syntax this requires.

Here is an SQLFiddle

2 answers:

Answer 0 (score: 2)

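If I understand the posted problem correctly, your CTE effectively determines the time periods (or intervals) between alarm scans. Your final select then joins the actual alarm information to those intervals.

Part of the problem is that if an alarm stays active for a long time (I assume longer than your alarm scan cycle), your alarm system keeps logging "Alarms Scanned" entries, which effectively splits the active alarm across several intervals.

If you have SQL Server 2012 or later, it is relatively easy to determine whether an alarm event has been split. You just check whether the end time of one interval equals the start time of the next interval for the same alarm type. The LAG window function, available since SQL Server 2012, does this.

The next step is to generate an ID you can group the alarms by, so that the split events can be combined. This is done with a SUM ... OVER clause, a running total of the "is new event" flag.

The following example shows how this can be achieved: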

;WITH AlarmTimeBuckets
AS 
(
    SELECT       EventStart.Ev_Comment AS StartDateTime 
                ,MIN(COALESCE (EventEnd.Ev_Comment, EventStart.Ev_Comment)) AS EndDateTime
                ,EventStart.Ev_Message As Machine
    FROM         A EventStart 
    INNER JOIN   A EventEnd
            ON   EventStart.Ev_Comment < EventEnd.Ev_Comment
           AND   EventStart.Ev_Custom1 = 'Alarms Scanned'
           AND   EventEnd.Ev_Custom1 = 'Alarms Scanned'
           AND   EventStart.Ev_Message = EventEnd.Ev_Message
    GROUP BY     EventStart.Ev_Message, EventStart.Ev_Comment
),
AlarmsByTimeBucket
AS
(
    SELECT      AlarmTimeBuckets.Machine
               ,AlarmTimeBuckets.StartDateTime
               ,AlarmTimeBuckets.EndDateTime 
               ,Alarm.Ev_Custom1 AS Alarm
               ,(
                 CASE
                    WHEN LAG(AlarmTimeBuckets.EndDateTime, 1, NULL) OVER (PARTITION BY Alarm.Ev_Custom1,Alarm.Ev_Message ORDER BY AlarmTimeBuckets.StartDateTime) = AlarmTimeBuckets.StartDateTime THEN 0
                    ELSE 1
                 END
                ) AS IsNewEvent
    FROM       A Alarm 
    INNER JOIN AlarmTimeBuckets  ON Alarm.Ev_Message = AlarmTimeBuckets.Machine AND  Alarm.Ev_Comment = AlarmTimeBuckets.StartDateTime
    WHERE     (Alarm.Ev_Custom1 <> 'Alarms Scanned')
)
,
AlarmsByGroupingID
AS
(
    SELECT   Machine
            ,StartDateTime
            ,EndDateTime
            ,Alarm
            ,SUM(IsNewEvent) OVER (ORDER BY Machine, Alarm, StartDateTime) AS GroupingID
    FROM    AlarmsByTimeBucket
)
SELECT       MAX(Machine) AS Machine
            ,MIN(StartDateTime) AS StartDateTime
            ,MAX(EndDateTime) AS EndDateTime
            ,MAX(Alarm) AS Alarm
FROM        AlarmsByGroupingID
GROUP BY    GroupingID
ORDER BY    StartDateTime
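
To see the LAG plus running SUM idea in isolation, here is a minimal sketch on a few hard-coded intervals. The inline rows and the CTE names (Intervals, Flagged, Grouped) are illustrative assumptions and not part of the original schema. Like the query above, it needs SQL Server 2012 or later.

```sql
-- Hypothetical inline data: two back-to-back scan periods for one alarm,
-- followed by a later, separate occurrence of the same alarm.
WITH Intervals AS (
    SELECT Alarm,
           CAST(StartText AS DATETIME) AS StartDateTime,
           CAST(EndText   AS DATETIME) AS EndDateTime
    FROM (VALUES
        ('Guard door open', '2016-05-23T10:38:25', '2016-05-23T10:38:29'),
        ('Guard door open', '2016-05-23T10:38:29', '2016-05-23T10:38:34'),
        ('Guard door open', '2016-05-23T11:00:00', '2016-05-23T11:02:00')
    ) AS v (Alarm, StartText, EndText)
),
Flagged AS (
    -- 0 when this interval starts exactly where the previous one ended,
    -- 1 when it starts a new event (LAG returns NULL on the first row).
    SELECT Alarm, StartDateTime, EndDateTime,
           CASE
               WHEN LAG(EndDateTime) OVER (PARTITION BY Alarm
                                           ORDER BY StartDateTime) = StartDateTime
               THEN 0
               ELSE 1
           END AS IsNewEvent
    FROM Intervals
),
Grouped AS (
    -- Running total of the flag: every interval of one event gets the same ID.
    SELECT Alarm, StartDateTime, EndDateTime,
           SUM(IsNewEvent) OVER (PARTITION BY Alarm
                                 ORDER BY StartDateTime
                                 ROWS UNBOUNDED PRECEDING) AS GroupingID
    FROM Flagged
)
SELECT   Alarm,
         MIN(StartDateTime) AS StartDateTime,
         MAX(EndDateTime)   AS EndDateTime
FROM     Grouped
GROUP BY Alarm, GroupingID
ORDER BY StartDateTime;
```

With this data the first two intervals should collapse into a single row from 10:38:25 to 10:38:34, and the 11:00:00 occurrence stays on its own row. Partitioning the running total by Alarm is a choice made for this sketch; the full query above instead orders one running total across Machine, Alarm and StartDateTime.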

Answer 1 (score: 1)

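I updated your sqlfiddle link, and the updated query is below. In the final result set you need to assign a row_number and then join back on EV_CUSTOM1, on START_TIME = END_TIME (as you suspected), and on row number = row number + 1. That is how you can tell whether two rows belong to the same event. If you are on SQL Server 2012+, you can use the LAG / LEAD functions that @EdmondQuinton points out in his answer, which would be simpler.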

WITH T AS (SELECT  s.Ev_Comment AS start_time, MIN(COALESCE (e.Ev_Comment, s.Ev_Comment)) AS end_time           
           FROM A AS s 
           INNER JOIN A AS e 
           ON s.Ev_Comment < e.Ev_Comment 
           AND s.Ev_Custom1 = 'Alarms Scanned' 
           AND e.Ev_Custom1 = 'Alarms Scanned'
           GROUP BY s.Ev_Comment
          ),

T2 AS(SELECT T_1.start_time, T_1.end_time, A.Ev_Custom1,
             ROW_NUMBER() OVER (PARTITION BY EV_CUSTOM1 ORDER BY T_1.START_TIME) RN
      FROM  A 
      INNER JOIN
      T AS T_1 
      ON A.Ev_Comment LIKE T_1.start_time
      WHERE (A.Ev_Custom1 <> 'Alarms Scanned')
      )

select 
  coalesce(b.START_TIME, a.START_TIME) START_TIME, 
  max(a.END_TIME) END_TIME, 
  a.EV_CUSTOM1
from T2 a
left outer join T2 b
on a.EV_CUSTOM1 = b.EV_CUSTOM1
and a.START_TIME = b.END_TIME
and a.RN = b.RN+1
group by coalesce(b.START_TIME, a.START_TIME), 
         a.EV_CUSTOM1
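
For reference, here is a minimal sketch of the same ROW_NUMBER self-join on a few hard-coded intervals, so the pairing can be checked without the full table. The inline rows are illustrative assumptions; the column names follow the query above.

```sql
-- Hypothetical inline data standing in for the T2 result set above.
WITH T2 AS (
    SELECT EV_CUSTOM1, START_TIME, END_TIME,
           ROW_NUMBER() OVER (PARTITION BY EV_CUSTOM1
                              ORDER BY START_TIME) AS RN
    FROM (VALUES
        ('Guard door open', CAST('2016-05-23T10:38:25' AS DATETIME),
                            CAST('2016-05-23T10:38:29' AS DATETIME)),
        ('Guard door open', CAST('2016-05-23T10:38:29' AS DATETIME),
                            CAST('2016-05-23T10:38:34' AS DATETIME)),
        ('Guard door open', CAST('2016-05-23T11:00:00' AS DATETIME),
                            CAST('2016-05-23T11:02:00' AS DATETIME))
    ) AS v (EV_CUSTOM1, START_TIME, END_TIME)
)
SELECT   COALESCE(b.START_TIME, a.START_TIME) AS START_TIME,
         MAX(a.END_TIME)                      AS END_TIME,
         a.EV_CUSTOM1
FROM     T2 AS a
LEFT OUTER JOIN T2 AS b
         ON  a.EV_CUSTOM1 = b.EV_CUSTOM1
         AND a.START_TIME = b.END_TIME   -- a starts where b ended
         AND a.RN = b.RN + 1             -- and b is the row just before a
GROUP BY COALESCE(b.START_TIME, a.START_TIME),
         a.EV_CUSTOM1;
```

One thing worth checking against your real data: the join only looks one row back, so a run of three or more back-to-back periods would not collapse into a single row with this approach. In that case each row is paired only with its immediate predecessor, and you would get overlapping rows instead of one merged interval.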