TSQL - Run date comparison for "duplicates"/false positives on initial query?

Time: 2016-02-08 21:01:16

Tags: sql sql-server tsql datetime subquery



  FROM event
  LEFT JOIN asset
         ON event.a_key = asset.a_key
         ON event.l_key = l.l_key

  WHERE event.e_key IN (350, 352, 378)

  ORDER BY asset.a_id, event.created_date




Current results:

a_id 1124 created 2016-02-01 12:30:30
a_id 1124 created 2016-02-01 12:35:31
a_id 1124 created 2016-02-01 12:40:33
a_id 1124 created 2016-02-01 12:45:42
a_id 1124 created 2016-02-02 12:30:30
a_id 1124 created 2016-02-02 13:00:30
a_id 1115 created 2016-02-01 12:30:30


Desired results:

a_id 1124 created 2016-02-01 12:30:30
a_id 1124 created 2016-02-02 12:30:30
a_id 1124 created 2016-02-02 13:00:30
a_id 1115 created 2016-02-01 12:30:30

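I tried referring to this and this, but I couldn't get those concepts to work for me. I know I probably need to do a SELECT * FROM (my existing query), but I can't seem to do that without ending up with a bunch of "The multi-part identifier could not be bound" errors (I have no experience creating temp tables, and my attempts so far have failed). I'm also not sure how to use DATEDIFF as a date filtering function.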


2 answers:

Answer 0 (score: 2)


--sample data since I don't have your table structure and your original query won't work for me
declare @events table
(
  id int,
  timestamp datetime
)

--note that I changed some of your sample data to test some different scenarios
insert into @events values( 1124, '2016-02-01 12:30:30')
insert into @events values( 1124, '2016-02-01 12:35:31')
insert into @events values( 1124, '2016-02-01 12:40:33')
insert into @events values( 1124, '2016-02-01 13:05:42')
insert into @events values( 1124, '2016-02-02 12:30:30')
insert into @events values( 1124, '2016-02-02 13:00:30')
insert into @events values( 1115, '2016-02-01 12:30:30')

--using a cte here to split the result set of your query into groups
--by id (you would want to partition by whatever criteria you use
--to determine that rows are talking about the same event)
--the row_number function gets the row number for each row within that 
--id partition
--the over clause specifies how to break up the result set into groups 
--(partitions) and what order to put the rows in within that group so 
--that the numbering stays consistent
;with orderedEvents as
(
    select id, timestamp, row_number() over (partition by id order by timestamp) as rn
    from @events
    --you would replace @events here with your query
)
--using a second recursive cte here to determine which rows are "good"
--and which ones are not.  
, previousGoodTimestamps as 
(
    --this is the "seeding" part of the recursive cte where I pick the
    --first rows of each group as being a desired result.  Since they 
    --are the first in each group, I know they are good.  I also assign
    --their timestamp as the previous good timestamp since I know that 
    --this row is good.
    select id, timestamp, rn, timestamp as prev_good_timestamp, 1 as is_good
    from orderedEvents
    where rn = 1

    union all

    --this is the recursive part of the cte.  It takes the rows we have
    --already added to this result set and joins those to the "next" rows
    --(as defined by our ordering in the first cte).  Then we output
    --those rows and do some calculations to determine if this row is 
    --"good" or not.  If it is "good" we set its timestamp as the
    --previous good row timestamp so that rows that come after this one 
    --can use it to determine if they are good or not.  If a row is "bad"
    --we just forward along the last known good timestamp to the next row.
    --We also determine if a row is good by checking if the last good row
    --timestamp plus 30 minutes is less than or equal to the current row's
    --timestamp.  If it is then the row is good.
    select e2.id
        , e2.timestamp
        , e2.rn
        , last_good_timestamp.timestamp
        , case
            when dateadd(mi, 30, last_good_timestamp.timestamp) <= e2.timestamp then 1
            else 0
          end
    from previousGoodTimestamps e1
    inner join orderedEvents e2 on e2.id = e1.id and e2.rn = e1.rn + 1
    --I used a cross apply here to calculate the last good row timestamp
    --once.  I could have used two identical subqueries above in the select
    --and case statements, but I would rather not duplicate the code.
    cross apply
    (
        select case 
                 when e1.is_good = 1 then e1.timestamp --if the last row is good, just use its timestamp
                 else e1.prev_good_timestamp --the last row was bad, forward on what it had for the last good timestamp
               end as timestamp
    ) last_good_timestamp
)
select *
from previousGoodTimestamps
where is_good = 1 --only take the "good" rows
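
To make the "replace @events with your query" comment concrete, here is a rough sketch that plugs a cut-down version of the question's query into the first CTE. The table variables @event and @asset, their column types, and the column names created_date and prev_good_date are assumptions for illustration only, and the join on l_key is left out because that table isn't shown. The point to notice is that every column gets an alias inside the CTE: outside it only the CTE's own column names are visible, and referring to asset.a_id or event.created_date from the outer query is a common cause of the "multi-part identifier could not be bound" error.

```sql
-- Hypothetical stand-ins for the asker's tables; only columns named in the
-- question are included and the data types are assumed.
declare @asset table (a_key int, a_id int);
declare @event table (e_key int, a_key int, created_date datetime);

insert into @asset values (1, 1124), (2, 1115);
insert into @event values
    (350, 1, '2016-02-01 12:30:30'),
    (352, 1, '2016-02-01 12:35:31'),
    (378, 1, '2016-02-01 12:40:33'),
    (350, 1, '2016-02-01 12:45:42'),
    (350, 1, '2016-02-02 12:30:30'),
    (352, 1, '2016-02-02 13:00:30'),
    (350, 2, '2016-02-01 12:30:30');

with orderedEvents as
(
    -- the existing query goes here, with each output column aliased so the
    -- later CTEs never need the asset./event. prefixes
    select a.a_id as id
         , e.created_date as created_date
         , row_number() over (partition by a.a_id order by e.created_date) as rn
    from @event e
    left join @asset a on e.a_key = a.a_key
    where e.e_key in (350, 352, 378)
)
, previousGoodTimestamps as
(
    -- anchor: the first row per id is always kept
    select id, created_date, rn, created_date as prev_good_date, 1 as is_good
    from orderedEvents
    where rn = 1

    union all

    -- recursive step: compare the next row with the last kept row
    select e2.id
         , e2.created_date
         , e2.rn
         , last_good.created_date
         , case when dateadd(mi, 30, last_good.created_date) <= e2.created_date then 1 else 0 end
    from previousGoodTimestamps e1
    inner join orderedEvents e2 on e2.id = e1.id and e2.rn = e1.rn + 1
    cross apply
    (
        select case when e1.is_good = 1 then e1.created_date
                    else e1.prev_good_date
               end as created_date
    ) last_good
)
select id as a_id, created_date
from previousGoodTimestamps
where is_good = 1
order by id, created_date
option (maxrecursion 0); -- lift the default limit of 100 recursion levels
```

With the sample rows above this should return the four rows from the desired results. The maxrecursion hint only matters once a single id has more rows than the default limit of 100 recursion levels allows; without it the statement fails at that point instead of returning a partial result.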


Answer 1 (score: 0)

-- Sample data.
declare @Samples as Table ( Id Int Identity, A_Id Int, CreatedDate DateTime );
insert into @Samples ( A_Id, CreatedDate ) values
  ( 1124, '2016-02-01 12:30:30' ),
  ( 1124, '2016-02-01 12:35:31' ),
  ( 1124, '2016-02-01 12:40:33' ),
  ( 1124, '2016-02-01 12:45:42' ),
  ( 1124, '2016-02-02 12:30:30' ),
  ( 1124, '2016-02-02 13:00:30' ),
  ( 1125, '2016-02-01 12:30:30' );
select * from @Samples;

-- Calculate the windows of 30 minutes before and after each   CreatedDate   and check for conflicts with other rows.
with Ranges as (
  select Id, A_Id, CreatedDate,
    DateAdd( minute, -30, S.CreatedDate ) as RangeStart, DateAdd( minute, 30, S.CreatedDate ) as RangeEnd
    from @Samples as S )
  select Id, A_Id, CreatedDate, RangeStart, RangeEnd,
    -- Check for a conflict with another row with:
    --   the same   A_Id   value and an earlier   CreatedDate   that falls inside the +/-30 minute range.
    case when exists ( select 42 from @Samples where A_Id = R.A_Id and CreatedDate < R.CreatedDate and R.RangeStart < CreatedDate and CreatedDate < R.RangeEnd ) then 1
      else 0 end as Conflict
    from Ranges as R;
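
The query above only flags the conflicting rows. A minimal sketch of turning that flag into a filter, reusing the same @Samples table and sample data, could look like the following. The aliases S and P are my own, and the NOT EXISTS form is just one way to write it.

```sql
-- Same sample data as above.
declare @Samples as Table ( Id Int Identity, A_Id Int, CreatedDate DateTime );
insert into @Samples ( A_Id, CreatedDate ) values
  ( 1124, '2016-02-01 12:30:30' ),
  ( 1124, '2016-02-01 12:35:31' ),
  ( 1124, '2016-02-01 12:40:33' ),
  ( 1124, '2016-02-01 12:45:42' ),
  ( 1124, '2016-02-02 12:30:30' ),
  ( 1124, '2016-02-02 13:00:30' ),
  ( 1125, '2016-02-01 12:30:30' );

-- Keep a row only if no earlier row with the same A_Id falls strictly inside
-- the 30 minutes before it.
select S.A_Id, S.CreatedDate
  from @Samples as S
  where not exists (
    select 42
      from @Samples as P
      where P.A_Id = S.A_Id
        and P.CreatedDate < S.CreatedDate
        and P.CreatedDate > DateAdd( minute, -30, S.CreatedDate ) )
  order by S.A_Id, S.CreatedDate;
```

On this sample data the three rows at 12:35:31, 12:40:33 and 12:45:42 are dropped and the other four are kept. Be aware that this compares each row with any earlier row, not with the last row that was kept, so in a chain such as 12:00, 12:20, 12:40 it drops both later rows, whereas a "30 minutes since the last kept row" rule would keep 12:40. The comparison uses DateAdd rather than DATEDIFF because DATEDIFF( minute, ... ) counts minute boundaries crossed, not elapsed minutes.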