SQL Server: Remove redundant consecutive rows for the same room

Time: 2016-02-08 22:00:54

Tags: sql sql-server

I have data that tracks a person's movements between beds, for example:

PersonSK   ArrivalDttm          Room    Sequence 
------------------------------------------------
11111      01/01/2015 15:00     Bed 1       1
11111      01/01/2015 18:00     Bed 1       2
11111      01/01/2015 21:00     Bed 1       3
11111      01/01/2015 22:00     Bed 7       4

Desired output:

PersonSK   ArrivalDttm          Room    Sequence  Departure dttm
----------------------------------------------------------------
11111      01/01/2015 15:00     Bed 1       1     01/01/2015 22:00
11111      01/01/2015 22:00     Bed 7       2     NULL

I can't think of a way to do this. The logic I want to impose is as follows:

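  • For each run of consecutive rows in the same bed, keep the row with the minimum sequence / arrival dttm; the stay ends at the first later row where the bed number changes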

Update: my actual solution, based on the answers provided

WITH cte_bed_moves as (

SELECT
 movements.[Facility (Location)]
,movements.[Person Id]
,movements.[Visit Id]
,movements.[Room (Tracking Location)]
,movements.[Location Sequence Number] 
,movements.[Arrival to Location Dt/Tm] as arrival_dttm
,min_next_bed.arrival_dttm as end_dttm
FROM 
edcs_firstnet_bed_movements AS movements OUTER APPLY 
(/*Find next bed that is not the same type as the current*/
SELECT MIN(apply_nextBed.[Arrival to Location Dt/Tm]) as arrival_dttm
FROM edcs_firstnet_bed_movements AS apply_nextBed
WHERE
    movements.[Facility (Location)] = apply_nextBed.[Facility (Location)]
    AND movements.[Person Id] = apply_nextBed.[Person Id]
    AND movements.[Visit Id] = apply_nextBed.[Visit Id]
    AND apply_nextBed.[Location Sequence Number] > movements.[Location Sequence Number]
    AND apply_nextBed.[Room (Tracking Location)] <> movements.[Room (Tracking Location)]
 ) as min_next_bed
)

/*for each bed, get rid of the duplicate rows with times in between*/
select 
 [Facility (Location)] as facility_name
,[Person Id] as person_id
,[Visit Id] as ed_visit_id
,[Room (Tracking Location)] as room
,end_dttm
,min(arrival_dttm) as arrival_dttm
from cte_bed_moves
group by
[Facility (Location)]
,[Person Id]
,[Visit Id]
,[Room (Tracking Location)]
,end_dttm

3 Answers:

Answer 0: (score: 2)

Assuming your table is named person_dttm, the following should work:

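-- Earliest arrival per person and room is computed first, so the outer
-- query never references a non-grouped ArrivalDttm column
SELECT firsts.PersonSK, firsts.Room, firsts.StartDttm,
(SELECT MIN(p2.ArrivalDttm)
 FROM person_dttm p2 
 WHERE p2.PersonSK=firsts.PersonSK AND p2.ArrivalDttm > firsts.StartDttm AND p2.Room <> firsts.Room) as EndDttm
FROM (SELECT PersonSK, Room, MIN(ArrivalDttm) as StartDttm
      FROM person_dttm
      GROUP BY PersonSK, Room) firsts
ORDER BY firsts.PersonSK, firsts.StartDttm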

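The basic idea is to select each person together with their earliest arrival in each room. Then add a subquery that selects the minimum arrival time across all records for the same person, excluding rows for the same room and excluding rows that occurred earlier.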
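
One caveat is worth checking before relying on this approach: because the grouping is by person and room, someone who later returns to a room they already occupied gets only one row for that room. The sketch below illustrates this with a table variable, @person_dttm, holding the question's sample rows plus one hypothetical fifth row (a return to Bed 1 at 23:00) that is not in the original data. It assumes SQL Server 2008 or later for the multi-row VALUES clause.

```sql
-- Hypothetical stand-in for the person_dttm table
DECLARE @person_dttm TABLE (
    PersonSK    int,
    ArrivalDttm datetime,
    Room        varchar(20)
);

INSERT INTO @person_dttm (PersonSK, ArrivalDttm, Room) VALUES
    (11111, '2015-01-01T15:00:00', 'Bed 1'),
    (11111, '2015-01-01T18:00:00', 'Bed 1'),
    (11111, '2015-01-01T21:00:00', 'Bed 1'),
    (11111, '2015-01-01T22:00:00', 'Bed 7'),
    (11111, '2015-01-01T23:00:00', 'Bed 1');  -- assumed extra row: return to Bed 1

-- Earliest arrival per person and room, plus the first later arrival elsewhere
SELECT firsts.PersonSK,
       firsts.Room,
       firsts.StartDttm,
       (SELECT MIN(p2.ArrivalDttm)
        FROM @person_dttm AS p2
        WHERE p2.PersonSK = firsts.PersonSK
          AND p2.ArrivalDttm > firsts.StartDttm
          AND p2.Room <> firsts.Room) AS EndDttm
FROM (SELECT PersonSK, Room, MIN(ArrivalDttm) AS StartDttm
      FROM @person_dttm
      GROUP BY PersonSK, Room) AS firsts
ORDER BY firsts.PersonSK, firsts.StartDttm;
```

With this data the query produces two rows, one for Bed 1 and one for Bed 7, so the second stay in Bed 1 does not appear as a row of its own. If revisits can occur in your data, grouping by room alone is not sufficient.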

Answer 1: (score: 0)

Join the table to its neighbouring row:

SELECT T1.ArrivalDttm ArrivalDttm, T2.ArrivalDttm DepartureDttm, ...
FROM T AS T1
LEFT JOIN T AS T2 ON
  T1.PersonSK = T2.PersonSK AND T1.sequence + 1 = T2.sequence
...
WHERE
  T2.PersonSK IS NULL OR T1.Room != T2.Room  -- keep the final row, which has no successor

After that you will have to compute the new sequence numbers; the simplest way is to join a subquery again:

... T3.sequence sequence
...
LEFT JOIN (
  SELECT COUNT(DISTINCT tmp.Room) AS sequence, tmp.PersonSK
  FROM T AS tmp
  WHERE tmp.ArrivalDttm <= T1.ArrivalDttm AND tmp.PersonSK = T1.PersonSK
) T3 ON T1.PersonSK = T3.PersonSK
...
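
The neighbour-row comparison and the renumbering step can also be sketched with window functions, assuming SQL Server 2012 or later (for LAG and LEAD). The table variable @moves and its rows are hypothetical stand-ins mirroring the question's sample data, and NewSequence and DepartureDttm are illustrative names. This is one possible sketch, not the query from this answer:

```sql
-- Hypothetical sample data mirroring the question
DECLARE @moves TABLE (
    PersonSK    int,
    ArrivalDttm datetime,
    Room        varchar(20),
    [Sequence]  int
);

INSERT INTO @moves (PersonSK, ArrivalDttm, Room, [Sequence]) VALUES
    (11111, '2015-01-01T15:00:00', 'Bed 1', 1),
    (11111, '2015-01-01T18:00:00', 'Bed 1', 2),
    (11111, '2015-01-01T21:00:00', 'Bed 1', 3),
    (11111, '2015-01-01T22:00:00', 'Bed 7', 4);

-- Step 1: attach the previous row's room to every row
WITH flagged AS (
    SELECT PersonSK, ArrivalDttm, Room, [Sequence],
           LAG(Room) OVER (PARTITION BY PersonSK
                           ORDER BY [Sequence]) AS PrevRoom
    FROM @moves
)
-- Step 2: keep only rows that start a new stay. The window functions below
-- are evaluated after the WHERE filter, so they see only those rows.
SELECT PersonSK,
       ArrivalDttm,
       Room,
       ROW_NUMBER() OVER (PARTITION BY PersonSK
                          ORDER BY [Sequence]) AS NewSequence,
       LEAD(ArrivalDttm) OVER (PARTITION BY PersonSK
                               ORDER BY [Sequence]) AS DepartureDttm
FROM flagged
WHERE PrevRoom IS NULL OR PrevRoom <> Room
ORDER BY PersonSK, NewSequence;
```

On the sample rows this returns the Bed 1 stay (arrival 15:00, departure 22:00, sequence 1) and the Bed 7 stay (arrival 22:00, departure NULL, sequence 2). Because each row is compared only with its immediate predecessor, a later return to the same bed would start a new stay rather than being merged into the first one.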

Answer 2: (score: 0)

Here are my two cents, using analytic functions:

select b.PersonSK, 
       b2.ArrivalDttm,
       b.Room,
       row_number() over (partition by b.PersonSk 
                           order by b2.ArrivalDttm) as "Sequence",
       lead(b2.ArrivalDttm) over (partition by b.PersonSk 
                                   order by b2.ArrivalDttm) as "Departure dttm"
  from beds b
        INNER JOIN 
         (SELECT PersonSK,
                 room, 
                 min(ArrivalDttm) ArrivalDttm
            FROM beds
           GROUP by PersonSK, room) b2 
        ON b.PersonSK = b2.PersonSK
           AND b.room = b2.room
           AND b.ArrivalDttm = b2.ArrivalDttm

Since Sequence is a reserved word, you need to wrap it in double quotes. The same goes for "Departure dttm", because an unquoted alias cannot contain spaces.
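
In SQL Server specifically, square brackets are an alternative way to delimit such identifiers. Double-quoted identifiers depend on the QUOTED_IDENTIFIER setting being ON, which is the usual default for client connections. A minimal sketch, using placeholder values rather than real data:

```sql
-- Bracket-delimited aliases; the selected values are placeholders only
SELECT 1                      AS [Sequence],
       CAST(NULL AS datetime) AS [Departure dttm];
```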

Here it is working on SQLFiddle: http://sqlfiddle.com/#!15/4940a/1

Note that I used PostgreSQL because SQL Server was unstable. The syntax is the same.