Find the longest layover segment for each booking

Time: 2013-07-15 11:44:22

Tags: sql sql-server-2008 tsql

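We have a number of bookings, and one requirement is to display the final destination of each booking based on its segments. Our business has defined the final destination as the place where the traveller has the longest stay. The origin is the first departure point.

Note that this is not the segment with the longest travel time, i.e. Datediff(minute, DepartDate, ArrivalDate). What is needed is the longest gap between consecutive segments: the time between the arrival of one segment and the departure of the next.

This is a simplified version of the tables: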

Create Table Segments
(
  BookingID int,
  SegNum int,
  DepartureCity varchar(100),
  DepartDate datetime,
  ArrivalCity varchar(100),
  ArrivalDate datetime
);

Create Table Bookings
(
 BookingID int identity(1,1),
 Locator varchar(10)
);

Insert into Segments values (1,2,'BRU','2010-03-06 10:40','FIH','2010-03-06 20:20:00')
Insert into Segments values (1,4,'FIH','2010-03-13 21:50:00','BRU', '2010-03-14 07:25:00')
Insert into Segments values (2,2,'BOD','2010-02-10 06:50:00','AMS','2010-02-10 08:50:00')
Insert into Segments values (2,3,'AMS','2010-02-10 10:40:00','EBB','2010-02-10 20:40:00')
Insert into Segments values (2,4,'EBB','2010-02-28 22:55:00','AMS','2010-03-01 05:35:00')
Insert into Segments values (2,5,'AMS','2010-03-01 10:25:00','BOD','2010-03-01 12:15:00')
insert into Segments values (3,2,'BRU','2010-03-09 12:10:00','IAD','2010-03-09 14:46:00')
Insert into Segments Values  (3,3,'IAD','2010-03-13 17:57:00','BRU','2010-03-14 07:15:00')
insert into segments values (4,2,'BRU','2010-07-27','ADD','2010-07-28')
insert into segments values (4,4,'ADD','2010-07-28','LUN','2010-07-28')
insert into segments values (4,5,'LUN','2010-08-23','ADD','2010-08-23')
insert into segments values (4,6,'ADD','2010-08-23','BRU','2010-08-24')


Insert into Bookings values('5MVL7J')
Insert into Bookings values ('Y2IMXQ')
insert into bookings values ('YCBL5C')
Insert into bookings values ('X7THJ6')

I have created a SQL Fiddle with real data here: SQL Fiddle Example

I tried the following, but it does not seem to be correct.

 SELECT Locator, fd.*
FROM Bookings ob
OUTER APPLY
(
SELECT Top 1 DepartureCity, ArrivalCity
from
(
SELECT DISTINCT
    seg.segnum ,
    seg.DepartureCity ,
    seg.DepartDate ,
    seg.ArrivalCity ,
    seg.ArrivalDate,
(SELECT
DISTINCT
    DATEDIFF(MINUTE , seg.ArrivalDate , s2.DepartDate)
FROM Segments s2
WHERE s2.BookingID = seg.BookingID AND s2.segnum = seg.segnum + 1) 'LengthOfStay'
    FROM Bookings b(NOLOCK)
    INNER JOIN Segments seg (NOLOCK) ON seg.bookingid = b.bookingid
    WHERE b.Locator = ob.locator
  ) a
Order by a.lengthofstay desc
  )
FD

The results I expect are:

Locator   Origin   Destination
5MVL7J    BRU      FIH
Y2IMXQ    BOD      EBB
YCBL5C    BRU      IAD
X7THJ6    BRU      LUN

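I feel a CTE is the best approach, but my attempts so far have failed. Any help would be greatly appreciated.

I managed to get the following query working, but because of the TOP 1 it only works for one booking at a time, and I am not sure how to adapt it: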

WITH CTE AS 
(
    SELECT distinct s.DepartureCity, s.DepartDate, s.ArrivalCity, s.ArrivalDate, b.Locator , ROW_NUMBER() OVER (PARTITION BY b.Locator ORDER BY SegNum ASC) RN 
    FROM Segments s
    JOIN bookings b ON s.bookingid = b.BookingID
)
SELECT C.Locator, c.DepartureCity, a.ArrivalCity
FROM 
(
SELECT TOP 1 C.Locator, c.ArrivalCity, c1.DepartureCity, DATEDIFF(MINUTE,c.ArrivalDate, c1.DepartDate) 'ddiff'
FROM CTE c
JOIN cte c1 ON c1.Locator = C.Locator AND c1.rn = c.rn + 1
ORDER BY ddiff DESC
) a
JOIN CTE c ON C.Locator = a.Locator
WHERE c.rn = 1

3 Answers:

Answer 0: (score: 3)

You could try something like this:

;WITH CTE_Start AS 
(
    --Ordering of segments to eliminate gaps
    SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY SegNum) RN 
    FROM dbo.Segments  
)
, RCTE_Stay AS 
(
    --recursive CTE to calculate stay between segments
    SELECT *, 0 AS Stay FROM CTE_Start s WHERE RN = 1
    UNION ALL
    SELECT sNext.*, DATEDIFF(Mi, s.ArrivalDate, sNext.DepartDate) 
    FROM CTE_Start sNext
    INNER JOIN RCTE_Stay s ON s.RN + 1 = sNext.RN AND s.BookingID = sNext.BookingID
)
, CTE_Final AS
(
    --Search for max(stay) for each bookingID
    SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY Stay DESC) AS RN_Stay 
    FROM RCTE_Stay
)
--join Start and Final on RN=1 to find origin and destination
SELECT b.Locator, s.DepartureCity AS Origin, f.DepartureCity AS Destination
FROM CTE_Final f
INNER JOIN CTE_Start s ON f.BookingID = s.BookingID
INNER JOIN dbo.Bookings b ON b.BookingID = f.BookingID
WHERE s.RN = 1 AND f.RN_Stay = 1

SQLFiddle DEMO
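
The recursive step above only ever joins a row to its immediate successor, so the same layovers can probably be computed with a plain self-join on RN instead of recursion. The following is a minimal sketch, not the answerer's query: it assumes the question's Segments and Bookings tables, and the names Numbered, Stays, Ranked and StayCity are made up for illustration.

```sql
;WITH Numbered AS
(
    -- Renumber segments per booking so gaps in SegNum do not matter
    SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY SegNum) AS RN
    FROM dbo.Segments
)
, Stays AS
(
    -- Pair each segment with the next one; the layover is spent in cur.ArrivalCity
    SELECT cur.BookingID,
           cur.ArrivalCity AS StayCity,
           DATEDIFF(MINUTE, cur.ArrivalDate, nxt.DepartDate) AS Stay
    FROM Numbered cur
    INNER JOIN Numbered nxt
        ON nxt.BookingID = cur.BookingID AND nxt.RN = cur.RN + 1
)
, Ranked AS
(
    -- Longest layover per booking gets RN_Stay = 1
    SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY Stay DESC) AS RN_Stay
    FROM Stays
)
SELECT b.Locator, o.DepartureCity AS Origin, r.StayCity AS Destination
FROM Ranked r
INNER JOIN Numbered o ON o.BookingID = r.BookingID AND o.RN = 1
INNER JOIN dbo.Bookings b ON b.BookingID = r.BookingID
WHERE r.RN_Stay = 1
ORDER BY b.BookingID;
```

One difference to be aware of: a booking with a single segment has no layover, so the inner self-join returns no row for it, whereas the recursive version returns that booking with a stay of 0. Ties for the longest layover are broken arbitrarily by ROW_NUMBER in both versions.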

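Answer 1: (score: 3)

You can use OUTER APPLY with the TOP operator to find the segment with the next SegNum. Once the gaps between segments are known, use the MIN/MAX aggregate functions with an OVER clause as the conditions in CASE expressions: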

;WITH cte AS
 (
  SELECT seg.BookingID,
         CASE WHEN MIN(seg.segNum) OVER(PARTITION BY seg.BookingID) = seg.segNum 
              THEN seg.DepartureCity END AS Origin,
         CASE WHEN MAX(DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate)) OVER(PARTITION BY seg.BookingID) 
           = DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate)
              THEN o.DepartureCity END AS Destination
  FROM Segments seg (NOLOCK)
    OUTER APPLY (
                 SELECT TOP 1 seg2.DepartDate, seg2.DepartureCity
                 FROM Segments seg2
                 WHERE seg.BookingID = seg2.BookingID 
                   AND seg.SegNum < seg2.SegNum
                 ORDER BY seg2.SegNum ASC
                 ) o
  )
  SELECT b.Locator, MAX(c.Origin) AS Origin, MAX(c.Destination) AS Destination
  FROM cte c JOIN Bookings b ON c.BookingID = b.BookingID
  GROUP BY b.Locator

See the demo on SQLFiddle
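
To inspect the gaps that the CASE expressions compare, it may help to run the OUTER APPLY part on its own. This is only a sketch against the question's Segments table; the aliases NextDepartDate and GapMinutes are illustrative and do not appear in the answer.

```sql
SELECT seg.BookingID,
       seg.SegNum,
       seg.ArrivalCity,
       seg.ArrivalDate,
       o.DepartDate AS NextDepartDate,
       -- NULL for the last segment of a booking, because OUTER APPLY finds no later segment
       DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate) AS GapMinutes
FROM Segments seg
OUTER APPLY (
    -- The next segment of the same booking, by SegNum
    SELECT TOP 1 seg2.DepartDate, seg2.DepartureCity
    FROM Segments seg2
    WHERE seg2.BookingID = seg.BookingID
      AND seg2.SegNum > seg.SegNum
    ORDER BY seg2.SegNum ASC
) o
ORDER BY seg.BookingID, seg.SegNum;
```

The NULL gap on each booking's last segment is harmless in the full query, since MAX ignores NULL values.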

Answer 2: (score: 0)

The following statement:

;WITH DataSource AS
(

  SELECT ROW_NUMBER() OVER(PARTITION BY BookingID ORDER BY DATEDIFF(SS,DepartDate,ArrivalDate) DESC) AS Row
        ,Segments.BookingID
        ,Segments.SegNum
        ,Segments.DepartureCity
        ,Segments.DepartDate
        ,Segments.ArrivalCity
        ,Segments.ArrivalDate
        ,DATEDIFF(SS,DepartDate,ArrivalDate) AS DiffInSeconds
  FROM Segments
)
SELECT * 
FROM DataSource DS
INNER JOIN Bookings B
  ON DS.[BookingID] = B.[BookingID]

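will return one row per segment, with Row ranking the segments of each booking by DiffInSeconds, longest first. (The original answer showed a screenshot of the output here.)

So, adding the following clause to the statement above: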

WHERE Row = 1

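will give you the information you need.

A few important things:

  1. As the screenshot shows, there are two records with the same difference for the second booking. If you want to show both of them (or all of them, if there are more), use the RANK function instead of ROW_NUMBER.

  2. The return type of DATEDIFF is INT, so there is a limit on the largest difference that can be expressed in seconds. The documentation puts it as follows:

    If the return value is out of range for int (-2,147,483,648 to +2,147,483,647), an error is returned. For millisecond, the maximum difference between startdate and enddate is 24 days, 20 hours, 31 minutes and 23.647 seconds. For second, the maximum difference is 68 years.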
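
Both points can be combined in one variation of the statement. This is a hedged sketch rather than the answerer's code: it swaps ROW_NUMBER for RANK so tied segments are all kept, and measures in minutes instead of seconds so the INT limit is only reached after roughly 4,000 years. The names Rnk and DiffInMinutes are made up for illustration.

```sql
;WITH DataSource AS
(
    -- RANK gives tied durations the same rank, so every tied segment gets Rnk = 1
    SELECT RANK() OVER (PARTITION BY BookingID
                        ORDER BY DATEDIFF(MINUTE, DepartDate, ArrivalDate) DESC) AS Rnk
          ,BookingID
          ,SegNum
          ,DepartureCity
          ,ArrivalCity
          -- Minutes instead of seconds: the INT range covers a far longer span
          ,DATEDIFF(MINUTE, DepartDate, ArrivalDate) AS DiffInMinutes
    FROM Segments
)
SELECT B.Locator, DS.*
FROM DataSource DS
INNER JOIN Bookings B
  ON DS.BookingID = B.BookingID
WHERE DS.Rnk = 1;
```

Because DATEDIFF counts minute boundaries crossed, measuring in minutes can produce a few more ties than measuring in seconds, which RANK will then return as extra rows.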