当偏移量可变时,如何获取“下一个”事件?

时间:2014-07-31 16:02:40

标签: sql oracle analytic-functions

我在Oracle数据库中有一个事务表。我正在尝试为包含多种交易类型的交付系统一起提交报告。 “请求”类型实际上可以是四个子类型之一(本例中为“A”,“B”,“C”和“D”),“交付”类型可以是四个不同子类型中的一个类型('PULL','PICKUP','MAIL')。可以有1到5个交易从“请求”到“交付”获取项目,并且许多“交付”类型也是中间交易。例如:

Item | Transaction | Timestamp
001  | REQ-A       | 2014-07-31T09:51:32Z
002  | REQ-B       | 2014-07-31T09:55:53Z
003  | REQ-C       | 2014-07-31T10:01:15Z
004  | REQ-D       | 2014-07-31T10:02:29Z
005  | REQ-A       | 2014-07-31T10:05:47Z
002  | PULL        | 2014-07-31T10:20:04Z
002  | MAIL        | 2014-07-31T10:20:06Z
001  | PULL        | 2014-07-31T10:22:21Z
001  | TRANSFER    | 2014-07-31T10:22:23Z
003  | PULL        | 2014-07-31T10:24:10Z
003  | TRANSFER    | 2014-07-31T10:24:12Z
004  | PULL        | 2014-07-31T10:26:28Z
005  | PULL        | 2014-07-31T10:28:42Z
005  | TRANSFER    | 2014-07-31T10:28:44Z
001  | ARRIVE      | 2014-07-31T11:45:01Z
001  | PICKUP      | 2014-07-31T11:45:02Z
003  | ARRIVE      | 2014-07-31T11:47:44Z
003  | PICKUP      | 2014-07-31T11:47:45Z
005  | ARRIVE      | 2014-07-31T11:49:45Z
005  | PICKUP      | 2014-07-31T11:49:46Z

我需要的是一份报告:

Item | Start Tx | End Tx | Time
001  | REQ-A    | PICKUP | 1:53:30
002  | REQ-B    | MAIL   | 0:24:13
003  | REQ-C    | PICKUP | 1:46:30
004  | REQ-D    | PULL   | 0:23:59
005  | REQ-A    | PICKUP | 1:43:59

我有什么:

Item | Start Tx | End Tx   | Time
001  | REQ-A    | PULL     | 0:30:49
001  | REQ-A    | TRANSFER | 0:30:51
001  | REQ-A    | ARRIVE   | 1:53:29
001  | REQ-A    | PICKUP   | 1:53:30
002  | REQ-B    | PULL     | 0:24:11
002  | REQ-B    | MAIL     | 0:24:13
003  | REQ-C    | PULL     | 0:22:55
003  | REQ-C    | TRANSFER | 0:22:57
003  | REQ-C    | ARRIVE   | 1:46:29
003  | REQ-C    | PICKUP   | 1:46:30
004  | REQ-D    | PULL     | 0:23:59
005  | REQ-A    | PULL     | 0:22:55
005  | REQ-A    | TRANSFER | 0:22:57
005  | REQ-A    | ARRIVE   | 1:43:58
005  | REQ-A    | PICKUP   | 1:43:59

我正在做什么来获取这些数据:

SELECT Item, Transaction, nextTransaction, nextTimestamp - Timestamp
FROM (
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 5) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 5) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 4) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 4) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 3) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 3) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 2) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 2) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 1) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 1) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
)
WHERE nextTransaction IS NOT NULL
AND Transaction IN ('REQ-A', 'REQ-B', 'REQ-C', 'REQ-D')

我可以在脚本中手动解析这个(也许这实际上是最好的行动方案),但为了学习,我想知道是否可以单独使用SQL来实现这一点。

2 个答案:

答案 0 :(得分:4)

听起来你想要基于时间的第一笔和最后一笔交易。我认为以下是您想要的:

select item,
       min(transaction) keep (dense_rank first order by timestamp) as StartTx, 
       min(transaction) keep (dense_rank last order by timestamp) as EndTx,
       max(timestamp) - min(timestamp)
from transactions t
group by item;

答案 1 :(得分:1)

您可以使用first_value分析函数代替lead

select item, start_tran, end_tran, end_time - start_time
from (
  select item,
    first_value(transaction) over (partition by item
      order by timestamp) as start_tran,
    first_value(timestamp) over (partition by item
      order by timestamp) as start_time,
    first_value(transaction) over (partition by item
       order by timestamp desc) as end_tran,
    first_value(timestamp) over (partition by item
       order by timestamp desc) as end_time,
    row_number() over (partition by item
       order by timestamp) as rn
  from transactions
)
where rn = 1
order by item;

      ITEM START_TRAN END_TRAN   END_TIME-START_TIME
---------- ---------- ---------- -------------------
         1 REQ-A      PICKUP     0 1:53:30.0         
         2 REQ-B      MAIL       0 0:24:13.0         
         3 REQ-C      PICKUP     0 1:46:30.0         
         4 REQ-D      PULL       0 0:23:59.0         
         5 REQ-A      PICKUP     0 1:43:59.0         

row_number代替distinct。内部查询为原始表中的每一行生成一行,只有item和分析函数结果,因此每个item都是相同的;第1项有五个相同的行,每行显示第一个和最后一个transaction以及相应的timestamp。外部查询主要折叠那些,但也会使用时间戳减法来获取经过的时间间隔。

SQL Fiddle没有很好地显示间隔,但如果您想以不同方式呈现,则可以从中提取小时/分钟/秒值。或者,如果列实际上是日期,那么您可以just use to_char() of course