SQL: given an initial matching row, how do I find the next row in the table that matches a condition?

Time: 2019-05-03 21:30:17

Tags: sql

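I am querying a table that contains the state transitions of a state engine. The table is set up so that it has the prev_state, current_state, and timestamp of each transition, grouped by a unique id.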

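My goal is to find a sequence of target intervals. Each interval is defined by the timestamp of an initial state transition (for example, the timestamp of a 1 -> 2 transition) and the timestamp of the next state transition that matches a specific condition (for example, the next timestamp where current_state = 3 or current_state = 4).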

state_transition_table
+------------+---------------+-----------+----+
| prev_state | current_state | timestamp | id |
+------------+---------------+-----------+----+
|          1 |             2 |       4.5 |  1 |
|          2 |             3 |       5.2 |  1 |
|          3 |             1 |       5.4 |  1 |
|          1 |             2 |      10.3 |  1 |
|          2 |             5 |      10.4 |  1 |
|          5 |             4 |      10.8 |  1 |
|          4 |             1 |      11.0 |  1 |
|          1 |             2 |      12.3 |  1 |
|          2 |             3 |      13.5 |  1 |
|          3 |             1 |      13.6 |  1 |
+------------+---------------+-----------+----+

Within a given id, we want to find all intervals that start with 1 -> 2 (an easy enough query) and end in state 3 or 4: 1 -> 2 -> anything -> 3 or 4.

For the input given above, the example output table would contain the three states and the timestamps of the transitions between them:

target output
+------------+---------------+------------+-----------+-----------+
| prev_state | current_state | end_state  | curr_time | end_time  |
+------------+---------------+------------+-----------+-----------+
|          1 |             2 |          3 |       4.5 |       5.2 |
|          1 |             2 |          4 |      10.3 |      10.8 |
|          1 |             2 |          3 |      12.3 |      13.5 |
+------------+---------------+------------+-----------+-----------+

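The best query I could come up with uses window functions in a CTE and then builds the new columns from that. However, this solution only finds the row immediately after the initial transition; it does not allow other states to occur between the initial transition and the point where the target state is reached.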

WITH state_transitions AS (
  SELECT
    id,
    prev_state,
    current_state,
    LEAD(current_state) OVER (PARTITION BY id ORDER BY timestamp) AS end_state,
    timestamp AS curr_time,
    LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp) AS end_time
  FROM
    state_transition_table
)

SELECT
  prev_state,
  current_state,
  end_state,
  curr_time,
  end_time
FROM state_transitions
WHERE prev_state = 1 AND current_state = 2
ORDER BY curr_time

This query incorrectly gives end_state = 5 for the second output row, which is not what I want.
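To make the problem concrete, here is a minimal sketch that loads the sample rows into an in-memory SQLite database from Python and runs the LEAD-based query. SQLite, Python, and the names `rows`, `lead_query`, and `lead_result` are assumptions made only for illustration (the question does not name a database). Window functions require SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question: (prev_state, current_state, timestamp, id).
rows = [
    (1, 2, 4.5, 1), (2, 3, 5.2, 1), (3, 1, 5.4, 1),
    (1, 2, 10.3, 1), (2, 5, 10.4, 1), (5, 4, 10.8, 1), (4, 1, 11.0, 1),
    (1, 2, 12.3, 1), (2, 3, 13.5, 1), (3, 1, 13.6, 1),
]

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE state_transition_table "
    "(prev_state INTEGER, current_state INTEGER, timestamp REAL, id INTEGER)"
)
conn.executemany("INSERT INTO state_transition_table VALUES (?, ?, ?, ?)", rows)

# LEAD() only ever looks at the row directly after the current one.
lead_query = """
WITH state_transitions AS (
  SELECT id, prev_state, current_state,
         LEAD(current_state) OVER (PARTITION BY id ORDER BY timestamp) AS end_state,
         timestamp AS curr_time,
         LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp) AS end_time
  FROM state_transition_table
)
SELECT prev_state, current_state, end_state, curr_time, end_time
FROM state_transitions
WHERE prev_state = 1 AND current_state = 2
ORDER BY curr_time
"""
lead_result = conn.execute(lead_query).fetchall()
for row in lead_result:
    print(row)
```

The second row comes back with end_state 5 and end_time 10.4, because the row right after the 1 -> 2 transition at 10.3 is the 2 -> 5 transition, not the later 5 -> 4 one.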

How can I search the table for the next row that matches my target condition, e.g. end_state = 3 OR end_state = 4?

1 answer:

Answer 0 (score: 1)

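This takes a recursive query that checks each row against its sibling (the next row in timestamp order). The query has to account for paths of more than three rows. I seeded the data in Oracle, but you should just use your own table. I have tried to document the query as well as I can.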

with /*SEED DATA*/
  state_transition_table(prev_state, current_state, timestamp, id) as (
              SELECT          1 ,             2 ,       4.5 ,  1 FROM DUAL
    UNION ALL SELECT          2 ,             3 ,       5.2 ,  1 FROM DUAL
    UNION ALL SELECT          3 ,             1 ,       5.4 ,  1 FROM DUAL
    UNION ALL SELECT          1 ,             2 ,      10.3 ,  1 FROM DUAL
    UNION ALL SELECT          2 ,             5 ,      10.4 ,  1 FROM DUAL
    UNION ALL SELECT          5 ,             4 ,      10.8 ,  1 FROM DUAL
    UNION ALL SELECT          4 ,             1 ,      11.0 ,  1 FROM DUAL
    UNION ALL SELECT          1 ,             2 ,      12.3 ,  1 FROM DUAL
    UNION ALL SELECT          2 ,             3 ,      13.5 ,  1 FROM DUAL
    UNION ALL SELECT          3 ,             1 ,      13.6 ,  1 FROM DUAL
)

/*THE END STATES YOU ARE LOOKING FOR*/
, end_states (a_state) as (
              select 3 from dual
    union all select 4 from dual
)

/*ORDER THE STEPS TO USE THE order_id COLUMN TO EVALUATE THE NEXT NODE*/
/*ROW_NUMBER() IS USED BECAUSE ROWNUM IS ASSIGNED BEFORE ORDER BY IS APPLIED*/
, ordered_states as (
    SELECT row_number() over (order by timestamp) order_id
         , prev_state
         , current_state
         , timestamp
         , id
    FROM   state_transition_table
)

/*RECURSIVE QUERY WITH ANSI SYNTAX*/
, recursive (
           root_order_id
         , order_id
         , timestamp
         , prev_state
         , current_state
         --, id

         , steps
  )
as (
    SELECT order_id root_order_id /*THE order_id OF EACH ROOT ROW*/
         , order_id
         , timestamp
         , prev_state
         , current_state

         , to_char(order_id) as steps /*INITIAL VALIDATION PATH*/
    FROM   ordered_states
    WHERE  prev_state = 1 AND current_state = 2 /*INITIAL CONDITION*/

    UNION ALL
    SELECT prev.root_order_id
         , this.order_id
         , this.timestamp
         , prev.prev_state
         , this.current_state

         , prev.steps || ', ' || to_char(this.order_id)
    FROM   recursive prev /*ANSI PSEUDO TABLE*/
         , ordered_states this /*THE SIBLING ROW TO CHECK*/

    WHERE prev.order_id = this.order_id - 1 /*ROW TO PREVIOUS ROW JOIN*/
      and prev.current_state not in (select a_state from end_states) /*THE PREVIOUS ROW STATE IS NOT AN END STATE */
)

select init_state.prev_state
     , init_state.current_state as mid_state /*this name is better, I think*/
     , end_state.current_state
     , init_state.timestamp as initial_time /*initial_time is better, I think*/
     , end_state.timestamp as end_time /*end_time is better, I think*/
     , recursive.steps as validation_path_by_order_id
from   recursive
inner join ordered_states init_state
    on init_state.order_id = recursive.root_order_id
inner join ordered_states end_state
    on end_state.order_id = recursive.order_id
where  recursive.current_state in (select a_state from end_states)

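Final notes. The result columns cover only three states (prev_state, mid_state, and current_state). As I said above, in some cases the path from (1) to (2) to (3 or 4) can span more than three rows, say 1 to 2 to 5 to 2 to 3, so mid_state is really just one of the states in the middle.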

Last note: your desired result table was wrong, but you have since corrected it.
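For readers who are not on Oracle, the same recursive walk can be sketched in portable form. The block below is an illustration under stated assumptions, not the query from this answer: it uses SQLite (3.25 or later) through Python's `sqlite3` module, hard-codes the end states 3 and 4 instead of using the end_states CTE, and also matches rows on id so that a walk never crosses from one id into another. The names `walk`, `start_row`, `end_row`, `recursive_query`, and `intervals` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (prev_state, current_state, timestamp, id).
rows = [
    (1, 2, 4.5, 1), (2, 3, 5.2, 1), (3, 1, 5.4, 1),
    (1, 2, 10.3, 1), (2, 5, 10.4, 1), (5, 4, 10.8, 1), (4, 1, 11.0, 1),
    (1, 2, 12.3, 1), (2, 3, 13.5, 1), (3, 1, 13.6, 1),
]

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE state_transition_table "
    "(prev_state INTEGER, current_state INTEGER, timestamp REAL, id INTEGER)"
)
conn.executemany("INSERT INTO state_transition_table VALUES (?, ?, ?, ?)", rows)

recursive_query = """
WITH RECURSIVE
ordered_states AS (
  -- Number the rows of each id in timestamp order.
  SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY timestamp) AS order_id,
         prev_state, current_state, timestamp, id
  FROM state_transition_table
),
walk (id, root_order_id, order_id, current_state) AS (
  -- Anchor: every 1 -> 2 transition starts an interval.
  SELECT id, order_id, order_id, current_state
  FROM ordered_states
  WHERE prev_state = 1 AND current_state = 2
  UNION ALL
  -- Step: move to the next row unless an end state was already reached.
  SELECT nxt.id, walk.root_order_id, nxt.order_id, nxt.current_state
  FROM walk
  JOIN ordered_states AS nxt
    ON nxt.id = walk.id AND nxt.order_id = walk.order_id + 1
  WHERE walk.current_state NOT IN (3, 4)
)
SELECT start_row.prev_state,
       start_row.current_state AS mid_state,
       end_row.current_state   AS end_state,
       start_row.timestamp     AS initial_time,
       end_row.timestamp       AS end_time
FROM walk
JOIN ordered_states AS start_row
  ON start_row.id = walk.id AND start_row.order_id = walk.root_order_id
JOIN ordered_states AS end_row
  ON end_row.id = walk.id AND end_row.order_id = walk.order_id
WHERE walk.current_state IN (3, 4)  -- keep only walks that reached an end state
ORDER BY initial_time
"""
intervals = conn.execute(recursive_query).fetchall()
for row in intervals:
    print(row)
```

On the sample data this returns the three intervals from the target output table. The interval that starts at 10.3 ends at 10.8 with end_state 4, because the walk continues past the intermediate state 5 instead of stopping at the next row.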