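I have a table that stores the status of each process_id along with a created timestamp. Whenever a process changes status, a new row is inserted, so a process_id has as many rows as it has had statuses.
I want to use this data to build another table/view with one row per process_id, showing its current status and its previous status. I need to create an Informatica job for this, but a SQL query would be just as useful.
Sample input: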
Process_id | Status | Created
1 | In_queue | 2014-08-01 00:01:01
1 | Started | 2014-08-01 01:03:01
1 | In_process | 2014-08-01 01:50:20
1 | Complete | 2014-08-01 03:10:20
Sample Output:
Process_id | Previous_status | Current_status | Updated
1 | In_process | Complete | 2014-08-01 03:10:20
Answer 0 (score: 1)
SELECT Process_id, Previous_status, Current_status, Updated
FROM (
    SELECT
        Process_id,
        Status AS Current_status,
        Created AS Updated,
        @prev_state AS Previous_status,
        @prev_state := Status
    FROM
        your_table t
        , (select @prev_state := null) var_init
    WHERE Process_id = 1
    ORDER BY Created
) sq
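For readers unfamiliar with MySQL user variables, the query above amounts to a single pass over the rows in Created order that remembers the last status it saw. Here is a minimal Python sketch of that logic. The function name `with_previous_status` and the `(process_id, status, created)` tuple layout are illustrative assumptions, not part of the question:

```python
from typing import List, Optional, Tuple

# Assumed row layout: (process_id, status, created), with ISO-style timestamps
# so that string ordering matches chronological ordering.
Row = Tuple[int, str, str]

SAMPLE_ROWS: List[Row] = [
    (1, "In_queue", "2014-08-01 00:01:01"),
    (1, "Started", "2014-08-01 01:03:01"),
    (1, "In_process", "2014-08-01 01:50:20"),
    (1, "Complete", "2014-08-01 03:10:20"),
]


def with_previous_status(
    rows: List[Row], process_id: int
) -> List[Tuple[int, Optional[str], str, str]]:
    """Pair every status of one process with the status that preceded it."""
    # Equivalent of WHERE Process_id = ... ORDER BY Created
    ordered = sorted((r for r in rows if r[0] == process_id), key=lambda r: r[2])
    prev_state: Optional[str] = None  # plays the role of @prev_state
    result = []
    for pid, status, created in ordered:
        # Read the remembered value first, then overwrite it with this row's status.
        result.append((pid, prev_state, status, created))
        prev_state = status
    return result


if __name__ == "__main__":
    for row in with_previous_status(SAMPLE_ROWS, 1):
        print(row)
```

The last tuple returned is the row the question asks for. The earlier tuples are the intermediate transitions, which the query above also returns.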
Update:
To do this for every Process_id and keep only the latest record of each one, you can use this:
SELECT sq.Process_id, sq.Previous_status, sq.Current_status, sq.Updated
FROM (
    SELECT
        Process_id,
        Status AS Current_status,
        Created AS Updated,
        @prev_state := if(@prev_process != Process_id, null, @prev_state),
        @prev_state AS Previous_status,
        @prev_state := Status,
        @prev_process := Process_id
    FROM
        your_table t
        , (select @prev_state := null, @prev_process := null) var_init
    ORDER BY Process_id, Created
) sq
INNER JOIN (
    SELECT Process_id, MAX(Created) AS max_created
    FROM your_table
    GROUP BY Process_id
) max_c
ON sq.Process_id = max_c.Process_id AND sq.Updated = max_c.max_created
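As an alternative to user variables (MySQL 8.0 documents assigning them inside expressions as deprecated), engines with window functions can get the same result with LAG and ROW_NUMBER. This is a swapped-in technique, not what the query above does. The sketch below runs it from Python against an in-memory SQLite database only so that it is easy to try. It assumes SQLite 3.25 or newer, and the row for process 2 is made-up test data:

```python
import sqlite3

ROWS = [
    (1, "In_queue", "2014-08-01 00:01:01"),
    (1, "Started", "2014-08-01 01:03:01"),
    (1, "In_process", "2014-08-01 01:50:20"),
    (1, "Complete", "2014-08-01 03:10:20"),
    (2, "In_queue", "2014-08-02 09:00:00"),  # hypothetical process with one status
]

# LAG fetches the status of the preceding row within the same process;
# ROW_NUMBER over the reversed order marks the newest row of each process as 1.
QUERY = """
SELECT Process_id, Previous_status, Current_status, Updated
FROM (
    SELECT
        Process_id,
        LAG(Status) OVER (PARTITION BY Process_id ORDER BY Created) AS Previous_status,
        Status AS Current_status,
        Created AS Updated,
        ROW_NUMBER() OVER (PARTITION BY Process_id ORDER BY Created DESC) AS rn
    FROM your_table
) ranked
WHERE rn = 1
ORDER BY Process_id
"""


def latest_transitions(rows):
    """Load rows into a throwaway table and return one row per process."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE your_table (Process_id INTEGER, Status TEXT, Created TEXT)"
        )
        conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_transitions(ROWS):
        print(row)
```

A process that has only ever had one status comes back with NULL (None in Python) as its previous status instead of being dropped.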
Answer 1 (score: 0)
Performance aside, this is how I would do it...
SELECT a.process_id
     , a.status
     , a.created
     , b.status prev_status
     , b.created prev_created
  FROM
     ( -- rnk = 1 is the latest row of a process, 2 the one before it, and so on
       SELECT x.*
            , COUNT(*) rnk  -- "rank" is a reserved word in MySQL 8.0
         FROM my_table x
         JOIN my_table y
           ON y.process_id = x.process_id
          AND y.created >= x.created
        GROUP
           BY x.process_id
            , x.created
     ) a
  LEFT
  JOIN
     ( SELECT x.*
            , COUNT(*) rnk
         FROM my_table x
         JOIN my_table y
           ON y.process_id = x.process_id
          AND y.created >= x.created
        GROUP
           BY x.process_id
            , x.created
     ) b
    ON b.process_id = a.process_id
   AND b.rnk = a.rnk + 1  -- kept in ON so a process with a single row survives the LEFT JOIN
 WHERE a.rnk = 1;
On a larger data set, I would probably go with fancyPants's solution.
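To see how the ranking self-join in this answer behaves, here is a small sketch that runs it against an in-memory SQLite database from Python. It assumes Created is unique within each process, since ties would give two rows the same rank. It names the counter `rnk` and compares ranks in the ON clause so that the LEFT JOIN keeps processes with a single row. The row for process 2 is made-up test data:

```python
import sqlite3

ROWS = [
    (1, "In_queue", "2014-08-01 00:01:01"),
    (1, "Started", "2014-08-01 01:03:01"),
    (1, "In_process", "2014-08-01 01:50:20"),
    (1, "Complete", "2014-08-01 03:10:20"),
    (2, "In_queue", "2014-08-02 09:00:00"),  # hypothetical process with one status
]

# For each row, count the rows of the same process that are at least as recent.
# The newest row therefore gets rnk = 1, the one before it rnk = 2, and so on.
RANKED = """
    SELECT x.process_id, x.status, x.created, COUNT(*) AS rnk
    FROM my_table x
    JOIN my_table y
      ON y.process_id = x.process_id
     AND y.created >= x.created
    GROUP BY x.process_id, x.status, x.created
"""

QUERY = f"""
SELECT a.process_id, a.status, a.created,
       b.status AS prev_status, b.created AS prev_created
FROM ({RANKED}) a
LEFT JOIN ({RANKED}) b
  ON b.process_id = a.process_id
 AND b.rnk = a.rnk + 1
WHERE a.rnk = 1
ORDER BY a.process_id
"""


def current_and_previous(rows):
    """Return the latest row of each process joined to the row before it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE my_table (process_id INTEGER, status TEXT, created TEXT)"
        )
        conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in current_and_previous(ROWS):
        print(row)
```

The self-join compares every row with every other row of the same process, so the work grows quadratically with the number of rows per process. That is the performance caveat mentioned at the top of this answer.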