Partition by consecutive values

Time: 2013-09-05 14:53:27

Tags: sql oracle10g

I have the following data set:

Material  Operation  Txn_Date
M1004        100           8/25/2013 8:22:05 PM 
M1004        100           8/25/2013 8:34:37 PM
M1004        100           8/29/2013 9:03:01 PM
M1004        600           8/29/2013 11:48:01 PM
M1004        600           8/30/2013 7:48:34 AM
M1004        600           8/30/2013 8:32:00 AM
M1004        500           8/30/2013 9:38:35 AM
M1004        500           8/30/2013 9:54:52 AM
M1004        500           8/30/2013 10:07:35 AM
M1004        500           8/30/2013 11:53:28 AM
M1004        500           9/2/2013 2:30:56 AM
M1004        200           9/2/2013 2:59:20 AM
M1004        900           9/2/2013 3:30:11 AM
M1004        600           9/3/2013 10:35:01 AM
M1004        600           9/3/2013 10:48:24 AM
M1004        600           9/3/2013 11:17:00 AM

I am trying to get a value equal to the first txn_date within each set of operations. I am using:

SELECT 
  Material, 
  Operation, 
  Txn_Date, 
  MIN(Txn_Date) OVER(PARTITION BY Material, Operation) AS first_txn_date
FROM oper_tab         
WHERE Material = 'M1004'
ORDER BY Txn_Date

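The problem is that when an operation occurs in more than one set (such as operation 600), my partition picks up the first txn_date from the first set of that operation. I think I need to do something to this table to identify runs of consecutive operation numbers, so that I can add that to my partition.

What can I do in the SELECT to express this level of partitioning?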

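1 answer:

Answer 0: (score: 4)

You need to group the rows by set of operations.

One way is to use a running total that increments each time a new "set" is reached, like this: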

SQL> SELECT mat, op, dt,
  2         SUM(change_set) over (PARTITION BY mat ORDER BY dt) set_group
  3    FROM (SELECT mat, op, dt,
  4                 CASE WHEN op != lag(op) over (PARTITION BY mat
  5                                                   ORDER BY dt)
  6                      THEN 1
  7                 END change_set
  8            FROM DATA);

MAT           OP DT           SET_GROUP
----- ---------- ----------- ----------
M1004        100 25/08/2013  
M1004        100 25/08/2013  
M1004        100 29/08/2013  
M1004        600 29/08/2013           1
M1004        600 30/08/2013           1
M1004        600 30/08/2013           1
M1004        500 30/08/2013           2
M1004        500 30/08/2013           2
M1004        500 30/08/2013           2
M1004        500 30/08/2013           2
M1004        500 02/09/2013           2
M1004        200 02/09/2013           3
M1004        900 02/09/2013           4
M1004        600 03/09/2013           5
M1004        600 03/09/2013           5
M1004        600 03/09/2013           5
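
In the output above, the first set has an empty (NULL) SET_GROUP because the CASE expression has no ELSE branch and LAG returns NULL on the first row of each material. The following is a minimal sketch of a variant that adds ELSE 0, so that the first set is numbered 0 instead of NULL. The inline `data` rows are a hypothetical subset of the question's rows built from DUAL so that the statement is self-contained; they are not the answerer's actual test table.

```sql
-- Hypothetical inline sample: a few rows taken from the question, built from DUAL
WITH data AS (
  SELECT 'M1004' mat, 100 op,
         TO_DATE('25/08/2013 20:22:05', 'DD/MM/YYYY HH24:MI:SS') dt FROM dual UNION ALL
  SELECT 'M1004', 100,
         TO_DATE('25/08/2013 20:34:37', 'DD/MM/YYYY HH24:MI:SS') FROM dual UNION ALL
  SELECT 'M1004', 600,
         TO_DATE('29/08/2013 23:48:01', 'DD/MM/YYYY HH24:MI:SS') FROM dual UNION ALL
  SELECT 'M1004', 500,
         TO_DATE('30/08/2013 09:38:35', 'DD/MM/YYYY HH24:MI:SS') FROM dual UNION ALL
  SELECT 'M1004', 600,
         TO_DATE('03/09/2013 10:35:01', 'DD/MM/YYYY HH24:MI:SS') FROM dual
)
SELECT mat, op, dt,
       -- running total of the change flags: one number per consecutive run
       SUM(change_set) OVER (PARTITION BY mat ORDER BY dt) set_group
  FROM (SELECT mat, op, dt,
               -- 1 when the operation differs from the previous row, else 0;
               -- on the first row LAG is NULL, the comparison is unknown,
               -- and the ELSE branch yields 0
               CASE WHEN op != LAG(op) OVER (PARTITION BY mat ORDER BY dt)
                    THEN 1
                    ELSE 0
               END change_set
          FROM data)
 ORDER BY mat, dt;
```

With these five rows, set_group should come out as 0, 0, 1, 2, 3, so the second run of operation 600 gets a different number from the first. Note that if two rows of the same material share the same dt, the order seen by LAG is not deterministic, so a tie-breaking column in the ORDER BY may be needed.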

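You can then partition the analytic function by the new column, which should give you what you need (the first set has a NULL set_group, but NULLs still form a partition of their own):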

SQL> SELECT mat, op, dt, MIN(dt) over (PARTITION BY mat, set_group) first_txn
  2    FROM (SELECT mat, op, dt,
  3                 SUM(change_set) over (PARTITION BY mat ORDER BY dt) set_group
  4            FROM (SELECT mat, op, dt,
  5                         CASE WHEN op != lag(op) over (PARTITION BY mat
  6                                                           ORDER BY dt)
  7                              THEN 1
  8                         END change_set
  9                    FROM DATA));

MAT           OP DT                   FIRST_TXN
----- ---------- -------------------- --------------------
M1004        600 29/08/2013 23:48:01  29/08/2013 23:48:01
M1004        600 30/08/2013 07:48:34  29/08/2013 23:48:01
M1004        600 30/08/2013 08:32:00  29/08/2013 23:48:01
M1004        500 30/08/2013 09:38:35  30/08/2013 09:38:35
M1004        500 30/08/2013 09:54:52  30/08/2013 09:38:35
M1004        500 30/08/2013 11:53:28  30/08/2013 09:38:35
M1004        500 02/09/2013 02:30:56  30/08/2013 09:38:35
M1004        500 30/08/2013 10:07:35  30/08/2013 09:38:35
M1004        200 02/09/2013 02:59:20  02/09/2013 02:59:20
M1004        900 02/09/2013 03:30:11  02/09/2013 03:30:11
M1004        600 03/09/2013 10:35:01  03/09/2013 10:35:01
M1004        600 03/09/2013 10:48:24  03/09/2013 10:35:01
M1004        600 03/09/2013 11:17:00  03/09/2013 10:35:01
M1004        100 25/08/2013 20:22:05  25/08/2013 20:22:05
M1004        100 25/08/2013 20:34:37  25/08/2013 20:22:05
M1004        100 29/08/2013 21:03:01  25/08/2013 20:22:05
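
The rows above come back in no particular order because the query has no ORDER BY. As a sketch of how the same approach might look against the table and column names from the question (`oper_tab`, `Material`, `Operation`, `Txn_Date`, `first_txn_date`), assuming Txn_Date is a DATE column and is unique per material:

```sql
SELECT Material, Operation, Txn_Date,
       -- earliest transaction date within each consecutive run of an operation
       MIN(Txn_Date) OVER (PARTITION BY Material, set_group) AS first_txn_date
  FROM (SELECT Material, Operation, Txn_Date,
               -- running total of change flags identifies each run
               SUM(change_set) OVER (PARTITION BY Material
                                         ORDER BY Txn_Date) AS set_group
          FROM (SELECT Material, Operation, Txn_Date,
                       -- flag rows where the operation differs from the previous row
                       CASE WHEN Operation != LAG(Operation)
                                                OVER (PARTITION BY Material
                                                          ORDER BY Txn_Date)
                            THEN 1
                            ELSE 0
                       END AS change_set
                  FROM oper_tab
                 WHERE Material = 'M1004'))
 ORDER BY Txn_Date;
```

The ELSE 0 and the final ORDER BY are additions for readability of the result; they are not part of the original answer's query.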