Return a single row per sequence, where a sequence can run from 0 to n

Date: 2015-10-28 14:26:56

Tags: sql oracle11g

I have a requirement to group all the rows that belong to the same sequence. A sequence can be made up of dates in 2015:

instr_id    unit    typ         date         market     seq
Mht4o.jI01  26.55   ASKED       24-FEB-15   NYSE000000  0
Mht4o.jI01  26.55   ASKED       26-FEB-15   NYSE000000  1
Mht4o.jI01  26.55   ASKED       27-FEB-15   NYSE000000  2
Mht4o.jI01  26.3    BID         24-FEB-15   NYSE000000  0
Mht4o.jI01  26.3    BID         26-FEB-15   NYSE000000  1
Mht4o.jI01  26.55   ASKED       06-MAR-15   NYSE000000  0
Mht4o.jI01  26.55   ASKED       07-MAR-15   NYSE000000  1

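I want the SQL to return just 3 rows: the first three ASKED rows are part of the same sequence, so they should be merged into 1 row; then there should be 1 row for BID; and the last 2 rows are part of the same sequence, so they should be merged into 1 row. Also note that no rows are inserted on weekends.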

For the data above, the result should look like this:

instr_id    typ    start_date  end_date   max(seq)
Mht4o.jI01  ASKED  24-FEB-15   27-FEB-15  2
Mht4o.jI01  BID    24-FEB-15   26-FEB-15  1
Mht4o.jI01  ASKED  06-MAR-15   07-MAR-15  1

Is this possible?

2 answers:

Answer 0: (score: 2)

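Assuming your table is exactly as presented, with the sequence number already stored alongside each date (rather than being something we have to generate in the query), you can use the Tabibitosan method to work out the groups: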

with sample_data as (select to_date('30/07/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select to_date('31/07/2015', 'dd/mm/yyyy') dt, 1 seq from dual union all
                     select to_date('03/08/2015', 'dd/mm/yyyy') dt, 2 seq from dual union all
                     select to_date('04/08/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select to_date('05/08/2015', 'dd/mm/yyyy') dt, 1 seq from dual union all
                     select to_date('06/08/2015', 'dd/mm/yyyy') dt, 2 seq from dual union all
                     select to_date('07/08/2015', 'dd/mm/yyyy') dt, 3 seq from dual)
-- end of mimicking your example data in a table called "sample_data". See SQL below:
select min(dt) start_date,
       max(dt) end_date,
       max(seq) max_sequence
from   (select dt,
               seq,
               row_number() over (order by dt) - seq grp
        from   sample_data)
group by grp;

START_DATE END_DATE   MAX_SEQUENCE
---------- ---------- ------------
30/07/2015 03/08/2015            2
04/08/2015 07/08/2015            3
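
To see why the subtraction works, it can help to look at the intermediate values before aggregating. The following is only a sketch against the same mock data; the rn alias is introduced here for illustration. Within one run, row_number() and seq both increase by 1 per row, so their difference stays constant; when seq resets to 0, the difference jumps to a new value.

```sql
-- Sketch: show the per-row group key that the outer query groups on.
-- sample_data mimics the table, as in the query above; rn is an illustrative alias.
with sample_data as (select to_date('30/07/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select to_date('31/07/2015', 'dd/mm/yyyy') dt, 1 seq from dual union all
                     select to_date('03/08/2015', 'dd/mm/yyyy') dt, 2 seq from dual union all
                     select to_date('04/08/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select to_date('05/08/2015', 'dd/mm/yyyy') dt, 1 seq from dual union all
                     select to_date('06/08/2015', 'dd/mm/yyyy') dt, 2 seq from dual union all
                     select to_date('07/08/2015', 'dd/mm/yyyy') dt, 3 seq from dual)
select dt,
       seq,
       row_number() over (order by dt) rn,
       row_number() over (order by dt) - seq grp
from   sample_data
order by dt;
```

For this data the first three rows get grp = 1 (1-0, 2-1, 3-2) and the last four get grp = 4 (4-0, 5-1, 6-2, 7-3), which is exactly the split shown in the aggregated output.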

With your updated data, it is still fairly simple:

with sample_data as (select 'Mht4o.jI01' instr_id, 26.55 unit, 'ASKED' typ, to_date('24/02/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 0 seq from dual union all
                     select 'Mht4o.jI01' instr_id, 26.55 unit, 'ASKED' typ, to_date('26/02/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 1 seq from dual union all
                     select 'Mht4o.jI01' instr_id, 26.55 unit, 'ASKED' typ, to_date('27/02/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 2 seq from dual union all
                     select 'Mht4o.jI01' instr_id, 26.3 unit, 'BID' typ, to_date('24/02/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 0 seq from dual union all
                     select 'Mht4o.jI01' instr_id, 26.3 unit, 'BID' typ, to_date('26/02/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 1 seq from dual union all
                     select 'Mht4o.jI01' instr_id, 26.55 unit, 'ASKED' typ, to_date('06/03/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 0 seq from dual union all
                     select 'Mht4o.jI01' instr_id, 26.55 unit, 'ASKED' typ, to_date('07/03/2015', 'dd/mm/yyyy') dt, 'NYSE000000' market, 1 seq from dual)
select   instr_id,
         typ,
         min(dt) start_date,
         max(dt) end_date,
         max(seq)
from     (select instr_id,
                 typ,
                 dt,
                 seq,
                 row_number() over (partition by instr_id, typ order by dt) - seq grp
          from   sample_data)
group by instr_id,
         typ,
         grp
order by 1, 3, 2;

INSTR_ID   TYP   START_DATE END_DATE     MAX(SEQ)
---------- ----- ---------- ---------- ----------
Mht4o.jI01 ASKED 24/02/2015 27/02/2015          2
Mht4o.jI01 BID   24/02/2015 26/02/2015          1
Mht4o.jI01 ASKED 06/03/2015 07/03/2015          1

Answer 1: (score: 0)

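You can identify the sequences by using lag() to look at the previous value. When the sequence number changes unexpectedly, a new group starts. By taking a cumulative sum of this start flag, you can identify the groups. The rest is just aggregation: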

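-- dt and seq stand in for the question's date and seq columns (DATE is a reserved word in Oracle).
-- Partitioning by instr_id and typ keeps the ASKED and BID runs separate.
select instr_id, typ, min(dt) as startdate, max(dt) as enddate, max(seq) as max_seq
from (select s.*,
             sum(grpstart) over (partition by instr_id, typ order by dt) as grp
      from (select s.*,
                   (case when lag(seq) over (partition by instr_id, typ order by dt) = seq - 1
                         then 0 else 1
                    end) as grpstart
            from sequences s
           ) s
     ) s
group by instr_id, typ, grp;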
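
To check the flag logic, the intermediate columns can be inspected on the question's rows. This is a sketch under assumptions: sample_data is a mock of the table, dt and seq are assumed column names, and instr_id is left out because the sample contains only one instrument.

```sql
-- Sketch: show the start flag and its running total for each row.
with sample_data as (select 'ASKED' typ, to_date('24/02/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select 'ASKED' typ, to_date('26/02/2015', 'dd/mm/yyyy') dt, 1 seq from dual union all
                     select 'ASKED' typ, to_date('27/02/2015', 'dd/mm/yyyy') dt, 2 seq from dual union all
                     select 'BID'   typ, to_date('24/02/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select 'BID'   typ, to_date('26/02/2015', 'dd/mm/yyyy') dt, 1 seq from dual union all
                     select 'ASKED' typ, to_date('06/03/2015', 'dd/mm/yyyy') dt, 0 seq from dual union all
                     select 'ASKED' typ, to_date('07/03/2015', 'dd/mm/yyyy') dt, 1 seq from dual)
select typ,
       dt,
       seq,
       grpstart,
       sum(grpstart) over (partition by typ order by dt) as grp
from   (select sd.*,
               -- lag() is NULL on the first row of a partition, so the comparison
               -- is not true and the row is flagged as a group start.
               case when lag(seq) over (partition by typ order by dt) = seq - 1
                    then 0 else 1
               end as grpstart
        from   sample_data sd)
order by typ, dt;
```

For the ASKED rows the flags come out as 1, 0, 0, 1, 0, giving grp values 1, 1, 1, 2, 2; for the BID rows the flags are 1, 0, giving grp 1, 1. Grouping by typ and grp therefore yields the three expected rows.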

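Note: this assumes the date column is actually stored as a date, or in a sensible format such as YYYY-MM-DD. If it is not, you will need to convert it to a sensible value with to_date(). Alternatively, you may have some sort of ID or creation-date column that can serve the same purpose.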
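
If the dates turn out to be stored as text in the form shown in the question (for example '24-FEB-15'), the conversion might look like the sketch below. The column name date_str is hypothetical, and the format mask is an assumption based on the sample rows; adjust both to match the real table.

```sql
-- Sketch: convert a text date such as '24-FEB-15' into a real DATE.
-- date_str is a hypothetical column name; the inline view stands in for the table.
-- The third argument pins the month-name language so 'FEB' parses regardless of session settings.
select to_date(date_str, 'DD-MON-RR', 'NLS_DATE_LANGUAGE=ENGLISH') as dt
from   (select '24-FEB-15' as date_str from dual);
```

The converted value can then be used in the order by of the window functions, so that rows sort chronologically rather than alphabetically.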