How do I count consecutive duplicates in a table?

Asked: 2016-05-30 09:05:59

Tags: sql oracle

I have the following problem: I want to find consecutive duplicates in this table.

SLNO   NAME     PG   
1       A1      NO                   
2       A2      YES              
3       A3      NO           
4       A4      YES          
6       A5      YES          
7       A6      YES          
8       A7      YES      
9       A8      YES  
10      A9      YES
11      A10     NO 
12      A11     YES 
13      A12     NO 
14      A14     NO

Only the values in the PG column matter. I need the output to be 6, which is the longest run of consecutive repeated values.

6 answers:

Answer 0 (score: 4)

This can be done with the Tabibitosan method. Run this to see how it works:

with a as(
select 1 slno, 'A' pg from dual union all
select 2 slno, 'A' pg from dual union all
select 3 slno, 'B' pg from dual union all
select 4 slno, 'A' pg from dual union all
select 5 slno, 'A' pg from dual union all
select 6 slno, 'A' pg from dual 
)
select slno, pg, newgrp, sum(newgrp) over (order by slno) grp
from( 
    select slno, 
           pg, 
           case when pg <> nvl(lag(pg) over (order by slno),1) then 1 else 0 end newgrp
    from a
    );

Newgrp = 1 means that a new group starts on that row.

Result:

SLNO PG NEWGRP GRP
1    A  1      1
2    A  0      1
3    B  1      2
4    A  1      3
5    A  0      3
6    A  0      3

Now simply use a GROUP BY with COUNT on grp to find the group with the most rows.
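
The final step (grouping by grp and counting) is described in this answer but not shown. A minimal sketch of how it might look, reusing the same six sample rows: the CTE names `flagged` and `grouped` and the aliases `cnt` and `max_consecutive` are made up for illustration, and the sentinel `'1'` in `nvl` assumes that pg never actually holds that value.

```sql
with a as (
  select 1 slno, 'A' pg from dual union all
  select 2 slno, 'A' pg from dual union all
  select 3 slno, 'B' pg from dual union all
  select 4 slno, 'A' pg from dual union all
  select 5 slno, 'A' pg from dual union all
  select 6 slno, 'A' pg from dual
),
flagged as (
  -- newgrp = 1 on the first row of each run of equal pg values
  -- ('1' is an assumed sentinel that never occurs in pg)
  select slno,
         pg,
         case when pg <> nvl(lag(pg) over (order by slno), '1') then 1 else 0 end newgrp
  from a
),
grouped as (
  -- the running sum of the flags numbers the runs 1, 2, 3, ...
  select slno, pg, sum(newgrp) over (order by slno) grp
  from flagged
)
select max(cnt) max_consecutive
from (
  -- one row per run, with the number of rows in that run
  select grp, count(*) cnt
  from grouped
  group by grp
);
```

For these six rows the runs have lengths 2, 1 and 3, so the query should return 3.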

Answer 1 (score: 1)

with test as (
select 1 slno,'A1' name ,'NO' pg from dual union all 
select 2,'A2','YES' from dual union all
select 3,'A3','NO' from dual union all
select 4,'A4','YES' from dual union all
select 6,'A5','YES' from dual union all
select 7,'A6','YES' from dual union all
select 8,'A7','YES' from dual union all
select 9,'A8','YES' from dual union all
select 10,'A9','YES' from dual union all
select 11,'A10','NO' from dual union all
select 12,'A11','YES' from dual union all
select 13,'A12','NO' from dual union all
select 14,'A14','NO' from dual),
consecutive as (select row_number() over(order by slno) rr, x.* 
              from test x)
select x.* from Consecutive x
  left join Consecutive y on x.rr = y.rr+1 and x.pg = y.pg
  where y.rr is not null
  order by x.slno 

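You can control the output with the WHERE condition.

With where y.rr is not null, the query returns the duplicates (rows whose PG equals the PG of the previous row).

With where y.rr is null, the query returns the "different" values (rows whose PG differs from the previous row).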
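
To see both cases side by side, the same previous-row comparison can also be sketched with LAG instead of the self-join. This swaps in a different technique from the one in the answer. It uses only the first five sample rows, and the names `with_prev`, `prev_pg` and `status` are illustrative assumptions.

```sql
with test as (
  select 1 slno, 'A1' name, 'NO' pg from dual union all
  select 2, 'A2', 'YES' from dual union all
  select 3, 'A3', 'NO' from dual union all
  select 4, 'A4', 'YES' from dual union all
  select 6, 'A5', 'YES' from dual
),
with_prev as (
  -- prev_pg is the pg of the preceding row by slno (NULL for the first row)
  select t.*, lag(pg) over (order by slno) prev_pg
  from test t
)
select slno,
       name,
       pg,
       -- 'duplicate' corresponds to "y.rr is not null", 'different' to "y.rr is null"
       case when pg = prev_pg then 'duplicate' else 'different' end status
from with_prev
order by slno;
```

In this reduced data set only the row with SLNO 6 repeats the PG value of the row before it, so it is the only one marked as a duplicate.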

Answer 2 (score: 1)

For completeness, here is the actual Tabibitosan method:

with sample_data as (select 1 slno, 'A1' name, 'NO' pg from dual union all 
                     select 2 slno, 'A2' name, 'YES' pg from dual union all
                     select 3 slno, 'A3' name, 'NO' pg from dual union all
                     select 4 slno, 'A4' name, 'YES' pg from dual union all
                     select 6 slno, 'A5' name, 'YES' pg from dual union all
                     select 7 slno, 'A6' name, 'YES' pg from dual union all
                     select 8 slno, 'A7' name, 'YES' pg from dual union all
                     select 9 slno, 'A8' name, 'YES' pg from dual union all
                     select 10 slno, 'A9' name, 'YES' pg from dual union all
                     select 11 slno, 'A10' name, 'NO' pg from dual union all
                     select 12 slno, 'A11' name, 'YES' pg from dual union all
                     select 13 slno, 'A12' name, 'NO' pg from dual union all
                     select 14 slno, 'A14' name, 'NO' pg from dual)
-- end of mimicking a table called "sample_data" containing your data; see SQL below:
select max(cnt) max_pg_in_queue
from   (select   count(*) cnt
        from     (select slno,
                         name,
                         pg,
                         row_number() over (order by slno)
                           - row_number() over (partition by pg
                                                order by slno) grp
                  from   sample_data)
        where    pg = 'YES'
        group by grp);

MAX_PG_IN_QUEUE
---------------
              6
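
To see why the difference between the two row numbers identifies a run, it can help to look at the intermediate values. The following is a reduced sketch using only the first six sample rows. The aliases `rn_all` and `rn_pg` are illustrative and do not appear in the answer.

```sql
with sample_data as (
  select 1 slno, 'NO' pg from dual union all
  select 2 slno, 'YES' pg from dual union all
  select 3 slno, 'NO' pg from dual union all
  select 4 slno, 'YES' pg from dual union all
  select 6 slno, 'YES' pg from dual union all
  select 7 slno, 'YES' pg from dual
)
select slno,
       pg,
       -- position among all rows
       row_number() over (order by slno) rn_all,
       -- position among the rows with the same pg
       row_number() over (partition by pg order by slno) rn_pg,
       -- both counters advance together inside a run, so the difference stays constant
       row_number() over (order by slno)
         - row_number() over (partition by pg order by slno) grp
from sample_data
order by slno;
```

Here the YES rows at SLNO 4, 6 and 7 all get grp = 2, while the isolated YES row at SLNO 2 gets grp = 1. Note that grp is only meaningful within one PG value (the NO row at SLNO 3 also gets grp = 1), which is why the answer filters on pg = 'YES' before grouping by grp.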

Answer 3 (score: 0)

SELECT MAX(consecutives) -- Block 1
FROM (
    SELECT t1.pg, t1.slno, COUNT(*) AS consecutives -- Block 2
    FROM test t1 INNER JOIN test t2 ON t1.pg = t2.pg
    WHERE t1.slno <= t2.slno
      AND NOT EXISTS (
        SELECT *  -- Block 3
        FROM test t3 
        WHERE t3.slno > t1.slno
          AND t3.slno < t2.slno
          AND t3.pg  != t1.pg
    )    
    GROUP BY t1.pg, t1.slno
);

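The query computes the result as follows:

  • Extract all pairs of records that share the same PG value and have no record with a different PG value between them (blocks 2 and 3);
  • Group them by PG value and starting SLNO value -> this counts the consecutive values for every [PG, (starting) SLNO] pair (block 2);
  • Extract the maximum value from query 2 (block 1).

Note that the query could be simplified if the slno field in the table contained consecutive values, but that does not seem to be your case (SLNO = 5 is missing from your sample records).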

Answer 4 (score: 0)

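Only a single aggregation query is needed, with no joins (the rest of the calculation can be done with ROW_NUMBER, LAG and LAST_VALUE):

SELECT MAX( num_before_in_queue ) AS max_sequential_in_queue
FROM   (
  -- position of each 'YES' row within its run: rn minus the rn of the run's first row, plus 1
  SELECT rn - LAST_VALUE( has_changed ) IGNORE NULLS OVER ( ORDER BY rn ) + 1
           AS num_before_in_queue
  FROM   (
    SELECT pg,
           ROW_NUMBER() OVER ( ORDER BY slno ) AS rn,
           -- NULL inside a run; the row number on the first row of each run
           CASE pg WHEN LAG( pg ) OVER ( ORDER BY slno )
                   THEN NULL
                   ELSE ROW_NUMBER() OVER ( ORDER BY slno )
                   END AS has_changed
    FROM   table_name
  )
  WHERE  pg = 'YES'
);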

Answer 5 (score: -3)

Try using row_number():

select
    SLNO,
    Name,
    PG,
    row_number() over (partition by PG order by PG) as "Consecutive"
from
    <table>
order by
    SLNO,
    NAME,
    PG

This should work with minor tweaks.

- Edit -

Sorry, it should partition by PG. The partition tells row_number when to start a new sequence.