I need a way to group data based on changes from the previous row

Asked: 2017-11-28 19:17:16

Tags: sql sql-server gaps-and-islands

Let me try this again.

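This table holds one record per person for each day of the month. It has about 20 fields. Whenever any field changes (other than the date fields), I want to start a new group of records. For example, if day 1, day 2, and day 3 are identical, and then on reading day 4 I notice that something changed, I want to group days 1, 2, and 3 into one row, with day 1 as the begin date and day 3 as the end date, and so on.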

Rownum  ID BegDate  EndDate   Field1  Field2 ...  Field20
 1      1  6/1/2017 6/1/2017  xxxx    xxxx        xxxxx
 2      1  6/2/2017 6/2/2017  xxxx    xxxx        xxxxx
 3      1  6/3/2017 6/3/2017  xxxx    xxxx        xxxxx
 4      1  6/4/2017 6/4/2017  yyyy    yyyy        yyyy
 5      1  6/5/2017 6/5/2017  yyyy    yyyy        yyyy
 6      1  6/6/2017 6/6/2017  xxxx    xxxx        xxxxx
 7      1  6/7/2017 6/7/2017  xxxx    xxxx        xxxxx
 8      1  6/8/2017 6/8/2017  zzzz    zzzz        zzzz
....

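So in the sample data above, I would get one group containing rows 1, 2, and 3, then a group containing rows 4 and 5, then a group containing rows 6 and 7, then a group starting at row 8, and so on. The result I want: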

ID  BegDate    EndDate  Field1  Field2 ...... Field20   Sum
1   6/1/2017   6/3/2017  xxxx    xxxx          xxxxx      3
1   6/4/2017   6/5/2017  yyyy    yyyy          yyyy       2
1   6/6/2017   6/7/2017  xxxx    xxxx          xxxxx      2
1   6/8/2017   6/15/2017 zzzz    zzzz          zzzz       8
.....

3 answers:

Answer 0 (score: 0)

For example, create a table:

 create table t
 (date_ datetime,
  status varchar(1));

and add some data

  insert into t values ('2017-11-01','A');
  insert into t values ('2017-11-02','A');
  insert into t values ('2017-11-03','A');
  insert into t values ('2017-11-04','B');
  insert into t values ('2017-11-05','B');
  insert into t values ('2017-11-06','B');
  insert into t values ('2017-11-07','C');
  insert into t values ('2017-11-08','C');
  insert into t values ('2017-11-09','C');
  insert into t values ('2017-11-10','C');
  insert into t values ('2017-11-11','B');
  insert into t values ('2017-11-12','B');
  insert into t values ('2017-11-13','B');
  insert into t values ('2017-11-14','B');
  insert into t values ('2017-11-15','B');

and use this query (note that IFNULL, now() and "- interval 1 day" are MySQL syntax, not SQL Server)

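 select min(date_start),
        IFNULL(date_end, now()),   -- the last island has no later change, so fall back to now()
        status
 from (
     -- for each row, date_end is the day before the next row with a different status
     select t1.date_ date_start,
            (select min(date_) from t t2 where t2.date_ > t1.date_ and t2.status <> t1.status) - interval 1 day as 'date_end',
            t1.status status
     from t t1
 ) a
 group by date_end, status
 order by 1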

http://sqlfiddle.com/#!9/96e27/11
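
Since the question is tagged sql-server, here is a rough T-SQL port of the same idea. It is an untested sketch that assumes the table t created above; ISNULL, GETDATE and DATEADD stand in for the MySQL functions, and the column aliases are my own.

```sql
-- T-SQL sketch of the query above (assumes table t(date_, status) from this answer).
select min(a.date_start)             as date_start,
       isnull(a.date_end, getdate()) as date_end,   -- open-ended last island
       a.status
from (
    -- date_end = the day before the next row whose status differs
    select t1.date_ as date_start,
           dateadd(day, -1,
                   (select min(t2.date_)
                    from t t2
                    where t2.date_ > t1.date_
                      and t2.status <> t1.status)) as date_end,
           t1.status
    from t t1
) a
group by a.date_end, a.status
order by 1;
```

The correlated subquery runs once per row, so on a large table the row-number approach in the other answers is likely to perform better.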

Answer 1 (score: 0)

You can do this using the difference of two row numbers:

    select ID, min(BegDate) as BegDate, max(EndDate) as EndDate,
           Field1, Field2, ...... Field20,
           datediff(day, min(BegDate), max(EndDate)) + 1 as [Sum]   -- +1 so that 6/1 to 6/3 counts as 3 days
    from (select t.*,
                 row_number() over (partition by id order by begdate) as seqnum,
                 row_number() over (partition by id, Field1, Field2, . . ., Field20 order by begdate) as seqnum_2
          from t
         ) t
    group by id, (seqnum - seqnum_2), Field1, Field2, . . . Field20 ;
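
To see why the difference of the two row numbers identifies each run, it can help to look at the intermediate values. The following is a minimal sketch on a made-up five-row data set with a single tracked field; the CTE name sample_rows and the column grp are hypothetical, not part of the answer.

```sql
-- Hypothetical mini data set: one id, one tracked field, five consecutive days.
with sample_rows as (
    select *
    from (values (1, cast('2017-06-01' as date), 'x'),
                 (1, cast('2017-06-02' as date), 'x'),
                 (1, cast('2017-06-03' as date), 'y'),
                 (1, cast('2017-06-04' as date), 'x'),
                 (1, cast('2017-06-05' as date), 'x')) as v (id, begdate, field1)
)
select id, begdate, field1,
       row_number() over (partition by id order by begdate)          as seqnum,
       row_number() over (partition by id, field1 order by begdate)  as seqnum_2,
       row_number() over (partition by id order by begdate)
     - row_number() over (partition by id, field1 order by begdate)  as grp
from sample_rows
order by id, begdate;

-- begdate     field1  seqnum  seqnum_2  grp
-- 2017-06-01  x       1       1         0
-- 2017-06-02  x       2       2         0
-- 2017-06-03  y       3       1         2
-- 2017-06-04  x       4       3         1
-- 2017-06-05  x       5       4         1
```

Within a run of identical rows both counters advance by one, so grp stays constant; when the value returns after an interruption, seqnum has moved ahead and grp changes. The grp value is only meaningful together with the field values, which is why the fields also appear in the group by.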

Answer 2 (score: 0)

Try the following query (shown with 2 data fields, field1 and field2). To handle your 20 fields, replace field1, field2 with field1, field2, field3, ... field20.

    create table #tmp (RowNum int, id int,begdate datetime,EndDate datetime, field1 varchar(10),field2 varchar(10))

    insert into #tmp values(1,1,'2017-06-01','2017-06-01','xxxxx','xxxxx')
    insert into #tmp values(2,1,'2017-06-02','2017-06-02','xxxxx','xxxxx')
    insert into #tmp values(3,1,'2017-06-03','2017-06-03','xxxxx','xxxxx')
    insert into #tmp values(4,1,'2017-06-04','2017-06-04','yyyyy','yyyyy')
    insert into #tmp values(5,1,'2017-06-05','2017-06-05','yyyyy','yyyyy')
    insert into #tmp values(6,1,'2017-06-06','2017-06-06','xxxxx','xxxxx')
    insert into #tmp values(7,1,'2017-06-07','2017-06-07','xxxxx','xxxxx')
    insert into #tmp values(8,1,'2017-06-08','2017-06-08','zzzzz','zzzzz')
    insert into #tmp values(9,1,'2017-06-09','2017-06-09','zzzzz','zzzzz')
    insert into #tmp values(10,1,'2017-06-10','2017-06-10','zzzzz','zzzzz')

    insert into #tmp values(11,2,'2017-06-04','2017-06-04','yyyyy','yyyyy')
    insert into #tmp values(12,2,'2017-06-05','2017-06-05','yyyyy','yyyyy')
    insert into #tmp values(13,2,'2017-06-06','2017-06-06','xxxxx','xxxxx')
    insert into #tmp values(14,2,'2017-06-07','2017-06-07','xxxxx','xxxxx')


    insert into #tmp values(15,1,'2017-06-11','2017-06-11','xxxxx','xxxxx')
    insert into #tmp values(16,1,'2017-06-12','2017-06-12','xxxxx','xxxxx')
    insert into #tmp values(17,1,'2017-06-13','2017-06-13','zzzzz','xxxxx')
    insert into #tmp values(18,1,'2017-06-14','2017-06-14','zzzzz','xxxxx')
    insert into #tmp values(19,1,'2017-06-15','2017-06-15','yyyyy','xxxxx')
    insert into #tmp values(20,1,'2017-06-16','2017-06-16','zzzzz','xxxxx')


    select ID, min(BegDate) as Begdate, max(EndDate) as EndDate,
           Field1,Field2, /*Add all other fields here*/
           datediff(day, min(BegDate), max(EndDate))+1 As [Sum]
    from(
    select *,
                 row_number() over (partition by id order by begdate) as seqnum,
                 row_number() over (partition by id, Field1,field2 /*Add all other fields here*/ order by begdate) as seqnum_2
          from #tmp

        ) t
    group by id, (seqnum - seqnum_2), Field1,Field2 /*Add all other fields here*/
    order by ID,Begdate


    Drop table #tmp
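
As an alternative to the row-number difference, not taken from any of the answers, the same islands can be found by flagging each row that differs from the previous one with LAG (SQL Server 2012 or later) and then taking a running sum of the flags. This is a sketch on made-up data; the names sample_rows, flagged, numbered, is_start and island_no are hypothetical, and the sample has a single date column standing in for both BegDate and EndDate.

```sql
with sample_rows as (
    select *
    from (values (1, cast('2017-06-01' as date), 'xxxxx', 'xxxxx'),
                 (1, cast('2017-06-02' as date), 'xxxxx', 'xxxxx'),
                 (1, cast('2017-06-03' as date), 'yyyyy', 'yyyyy'),
                 (1, cast('2017-06-04' as date), 'xxxxx', 'xxxxx'),
                 (1, cast('2017-06-05' as date), 'xxxxx', 'xxxxx')) as v (id, begdate, field1, field2)
),
flagged as (
    -- is_start = 1 on the first row of an id and on every row that differs from the previous one
    select *,
           case when lag(field1) over (partition by id order by begdate) = field1
                 and lag(field2) over (partition by id order by begdate) = field2
                then 0 else 1 end as is_start
    from sample_rows
),
numbered as (
    -- the running total of the flags numbers the islands 1, 2, 3, ... per id
    select *,
           sum(is_start) over (partition by id order by begdate
                               rows unbounded preceding) as island_no
    from flagged
)
select id, min(begdate) as BegDate, max(begdate) as EndDate,
       field1, field2, count(*) as [Sum]
from numbered
group by id, island_no, field1, field2
order by id, min(begdate);
```

Two caveats: the equality test treats a NULL field as a change, so nullable columns would need explicit NULL handling, and count(*) counts rows rather than calendar days, which only matches datediff(...) + 1 when no days are missing.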