Aggregating consecutive rows in a SQL table

Time: 2016-06-27 09:49:13

Tags: sql sql-server

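I'm trying to run an aggregate function on the following SQL table to sum all "LengthOfRecord" values, grouped by "Long+Lat" and only over consecutive rows (i.e. rows whose "RowNumber" values are in running order).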

+-----------+-----------+---------------+----------------+
| RowNumber | Vessel ID |   Long+Lat    | LengthOfRecord |
+-----------+-----------+---------------+----------------+
| 102313179 | Vessel 01 | 123.751 1.196 |            181 |
| 102313180 | Vessel 01 | 123.751 1.196 |            179 |
| 102313181 | Vessel 01 | 123.751 1.196 |            361 |
| 102313182 | Vessel 01 | 123.751 1.196 |            359 |
| 102313183 | Vessel 01 | 123.751 1.196 |            180 |
| 102313184 | Vessel 01 | 123.751 1.196 |            181 |
| 102313185 | Vessel 01 | 123.751 1.196 |            179 |
| 102313186 | Vessel 01 | 123.751 1.196 |            180 |
| 102313187 | Vessel 01 | 123.751 1.196 |            360 |
| 102313188 | Vessel 01 | 123.751 1.196 |            360 |
| 102313189 | Vessel 01 | 123.751 1.196 |            180 |
| 102313191 | Vessel 01 | 123.751 1.196 |            181 |
| 102313298 | Vessel 01 | 123.750 1.197 |            180 |
| 102313375 | Vessel 01 | 123.742 1.196 |            179 |
| 102313376 | Vessel 01 | 123.742 1.196 |            359 |
| 102313377 | Vessel 01 | 123.742 1.196 |            180 |
| 102313379 | Vessel 01 | 123.742 1.196 |            181 |
| 102313380 | Vessel 01 | 123.742 1.196 |            178 |
+-----------+-----------+---------------+----------------+

Below is the result I'm trying to achieve with a SQL statement. Is there any way I can do this with a SQL query?

+-----------+---------------+----------------+
| Vessel ID |   Long+Lat    | LengthOfRecord |
+-----------+---------------+----------------+
| Vessel 01 | 123.751 1.196 |           2881 |
| Vessel 01 | 123.750 1.197 |            180 |
| Vessel 01 | 123.742 1.196 |           1077 |
+-----------+---------------+----------------+

2 Answers:

Answer 0: (score: 4)

You can do this using the difference-of-row-numbers approach:

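select vesselId, latLong, sum(lengthOfRecord) as lengthOfRecord
from (select t.*,
             -- position of the row among all rows of this vessel
             row_number() over (partition by vesselId order by rowNumber) as seqnum,
             -- position of the row among this vessel's rows with the same latLong
             row_number() over (partition by vesselId, latLong order by rowNumber) as seqnum_latlong
      from table t  -- "table" is a placeholder; substitute your table name
     ) t
group by (seqnum - seqnum_latlong), latLong, vesselId;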

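The difference-of-row-numbers approach is a bit tricky to explain. It identifies adjacent rows that have the same value: within a run of rows sharing one latLong, seqnum and seqnum_latlong both increase by one per row, so their difference stays constant for the whole run. If you run the subquery, you will see how the calculation works.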
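
To see the calculation without a SQL Server instance, here is a minimal sketch that runs the subquery through Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions) as a stand-in for SQL Server; the table name records and the helper names are made up for illustration. Five rows are copied from the question, and one extra row (102313400) is hypothetical, added only to show a vessel returning to an earlier position.

```python
import sqlite3

# Five rows from the question plus one HYPOTHETICAL row (102313400),
# which revisits the first position and exists only for illustration.
ROWS = [
    (102313189, "Vessel 01", "123.751 1.196", 180),
    (102313191, "Vessel 01", "123.751 1.196", 181),
    (102313298, "Vessel 01", "123.750 1.197", 180),
    (102313375, "Vessel 01", "123.742 1.196", 179),
    (102313376, "Vessel 01", "123.742 1.196", 359),
    (102313400, "Vessel 01", "123.751 1.196", 100),  # hypothetical
]

# The answer's subquery, with the difference exposed as its own column.
DIFF_SQL = """
select rowNumber, latLong, seqnum, seqnum_latlong,
       seqnum - seqnum_latlong as difference
from (select r.*,
             row_number() over (partition by vesselId order by rowNumber) as seqnum,
             row_number() over (partition by vesselId, latLong order by rowNumber) as seqnum_latlong
      from records r) t
order by rowNumber
"""


def load():
    """Create an in-memory table (assumed name: records) holding ROWS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table records "
        "(rowNumber integer, vesselId text, latLong text, lengthOfRecord integer)"
    )
    conn.executemany("insert into records values (?, ?, ?, ?)", ROWS)
    return conn


def row_number_differences(conn):
    """Return (rowNumber, latLong, seqnum, seqnum_latlong, difference) per row."""
    return conn.execute(DIFF_SQL).fetchall()


if __name__ == "__main__":
    for row in row_number_differences(load()):
        print(row)
```

In this sample the hypothetical last row ends up with the same difference (3) as the two 123.742 1.196 rows before it, which is why latLong has to appear in the group by alongside the difference.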

Answer 1: (score: 1)

This may be long-winded, but hopefully it meets your requirements in a relatively readable way:

declare @t table (RowNumber int not null, VesselID varchar(17) not null,
                  LatLong varchar(19),LengthOfRecord int not null)
insert into @t(RowNumber,VesselID,LatLong,LengthOfRecord) values
(102313179,'Vessel 01','123.751 1.196',181),
(102313180,'Vessel 01','123.751 1.196',179),
(102313181,'Vessel 01','123.751 1.196',361),
(102313182,'Vessel 01','123.751 1.196',359),
(102313183,'Vessel 01','123.751 1.196',180),
(102313184,'Vessel 01','123.751 1.196',181),
(102313185,'Vessel 01','123.751 1.196',179),
(102313186,'Vessel 01','123.751 1.196',180),
(102313187,'Vessel 01','123.751 1.196',360),
(102313188,'Vessel 01','123.751 1.196',360),
(102313189,'Vessel 01','123.751 1.196',180),
(102313191,'Vessel 01','123.751 1.196',181),
(102313298,'Vessel 01','123.750 1.197',180),
(102313375,'Vessel 01','123.742 1.195',179),
(102313376,'Vessel 01','123.742 1.195',359),
(102313377,'Vessel 01','123.742 1.195',180),
(102313379,'Vessel 01','123.742 1.195',181),
(102313380,'Vessel 01','123.742 1.195',178)

;With ContiguousRN as (  -- gap-free row numbers (rn) per vessel
    select
        *,
        ROW_NUMBER() OVER (PARTITION BY VesselID ORDER BY RowNumber) as rn
    from
        @t
), Starts as (  -- rows whose previous row is missing or has a different LatLong
    select
        r1.VesselID,
        r1.rn,
        r1.LatLong,
        ROW_NUMBER() OVER (PARTITION BY r1.VesselID ORDER BY r1.rn) as srn
    from
        ContiguousRN r1
            left join
        ContiguousRN r2
            on
                r1.rn = r2.rn + 1 and
                r1.VesselID = r2.VesselID and
                r1.LatLong = r2.LatLong
    where
        r2.rn is null
), Ends as (  -- rows whose next row is missing or has a different LatLong
    select
        r1.VesselID,
        r1.rn,
        r1.LatLong,
        ROW_NUMBER() OVER (PARTITION BY r1.VesselID ORDER BY r1.rn) as srn
    from
        ContiguousRN r1
            left join
        ContiguousRN r2
            on
                r1.rn = r2.rn - 1 and
                r1.VesselID = r2.VesselID and
                r1.LatLong = r2.LatLong
    where
        r2.rn is null
), Sequences as (  -- pair the n-th start with the n-th end of each vessel
    select
        s.VesselID,
        s.LatLong,
        s.rn as StartRow,e.rn as EndRow
    from
        Starts s
            inner join
        Ends e
            on
                s.VesselID = e.VesselID and
                s.srn = e.srn
)
select
    seq.VesselID,
    seq.LatLong,
    (select SUM(LengthOfRecord) from ContiguousRN r
    where r.VesselID = seq.VesselID and
    r.rn between seq.StartRow and seq.EndRow) as LengthOfRecord
from Sequences seq

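I've changed some of the column names so that I don't have to keep quoting them, since the originals contained spaces or punctuation. I'd also recommend storing the location in a real geography-typed column, or storing lat and long in separate columns.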

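So, the query above. The first CTE (ContiguousRN) simply gives us row numbers (rn) which, unlike RowNumber, have no gaps. The second and third CTEs (Starts and Ends) locate the rows that begin and end each run, that is, rows whose previous or next row has a different LatLong value. We also generate a separate series of row numbers (srn) for these rows so that, in Sequences, we can pair each start row with its corresponding end row.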

Finally, in the last select, we put it all together and sum all the rows that lie between each start and end marker.
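
The start/end pairing can be inspected on a small sample. The sketch below runs a condensed copy of the query through Python's sqlite3 module, assuming SQLite 3.25 or later as a stand-in for SQL Server. The table variable @t becomes an ordinary table named t, and the function names are hypothetical. The five rows are a subset of the insert statement above.

```python
import sqlite3

# A subset of the rows from the answer's insert statement.
ROWS = [
    (102313189, "Vessel 01", "123.751 1.196", 180),
    (102313191, "Vessel 01", "123.751 1.196", 181),
    (102313298, "Vessel 01", "123.750 1.197", 180),
    (102313375, "Vessel 01", "123.742 1.195", 179),
    (102313376, "Vessel 01", "123.742 1.195", 359),
]

SEQUENCES_SQL = """
with ContiguousRN as (
    select *, row_number() over (partition by VesselID order by RowNumber) as rn
    from t
), Starts as (  -- no matching previous row: this row opens a run
    select r1.VesselID, r1.rn, r1.LatLong,
           row_number() over (partition by r1.VesselID order by r1.rn) as srn
    from ContiguousRN r1
    left join ContiguousRN r2
        on r1.rn = r2.rn + 1 and r1.VesselID = r2.VesselID and r1.LatLong = r2.LatLong
    where r2.rn is null
), Ends as (  -- no matching next row: this row closes a run
    select r1.VesselID, r1.rn, r1.LatLong,
           row_number() over (partition by r1.VesselID order by r1.rn) as srn
    from ContiguousRN r1
    left join ContiguousRN r2
        on r1.rn = r2.rn - 1 and r1.VesselID = r2.VesselID and r1.LatLong = r2.LatLong
    where r2.rn is null
), Sequences as (  -- the n-th start belongs with the n-th end
    select s.VesselID, s.LatLong, s.rn as StartRow, e.rn as EndRow
    from Starts s
    inner join Ends e on s.VesselID = e.VesselID and s.srn = e.srn
)
select seq.LatLong, seq.StartRow, seq.EndRow,
       (select sum(LengthOfRecord) from ContiguousRN r
        where r.VesselID = seq.VesselID
          and r.rn between seq.StartRow and seq.EndRow) as Total
from Sequences seq
order by seq.StartRow
"""


def load():
    """Create an in-memory table t with the same columns as @t."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t "
        "(RowNumber integer, VesselID text, LatLong text, LengthOfRecord integer)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)
    return conn


def run_sequences(conn):
    """Return (LatLong, StartRow, EndRow, Total) for every run."""
    return conn.execute(SEQUENCES_SQL).fetchall()


if __name__ == "__main__":
    for row in run_sequences(load()):
        print(row)
```

On this subset the runs cover rn 1 to 2, rn 3 alone, and rn 4 to 5, so a run of a single row shows up as both a start and an end.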

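I've assumed throughout that VesselID should act as some form of partitioning value, that your real data may contain details for multiple vessels, and that this process shouldn't mix their data together. If that isn't the case, you can remove most of the conditions around VesselID above.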
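
To illustrate what the VesselID partitioning changes, here is a rough sketch with entirely hypothetical data: two vessels whose rows interleave by RowNumber. For brevity it uses the shorter difference-of-row-numbers query from the other answer rather than the CTE query, and it runs in Python's sqlite3 module (assuming SQLite 3.25 or later) instead of SQL Server. The table name records is an assumption.

```python
import sqlite3

P1, P2 = "123.751 1.196", "123.750 1.197"

# HYPOTHETICAL rows: two vessels reporting alternately.
ROWS = [
    (1, "Vessel 01", P1, 10),
    (2, "Vessel 02", P1, 20),
    (3, "Vessel 01", P1, 30),
    (4, "Vessel 02", P2, 40),
    (5, "Vessel 01", P2, 50),
]

# Each vessel is treated as its own sequence of rows.
PER_VESSEL_SQL = """
select vesselId, latLong, sum(lengthOfRecord)
from (select r.*,
             row_number() over (partition by vesselId order by rowNumber) as seqnum,
             row_number() over (partition by vesselId, latLong order by rowNumber) as seqnum_latlong
      from records r) t
group by seqnum - seqnum_latlong, latLong, vesselId
order by min(rowNumber)
"""

# All vesselId conditions removed: every row belongs to one sequence.
SINGLE_STREAM_SQL = """
select latLong, sum(lengthOfRecord)
from (select r.*,
             row_number() over (order by rowNumber) as seqnum,
             row_number() over (partition by latLong order by rowNumber) as seqnum_latlong
      from records r) t
group by seqnum - seqnum_latlong, latLong
order by min(rowNumber)
"""


def load():
    """Create an in-memory table (assumed name: records) holding ROWS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table records "
        "(rowNumber integer, vesselId text, latLong text, lengthOfRecord integer)"
    )
    conn.executemany("insert into records values (?, ?, ?, ?)", ROWS)
    return conn


if __name__ == "__main__":
    conn = load()
    print("per vessel:   ", conn.execute(PER_VESSEL_SQL).fetchall())
    print("single stream:", conn.execute(SINGLE_STREAM_SQL).fetchall())
```

With the partitioning in place each vessel gets its own runs (four result rows here); without it, rows from both vessels merge into two runs, one per position.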

Result:

VesselID          LatLong             LengthOfRecord
----------------- ------------------- --------------
Vessel 01         123.751 1.196       2881
Vessel 01         123.750 1.197       180
Vessel 01         123.742 1.195       1077