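I am trying to run an aggregate function on the following SQL table to sum "LengthOfRecord", grouped by "Long+Lat", but only over consecutive rows (i.e. rows whose "RowNumber" values are in running order).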
+-----------+-----------+---------------+----------------+
| RowNumber | Vessel ID | Long+Lat      | LengthOfRecord |
+-----------+-----------+---------------+----------------+
| 102313179 | Vessel 01 | 123.751 1.196 |            181 |
| 102313180 | Vessel 01 | 123.751 1.196 |            179 |
| 102313181 | Vessel 01 | 123.751 1.196 |            361 |
| 102313182 | Vessel 01 | 123.751 1.196 |            359 |
| 102313183 | Vessel 01 | 123.751 1.196 |            180 |
| 102313184 | Vessel 01 | 123.751 1.196 |            181 |
| 102313185 | Vessel 01 | 123.751 1.196 |            179 |
| 102313186 | Vessel 01 | 123.751 1.196 |            180 |
| 102313187 | Vessel 01 | 123.751 1.196 |            360 |
| 102313188 | Vessel 01 | 123.751 1.196 |            360 |
| 102313189 | Vessel 01 | 123.751 1.196 |            180 |
| 102313191 | Vessel 01 | 123.751 1.196 |            181 |
| 102313298 | Vessel 01 | 123.750 1.197 |            180 |
| 102313375 | Vessel 01 | 123.742 1.196 |            179 |
| 102313376 | Vessel 01 | 123.742 1.196 |            359 |
| 102313377 | Vessel 01 | 123.742 1.196 |            180 |
| 102313379 | Vessel 01 | 123.742 1.196 |            181 |
| 102313380 | Vessel 01 | 123.742 1.196 |            178 |
+-----------+-----------+---------------+----------------+
Below is the result I am trying to achieve with a SQL statement. Is there any way I can do this with a SQL query?
+-----------+---------------+----------------+
| Vessel ID | Long+Lat      | LengthOfRecord |
+-----------+---------------+----------------+
| Vessel 01 | 123.751 1.196 |           2881 |
| Vessel 01 | 123.750 1.197 |            180 |
| Vessel 01 | 123.742 1.196 |           1077 |
+-----------+---------------+----------------+
Answer 0 (score: 4)
You can do this using the difference-of-row-numbers approach:
select vesselId, latLong, sum(lengthOfRecord) as lengthOfRecord
from (select t.*,
             row_number() over (partition by vesselId order by rowNumber) as seqnum,
             row_number() over (partition by vesselId, latLong order by rowNumber) as seqnum_latlong
      from table t  -- "table" is a placeholder; substitute the real table name
     ) t
group by (seqnum - seqnum_latlong), latLong, vesselId;
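The difference-of-row-numbers approach is a bit tricky to explain. It identifies adjacent rows that have the same value: within a run of adjacent rows sharing one latLong, both row numbers increase by one per row, so their difference stays constant, and it changes as soon as rows with another latLong intervene. If you run the subquery on its own, you will see how the calculation works.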
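To make the calculation concrete, the same bookkeeping can be sketched outside the database. The Python below is only an illustration and not part of the answer's query: the function name `island_sums` and the tuple layout are assumptions made for this sketch, and the rows are copied from the question's table.

```python
from collections import defaultdict

# Rows from the question: (RowNumber, VesselID, LatLong, LengthOfRecord)
rows = [
    (102313179, "Vessel 01", "123.751 1.196", 181),
    (102313180, "Vessel 01", "123.751 1.196", 179),
    (102313181, "Vessel 01", "123.751 1.196", 361),
    (102313182, "Vessel 01", "123.751 1.196", 359),
    (102313183, "Vessel 01", "123.751 1.196", 180),
    (102313184, "Vessel 01", "123.751 1.196", 181),
    (102313185, "Vessel 01", "123.751 1.196", 179),
    (102313186, "Vessel 01", "123.751 1.196", 180),
    (102313187, "Vessel 01", "123.751 1.196", 360),
    (102313188, "Vessel 01", "123.751 1.196", 360),
    (102313189, "Vessel 01", "123.751 1.196", 180),
    (102313191, "Vessel 01", "123.751 1.196", 181),
    (102313298, "Vessel 01", "123.750 1.197", 180),
    (102313375, "Vessel 01", "123.742 1.196", 179),
    (102313376, "Vessel 01", "123.742 1.196", 359),
    (102313377, "Vessel 01", "123.742 1.196", 180),
    (102313379, "Vessel 01", "123.742 1.196", 181),
    (102313380, "Vessel 01", "123.742 1.196", 178),
]


def island_sums(rows):
    """Sum LengthOfRecord per run of adjacent rows sharing VesselID and LatLong."""
    seqnum = defaultdict(int)          # counter per vessel
    seqnum_latlong = defaultdict(int)  # counter per (vessel, latlong)
    totals = {}                        # (vessel, latlong, difference) -> sum
    for _rownum, vessel, latlong, length in sorted(rows):  # order by RowNumber
        seqnum[vessel] += 1
        seqnum_latlong[vessel, latlong] += 1
        diff = seqnum[vessel] - seqnum_latlong[vessel, latlong]
        key = (vessel, latlong, diff)  # same grouping key as the GROUP BY
        totals[key] = totals.get(key, 0) + length
    # Drop the helper difference, keeping runs in order of first appearance
    return [(v, ll, total) for (v, ll, _diff), total in totals.items()]


for vessel, latlong, total in island_sums(rows):
    print(vessel, latlong, total)
```

On the question's data the difference is 0 for the first twelve rows, 12 for the single `123.750 1.197` row, and 13 for the last five, which yields the three sums 2881, 180 and 1077. Gaps in RowNumber (such as 102313189 to 102313191) do not split a run, because only the ordering is used.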
Answer 1 (score: 1)
This may be lengthy, but hopefully it meets your requirement in a relatively readable way:
declare @t table (RowNumber int not null, VesselID varchar(17) not null,
LatLong varchar(19),LengthOfRecord int not null)
insert into @t(RowNumber,VesselID,LatLong,LengthOfRecord) values
(102313179,'Vessel 01','123.751 1.196',181),
(102313180,'Vessel 01','123.751 1.196',179),
(102313181,'Vessel 01','123.751 1.196',361),
(102313182,'Vessel 01','123.751 1.196',359),
(102313183,'Vessel 01','123.751 1.196',180),
(102313184,'Vessel 01','123.751 1.196',181),
(102313185,'Vessel 01','123.751 1.196',179),
(102313186,'Vessel 01','123.751 1.196',180),
(102313187,'Vessel 01','123.751 1.196',360),
(102313188,'Vessel 01','123.751 1.196',360),
(102313189,'Vessel 01','123.751 1.196',180),
(102313191,'Vessel 01','123.751 1.196',181),
(102313298,'Vessel 01','123.750 1.197',180),
(102313375,'Vessel 01','123.742 1.195',179),
(102313376,'Vessel 01','123.742 1.195',359),
(102313377,'Vessel 01','123.742 1.195',180),
(102313379,'Vessel 01','123.742 1.195',181),
(102313380,'Vessel 01','123.742 1.195',178)
;With ContiguousRN as (
    -- Gap-free row numbers per vessel, ordered by RowNumber
    select
        *,
        ROW_NUMBER() OVER (PARTITION BY VesselID ORDER BY RowNumber) as rn
    from
        @t
), Starts as (
    -- Rows with no immediately preceding row of the same LatLong: run starts
    select
        r1.VesselID,
        r1.rn,
        r1.LatLong,
        ROW_NUMBER() OVER (PARTITION BY r1.VesselID ORDER BY r1.rn) as srn
    from
        ContiguousRN r1
            left join
        ContiguousRN r2
            on
                r1.rn = r2.rn + 1 and
                r1.VesselID = r2.VesselID and
                r1.LatLong = r2.LatLong
    where
        r2.rn is null
), Ends as (
    -- Rows with no immediately following row of the same LatLong: run ends
    select
        r1.VesselID,
        r1.rn,
        r1.LatLong,
        ROW_NUMBER() OVER (PARTITION BY r1.VesselID ORDER BY r1.rn) as srn
    from
        ContiguousRN r1
            left join
        ContiguousRN r2
            on
                r1.rn = r2.rn - 1 and
                r1.VesselID = r2.VesselID and
                r1.LatLong = r2.LatLong
    where
        r2.rn is null
), Sequences as (
    -- Pair the n-th start with the n-th end of the same vessel
    select
        s.VesselID,
        s.LatLong,
        s.rn as StartRow, e.rn as EndRow
    from
        Starts s
            inner join
        Ends e
            on
                s.VesselID = e.VesselID and
                s.srn = e.srn
)
-- Sum every row that lies between each start and end marker
select
    seq.VesselID,
    seq.LatLong,
    (select SUM(LengthOfRecord) from ContiguousRN r
     where r.VesselID = seq.VesselID and
           r.rn between seq.StartRow and seq.EndRow) as LengthOfRecord
from Sequences seq
order by seq.VesselID, seq.StartRow  -- added so the output order is deterministic
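I have changed some of the column names so that I don't have to keep quoting them, since the originals contain spaces or punctuation. I would also suggest storing the position in a real geography-type column, or else storing the lat and long in separate columns.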
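So, the query above. The first CTE (ContiguousRN) just assigns row numbers (rn) which, unlike RowNumber, have no gaps. The second and third CTEs (Starts and Ends) locate the rows that begin and end each run: essentially, the rows whose previous or next row has a different LatLong value, or does not exist at all. We also generate a separate series of row numbers (srn) for these rows so that, in Sequences, we can pair each start row with its corresponding end row.
Finally, in the last select, we put it all together and sum all of the rows that lie between each start and end marker.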
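I have assumed throughout that VesselID should act as some form of partitioning value, that your actual data may contain details for multiple vessels, and that this process should not mix their data together. If that is not the case, you can remove most of the conditions involving VesselID above.
Result: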
VesselID          LatLong             LengthOfRecord
----------------- ------------------- --------------
Vessel 01         123.751 1.196       2881
Vessel 01         123.750 1.197       180
Vessel 01         123.742 1.195       1077
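The start/end pairing described in this answer can also be sketched in a few lines of Python, which may help when checking the logic by hand. This is a rough illustration under stated assumptions, not a translation of the T-SQL: the function name `run_sums` and the tuple layout are invented for the sketch, the sample is an abridged subset of the rows inserted above (so the totals differ from the full result), and LatLong is assumed to be non-null.

```python
# Abridged subset of the answer's sample rows:
# (RowNumber, VesselID, LatLong, LengthOfRecord)
sample = [
    (102313189, "Vessel 01", "123.751 1.196", 180),
    (102313191, "Vessel 01", "123.751 1.196", 181),
    (102313298, "Vessel 01", "123.750 1.197", 180),
    (102313375, "Vessel 01", "123.742 1.195", 179),
    (102313376, "Vessel 01", "123.742 1.195", 359),
]


def run_sums(rows):
    """Number rows per vessel, find run starts and ends, pair them, and sum."""
    result = []
    for vessel in sorted({r[1] for r in rows}):
        # ContiguousRN: list positions are gap-free numbers ordered by RowNumber
        ordered = sorted(r for r in rows if r[1] == vessel)
        n = len(ordered)
        # Starts: the previous row is missing or has a different LatLong
        starts = [i for i in range(n)
                  if i == 0 or ordered[i - 1][2] != ordered[i][2]]
        # Ends: the next row is missing or has a different LatLong
        ends = [i for i in range(n)
                if i == n - 1 or ordered[i + 1][2] != ordered[i][2]]
        # Sequences: the k-th start belongs with the k-th end
        for start, end in zip(starts, ends):
            total = sum(r[3] for r in ordered[start:end + 1])
            result.append((vessel, ordered[start][2], total))
    return result


for vessel, latlong, total in run_sums(sample):
    print(vessel, latlong, total)
```

Every run has exactly one start and one end, and both lists are in row order, so pairing them positionally with `zip` plays the same role as joining Starts and Ends on srn.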