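I am trying to put together a dataset to feed a GIS application, and I am looking for a SQL guru to help me avoid cursors and temp tables, since we have roughly 30 million records to process (I'm using SQL Server 2012). The essentials of the dataset are VehicleID, PositionTime, Latitude, Longitude, and Speed. We need to order the records by VehicleID and PositionTime so that we can map the lat/long points and track movement.
If Speed is greater than or equal to 0.2, the record is acceptable as is. If Speed is less than 0.2, the vehicle is considered not to be moving and the record needs special handling. In the example below, for records one and two I need to average Latitude, Longitude, and Speed, keep the maximum PositionTime, and put the result into a single record. Records three and four are fine. Records five and six also need to be merged into one record, so vehicle 1 goes from 6 records to 4.
Once the records are "normalized", we need the date difference between the PositionTime values of consecutive records for each vehicle. For VehicleID 2 we start the same process over again (we have about 7,000 vehicles).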
VehicleID  PositionTime         Latitude     Longitude     Speed
1          11/20/2013 18:09:27  29.54608     -95.04444     0.1
1          11/20/2013 18:47:35  29.54608     -95.04444     0
1          11/20/2013 20:34:45  29.546105    -95.04442     5
1          11/20/2013 20:46:44  29.54607833  -95.04443167  3
1          11/20/2013 21:01:44  29.54606667  -95.04442833  0
1          11/20/2013 21:16:43  29.546095    -95.04443167  0.1
2          11/20/2013 21:31:44  29.54609     -95.04441     5
2          11/20/2013 21:46:44  29.54607667  -95.04443     0
Answer 0: (score: 1)
I think what you are trying to do is aggregate the slow or stationary vehicles.
Does this help?
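-- SQL Server 2012 has no CREATE OR REPLACE VIEW, so plain CREATE VIEW is used.
CREATE VIEW vehicles_view AS
-- One averaged row per vehicle for all readings below the 0.2 threshold.
SELECT VehicleID,
       MAX(PositionTime) AS PositionTime,
       AVG(Latitude) AS Latitude,
       AVG(Longitude) AS Longitude,
       AVG(Speed) AS Speed
FROM vehicles
WHERE Speed < 0.2
GROUP BY VehicleID
-- The two branches select disjoint rows, so UNION ALL skips the de-duplication pass.
UNION ALL
-- Moving readings are passed through unchanged.
SELECT VehicleID,
       PositionTime,
       Latitude,
       Longitude,
       Speed
FROM vehicles
WHERE Speed >= 0.2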
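One caveat: grouping only by VehicleID puts every slow reading of a vehicle into a single group, so for vehicle 1 in the sample, records one, two, five, and six would collapse into one row instead of two. Below is a minimal sketch of one way to keep each consecutive stationary run separate, using the difference of two ROW_NUMBER() sequences (often called "gaps and islands"). It assumes the sample rows from the question; the names @vehicles, IsStopped, rn, and grp are invented for illustration.

```sql
-- Sample rows from the question; @vehicles is a stand-in for the real table.
DECLARE @vehicles TABLE
    (
      VehicleID INT ,
      PositionTime DATETIME ,
      Latitude DECIMAL(20, 10) ,
      Longitude DECIMAL(20, 10) ,
      Speed DECIMAL(20, 10)
    );
INSERT INTO @vehicles
VALUES  ( 1, '2013-11-20T18:09:27', 29.54608, -95.04444, 0.1 ),
        ( 1, '2013-11-20T18:47:35', 29.54608, -95.04444, 0 ),
        ( 1, '2013-11-20T20:34:45', 29.546105, -95.04442, 5 ),
        ( 1, '2013-11-20T20:46:44', 29.54607833, -95.04443167, 3 ),
        ( 1, '2013-11-20T21:01:44', 29.54606667, -95.04442833, 0 ),
        ( 1, '2013-11-20T21:16:43', 29.546095, -95.04443167, 0.1 ),
        ( 2, '2013-11-20T21:31:44', 29.54609, -95.04441, 5 ),
        ( 2, '2013-11-20T21:46:44', 29.54607667, -95.04443, 0 );

WITH flagged AS (
    -- Flag stationary readings and number all readings per vehicle.
    SELECT *,
           CASE WHEN Speed < 0.2 THEN 1 ELSE 0 END AS IsStopped,
           ROW_NUMBER() OVER (PARTITION BY VehicleID ORDER BY PositionTime) AS rn
    FROM @vehicles
),
islands AS (
    -- Within a run of consecutive rows that share IsStopped, both row numbers
    -- advance together, so their difference is constant for the run.
    SELECT *,
           rn - ROW_NUMBER() OVER (PARTITION BY VehicleID, IsStopped
                                   ORDER BY PositionTime) AS grp
    FROM flagged
)
SELECT VehicleID,
       MAX(PositionTime) AS PositionTime,
       AVG(Latitude) AS Latitude,
       AVG(Longitude) AS Longitude,
       AVG(Speed) AS Speed
FROM islands
GROUP BY VehicleID, IsStopped, grp,
         -- Moving rows keep their own rn so that each stays a separate row.
         CASE WHEN IsStopped = 1 THEN 0 ELSE rn END
ORDER BY VehicleID, MAX(PositionTime);
```

On the sample data this should give four rows for vehicle 1 and two for vehicle 2. If a vehicle can report two readings with the same PositionTime, the row numbering is not deterministic and a tie-breaker column would be needed.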
Answer 1: (score: 1)
OK, here is some code:
-- Sample data from the question, loaded into a table variable.
DECLARE @t TABLE
    (
      VehicleID INT ,
      PositionTime DATETIME ,
      Latitude DECIMAL(20, 10) ,
      Longitude DECIMAL(20, 10) ,
      Speed DECIMAL(20, 10)
    );
-- ISO 8601 literals are parsed the same way under any language or DATEFORMAT setting.
INSERT  INTO @t
VALUES  ( 1, '2013-11-20T18:09:27', 29.54608, -95.04444, 0.1 ),
        ( 1, '2013-11-20T18:47:35', 29.54608, -95.04444, 0 ),
        ( 1, '2013-11-20T20:34:45', 29.546105, -95.04442, 5 ),
        ( 1, '2013-11-20T20:46:44', 29.54607833, -95.04443167, 3 ),
        ( 1, '2013-11-20T21:01:44', 29.54606667, -95.04442833, 0 ),
        ( 1, '2013-11-20T21:16:43', 29.546095, -95.04443167, 0.1 ),
        ( 2, '2013-11-20T21:31:44', 29.54609, -95.04441, 5 ),
        ( 2, '2013-11-20T21:46:44', 29.54607667, -95.04443, 0 );

-- cte1: treat every non-moving reading as Speed = 0.
-- The question counts a speed of exactly 0.2 as moving, hence the strict "<".
WITH    cte1
          AS ( SELECT   VehicleID ,
                        PositionTime ,
                        Latitude ,
                        Longitude ,
                        CASE WHEN Speed < 0.2 THEN 0
                             ELSE Speed
                        END AS Speed
               FROM     @t
             ),
-- cte2: running sum of Speed per vehicle; it stays flat across a stationary run.
        cte2
          AS ( SELECT   * ,
                        SUM(Speed) OVER ( PARTITION BY VehicleID ORDER BY PositionTime ) AS s
               FROM     cte1
             ),
-- cte3: rows with the same (Speed, s) pair receive the same rank r.
        cte3
          AS ( SELECT   * ,
                        RANK() OVER ( PARTITION BY VehicleID ORDER BY Speed, s ) AS r
               FROM     cte2
             ),
-- cte4: collapse each rank into one row.
        cte4
          AS ( SELECT   VehicleID ,
                        MAX(PositionTime) AS PositionTime ,
                        AVG(Latitude) AS Latitude ,
                        AVG(Longitude) AS Longitude ,
                        MAX(Speed) AS Speed
               FROM     cte3
               GROUP BY VehicleID ,
                        r
             )
    -- Seconds elapsed since the previous normalized row of the same vehicle.
    SELECT  * ,
            DATEDIFF(ss,
                     LAG(PositionTime) OVER ( PARTITION BY VehicleID ORDER BY PositionTime ),
                     PositionTime) AS DiffInSeconds
    FROM    cte4
    ORDER BY VehicleID ,
            PositionTime;
Output:
VehicleID  PositionTime             Latitude       Longitude       Speed         DiffInSeconds
1          2013-11-20 18:47:35.000  29.5460800000  -95.0444400000  0.0000000000  NULL
1          2013-11-20 20:34:45.000  29.5461050000  -95.0444200000  5.0000000000  6430
1          2013-11-20 20:46:44.000  29.5460783300  -95.0444316700  3.0000000000  719
1          2013-11-20 21:16:43.000  29.5460808350  -95.0444300000  0.0000000000  1799
2          2013-11-20 21:31:44.000  29.5460900000  -95.0444100000  5.0000000000  NULL
2          2013-11-20 21:46:44.000  29.5460766700  -95.0444300000  0.0000000000  900
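Let's walk through it.
In cte1, I simply turn the Speed of every reading that counts as not moving (below 0.2 in the question) into 0.
In cte2, I take a running sum of the Speed values per vehicle, so consecutive rows where Speed = 0 share the same sum. The sum starts over for each VehicleID:
Speed         s
0.0000000000  0.0000000000
0.0000000000  0.0000000000
5.0000000000  5.0000000000
3.0000000000  8.0000000000
0.0000000000  8.0000000000
0.0000000000  8.0000000000
5.0000000000  5.0000000000
0.0000000000  5.0000000000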
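When SUM() OVER has an ORDER BY but no explicit frame, SQL Server uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Here is a small sketch of the same running sum with an explicit ROWS frame. On large inputs the ROWS form is commonly reported to be cheaper, though that is worth measuring against your own 30 million rows rather than taking on faith. The table variable @speeds is a made-up stand-in holding the zeroed speeds for the sample.

```sql
-- Hypothetical stand-in for the cte1 output (speeds below 0.2 already set to 0).
DECLARE @speeds TABLE
    (
      VehicleID INT ,
      PositionTime DATETIME ,
      Speed DECIMAL(20, 10)
    );
INSERT INTO @speeds
VALUES  ( 1, '2013-11-20T18:09:27', 0 ),
        ( 1, '2013-11-20T18:47:35', 0 ),
        ( 1, '2013-11-20T20:34:45', 5 ),
        ( 1, '2013-11-20T20:46:44', 3 ),
        ( 1, '2013-11-20T21:01:44', 0 ),
        ( 1, '2013-11-20T21:16:43', 0 ),
        ( 2, '2013-11-20T21:31:44', 5 ),
        ( 2, '2013-11-20T21:46:44', 0 );

-- Running sum per vehicle with an explicit ROWS frame.
SELECT  VehicleID ,
        PositionTime ,
        Speed ,
        SUM(Speed) OVER ( PARTITION BY VehicleID
                          ORDER BY PositionTime
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS s
FROM    @speeds
ORDER BY VehicleID ,
        PositionTime;
```

For the sample rows this should reproduce the s column shown above. The two frames only give different results when a vehicle has several rows with an identical PositionTime, because RANGE then adds up all of the tied rows at once.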
Then in cte3 I apply a ranking function over the combination of Speed and the running sum s. So I get:
VehicleID  Speed         s             r
1          0.0000000000  0.0000000000  1
1          0.0000000000  0.0000000000  1
1          0.0000000000  8.0000000000  3
1          0.0000000000  8.0000000000  3
1          3.0000000000  8.0000000000  5
1          5.0000000000  5.0000000000  6
2          0.0000000000  5.0000000000  1
2          5.0000000000  5.0000000000  2
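Notice that r is the same for consecutive rows whose Speed is 0, so you can group by VehicleID and r here. I do that in cte4 to get distinct rows, applying aggregates such as MAX on the date and AVG on lat and long.
Finally, the select from cte4 uses the LAG function to calculate the difference between the current row and the previous row.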
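The query above reports 0 as the Speed of a merged row, because MAX is taken over the zeroed values. The question also asks for the average speed of the merged records, so here is a hedged variation that carries the original value through as RawSpeed (a name invented for this sketch) and averages it in the grouping step. The remaining steps are unchanged.

```sql
DECLARE @t TABLE
    (
      VehicleID INT ,
      PositionTime DATETIME ,
      Latitude DECIMAL(20, 10) ,
      Longitude DECIMAL(20, 10) ,
      Speed DECIMAL(20, 10)
    );
INSERT  INTO @t
VALUES  ( 1, '2013-11-20T18:09:27', 29.54608, -95.04444, 0.1 ),
        ( 1, '2013-11-20T18:47:35', 29.54608, -95.04444, 0 ),
        ( 1, '2013-11-20T20:34:45', 29.546105, -95.04442, 5 ),
        ( 1, '2013-11-20T20:46:44', 29.54607833, -95.04443167, 3 ),
        ( 1, '2013-11-20T21:01:44', 29.54606667, -95.04442833, 0 ),
        ( 1, '2013-11-20T21:16:43', 29.546095, -95.04443167, 0.1 ),
        ( 2, '2013-11-20T21:31:44', 29.54609, -95.04441, 5 ),
        ( 2, '2013-11-20T21:46:44', 29.54607667, -95.04443, 0 );

WITH    cte1
          AS ( SELECT   VehicleID ,
                        PositionTime ,
                        Latitude ,
                        Longitude ,
                        Speed AS RawSpeed ,  -- assumption: keep the original reading
                        CASE WHEN Speed < 0.2 THEN 0
                             ELSE Speed
                        END AS Speed         -- zeroed copy, used only for grouping
               FROM     @t
             ),
        cte2
          AS ( SELECT   * ,
                        SUM(Speed) OVER ( PARTITION BY VehicleID ORDER BY PositionTime ) AS s
               FROM     cte1
             ),
        cte3
          AS ( SELECT   * ,
                        RANK() OVER ( PARTITION BY VehicleID ORDER BY Speed, s ) AS r
               FROM     cte2
             ),
        cte4
          AS ( SELECT   VehicleID ,
                        MAX(PositionTime) AS PositionTime ,
                        AVG(Latitude) AS Latitude ,
                        AVG(Longitude) AS Longitude ,
                        AVG(RawSpeed) AS Speed  -- average of the original speeds
               FROM     cte3
               GROUP BY VehicleID ,
                        r
             )
    SELECT  * ,
            DATEDIFF(ss,
                     LAG(PositionTime) OVER ( PARTITION BY VehicleID ORDER BY PositionTime ),
                     PositionTime) AS DiffInSeconds
    FROM    cte4
    ORDER BY VehicleID ,
            PositionTime;
```

For the sample, both merged rows of vehicle 1 should come out with a Speed of 0.05 (the average of 0.1 and 0), while the moving rows keep their own speed because each is a group of one.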