Condensing similar rows that occur in groups while maintaining order

Time: 2012-06-13 23:43:27

Tags: sql-server tsql sql-server-2008-r2

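I have a SQL table containing GPS coordinates from devices, updated every n minutes (the devices are installed in vehicles). Given the nature of GPS, many of the entries are very similar, but as far as the server is concerned they are completely different. I can easily match things up (to within ~3.6' or 36') using CAST(lat as decimal(7,4)).

I would like to be able to take a result set and condense the roughly duplicate entries, but still maintain the time-based order. Here is an example: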
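A minimal sketch of what that rounding does, using a few literal latitudes (the derived table `v` and its `Lat` column are made up for illustration). Converting a decimal to a smaller scale rounds rather than truncates in SQL Server, so the first two values land in the same bucket. Two readings that straddle a rounding boundary can still end up in different buckets even though they are only 0.00001 apart, as the last two rows show.

```sql
-- Illustration only: v and Lat are hypothetical names, not part of the real schema.
SELECT Lat,
       NormLat = CAST(Lat AS DECIMAL(7,4))   -- rounds to 4 decimal places
FROM (VALUES (31.12345),   -- rounds to 31.1235
             (31.12346),   -- rounds to 31.1235 (same bucket as the row above)
             (31.12455),   -- rounds to 31.1246
             (31.12344)    -- rounds to 31.1234 (different bucket from 31.12345)
     ) AS v(Lat);
```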

Row    Lat         Lng        vel Hdg Time
01    31.12345    -88.12345   00  00  12-4-21 01:45:00
02    31.12346    -88.12345   00  00  12-4-21 01:46:00
03    31.12455    -88.12410   10  01  12-4-21 01:47:00
04    31.12495    -88.12480   17  01  12-4-21 01:48:00
05    31.12532    -88.12560   22  01  12-4-21 01:49:00
06    31.12567    -88.12608   25  02  12-4-21 01:50:00
07    31.12638    -88.12672   24  02  12-4-21 01:51:00
08    31.12689    -88.12722   19  02  12-4-21 01:52:00
09    31.12345    -88.12345   00  00  12-4-21 01:53:00
10    31.12346    -88.12346   00  00  12-4-21 01:54:00
11    31.12347    -88.12345   00  00  12-4-21 01:55:00
12    31.12346    -88.12346   00  00  12-4-21 01:56:00
13    31.12689    -88.12788   10  40  12-4-21 01:57:00
14    31.12604    -88.12691   13  39  12-4-21 01:58:00
15    31.12572    -88.12603   15  39  12-4-21 01:59:00

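The end result I want is for rows 1 and 2 to be condensed into a single row, and rows 9 through 12 into a single row, containing AVG(Lat), AVG(Lng), and MIN(Time).

Given the data above, this is the result set I would like to receive:

Row    Lat          Lng          vel Hdg Time
01    31.123455    -88.12345     00  00  12-4-21 01:45:00
02    31.12455     -88.12410     10  01  12-4-21 01:47:00
03    31.12495     -88.12480     17  01  12-4-21 01:48:00
04    31.12532     -88.12560     22  01  12-4-21 01:49:00
05    31.12567     -88.12608     25  02  12-4-21 01:50:00
06    31.12638     -88.12672     24  02  12-4-21 01:51:00
07    31.12689     -88.12722     19  02  12-4-21 01:52:00
08    31.12346     -88.123455    00  00  12-4-21 01:53:00
09    31.12689     -88.12788     10  40  12-4-21 01:57:00
10    31.12604     -88.12691     13  39  12-4-21 01:58:00
11    31.12572     -88.12603     15  39  12-4-21 01:59:00

The boundary between groups would be movement: either velocity > 0 or the GPS coordinates changing by more than some amount x. In this case, x is .0001. As noted below, the problem is that multiple stops at a given coordinate (at different points in time) get lumped into a single stop. If I visit coordinate x today at 4 pm, tomorrow at 8 am, and then again at 6 pm, the only one I see is tomorrow at 6 pm (in the case of MAX(Time)) or today at 4 pm (in the case of MIN(Time)).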

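If velocity is 0, then heading is 0 as well. However, rows 1 and 2, and rows 9 through 12, have coordinates that are similar and identical enough (i.e. once rounded to 4 decimal places).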

I have a query that does just this:

[query missing from the source]

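In other words, if I travel from point A to point B, stop for 30 minutes (30 entries, 1 per minute), then go to point C, stop for 20 minutes, then return to point B and stop for 20 minutes before heading to point D, I would like to be able to see two separate stops at point B.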
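The lumping described here is easy to reproduce. The following is only a sketch, assuming a hypothetical table variable `@gps` shaped like the example rows: it groups purely on the rounded coordinates, velocity, and heading, so the two visits to the same spot collapse into one row whose time range spans the drive in between.

```sql
-- Hypothetical table variable; column names mirror the example data.
DECLARE @gps TABLE
(
  Latitude  DECIMAL(8,5),
  Longitude DECIMAL(8,5),
  Velocity  TINYINT,
  Heading   TINYINT,
  [Time]    SMALLDATETIME
);

INSERT @gps VALUES
 (31.12345,-88.12345, 0, 0,'2012-04-21 01:45:00'),  -- first stop
 (31.12346,-88.12345, 0, 0,'2012-04-21 01:46:00'),
 (31.12455,-88.12410,10, 1,'2012-04-21 01:47:00'),  -- moving
 (31.12345,-88.12345, 0, 0,'2012-04-21 01:53:00'),  -- second stop, same place
 (31.12346,-88.12346, 0, 0,'2012-04-21 01:54:00');

-- Grouping on rounded coordinates alone ignores time, so both stops
-- fall into one group: 4 samples from 01:45 to 01:54.
SELECT Latitude = AVG(Latitude), Longitude = AVG(Longitude),
       MinTime = MIN([Time]), MaxTime = MAX([Time]), Samples = COUNT(*)
FROM @gps
GROUP BY CAST(Latitude AS DECIMAL(7,4)),
         CAST(Longitude AS DECIMAL(7,4)),
         Velocity, Heading
ORDER BY MIN([Time]);
```

The query returns two rows instead of the three that are wanted, which is why the grouping also has to take the position in the time sequence into account.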

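Here is some actual data from my database, sanitized to protect the innocent, or to pin the blame on someone in northeast Alabama.

[data listing missing from the source]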

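You will notice that row 4 and the last row have 527 and 168 entries respectively, and that they span 2 days. These entries come from just 1 device, and are where the device stopped at the same location many times on multiple occasions.

Here is some zipped CSV data: sample

What I ended up doing

I made a few small modifications to the query Aaron Bertrand provided, as follows: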

SELECT Geography::Point(AVG(dbo.GPSEntries.Latitude), 
                        AVG(dbo.GPSEntries.Longitude),
                        4326 ) as Location,
       dbo.GPSEntries.Velocity,
       dbo.GPSEntries.Heading,
       MAX(dbo.GPSEntries.Time) as maxTime,
       MIN(dbo.GPSEntries.Time) as minTime,
       AVG(dbo.RFDatas.RSSI) as avgRSSI,
       COUNT(1) as samples

FROM dbo.GPSEntries
     INNER JOIN
         dbo.Reports ON
             dbo.GPSEntries.Report_Id = dbo.Reports.Id 
     INNER JOIN
         dbo.RFDatas ON
             dbo.GPSEntries.Report_Id = dbo.RFDatas.Report_Id

GROUP BY CAST(Latitude as Decimal(7,4)),
         CAST(Longitude as Decimal(7,4)),
         Velocity,
         Heading

ORDER BY MAX(Time)

1 Answer:

Answer 0: (score: 1)

Here is some sample data in tempdb:

USE tempdb;
GO

CREATE TABLE dbo.GPSEntries
( 
  Latitude DECIMAL(8,5), 
  Longitude DECIMAL(8,5), 
  Velocity TINYINT, 
  Heading TINYINT, 
  [Time] SMALLDATETIME
);

INSERT dbo.GPSEntries VALUES
 (31.12345,-88.12345,00,00,'2012-04-21 01:45:00'),
 (31.12346,-88.12345,00,00,'2012-04-21 01:46:00'),
 (31.12455,-88.12410,10,01,'2012-04-21 01:47:00'),
 (31.12495,-88.12480,17,01,'2012-04-21 01:48:00'),
 (31.12532,-88.12560,22,01,'2012-04-21 01:49:00'),
 (31.12567,-88.12608,25,02,'2012-04-21 01:50:00'),
 (31.12638,-88.12672,24,02,'2012-04-21 01:51:00'),
 (31.12689,-88.12722,19,02,'2012-04-21 01:52:00'),
 (31.12345,-88.12345,00,00,'2012-04-21 01:53:00'),
 (31.12346,-88.12346,00,00,'2012-04-21 01:54:00'),
 (31.12347,-88.12345,00,00,'2012-04-21 01:55:00'),
 (31.12346,-88.12346,00,00,'2012-04-21 01:56:00'),
 (31.12689,-88.12788,10,40,'2012-04-21 01:57:00'),
 (31.12604,-88.12691,13,39,'2012-04-21 01:58:00'),
 (31.12572,-88.12603,15,39,'2012-04-21 01:59:00');

My attempt at a query to satisfy this:

;WITH d AS
(
    SELECT Time, Latitude, Longitude, Velocity, Heading, 
        NormLat = CONVERT(DECIMAL(7,4), Latitude), 
        NormLong = CONVERT(DECIMAL(7,4), Longitude),
        TimeRN = ROW_NUMBER() OVER (ORDER BY [Time])
    FROM dbo.GPSEntries
    -- /* you probably want filters:
    -- WHERE DeviceID = @SomeDeviceID
    -- AND [Time] >= @SomeStartDate
    -- AND [Time] <  DATEADD(DAY, 1, @SomeEndDate)
    -- /* also your sample CSV file had lots of duplicates, so:
    GROUP BY Time, Latitude, Longitude, Velocity, Heading
),
y AS (
  SELECT MinTime = MIN(Time), MaxTime = MAX(Time), Latitude = AVG(Latitude), 
    Longitude = AVG(Longitude), [RowCount] = COUNT(*) FROM 
    (
      SELECT Time, Latitude, Longitude, GroupNumber = 
      (
        SELECT MIN(d2.TimeRN) 
         FROM d AS d2 WHERE d2.TimeRN >= d.TimeRN 
         AND NOT EXISTS 
         (
           SELECT 1 FROM d AS d3
           WHERE d2.NormLat = d.NormLat
           AND d2.NormLong = d.NormLong
         )
       )
       FROM d
    ) AS x GROUP BY GroupNumber
)
SELECT [Row] = ROW_NUMBER() OVER (ORDER BY y.MinTime),
  y.Latitude, y.Longitude, d.Velocity, d.Heading, 
  y.MinTime, y.MaxTime, y.[RowCount]
FROM y INNER JOIN d ON y.MinTime = d.[Time]
ORDER BY y.MinTime;

Results:

Row Latitude  Longitude  Velocity Heading MinTime          MaxTime          RowCount
---|---------|----------|--------|-------|----------------|----------------|--------
1   31.123455 -88.123450   0        0     2012-04-21 01:45 2012-04-21 01:46   2
2   31.124550 -88.124100   10       1     2012-04-21 01:47 2012-04-21 01:47   1
3   31.124950 -88.124800   17       1     2012-04-21 01:48 2012-04-21 01:48   1
4   31.125320 -88.125600   22       1     2012-04-21 01:49 2012-04-21 01:49   1
5   31.125670 -88.126080   25       2     2012-04-21 01:50 2012-04-21 01:50   1
6   31.126380 -88.126720   24       2     2012-04-21 01:51 2012-04-21 01:51   1
7   31.126890 -88.127220   19       2     2012-04-21 01:52 2012-04-21 01:52   1
8   31.123460 -88.123455   0        0     2012-04-21 01:53 2012-04-21 01:56   4
9   31.126890 -88.127880   10       40    2012-04-21 01:57 2012-04-21 01:57   1
10  31.126040 -88.126910   13       39    2012-04-21 01:58 2012-04-21 01:58   1
11  31.125720 -88.126030   15       39    2012-04-21 01:59 2012-04-21 01:59   1
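
For comparison, the same grouping can also be sketched with the row-number-difference ("gaps and islands") technique, which runs on SQL Server 2008 R2 because it only needs ROW_NUMBER(). This is not the query from the answer, just an alternative under stated assumptions: it reads the dbo.GPSEntries sample table created above, and it assumes [Time] is unique within the rows being grouped (one device, no duplicate rows). The CTE name `g` and the column name `Island` are made up for illustration.

```sql
-- Sketch only: assumes the dbo.GPSEntries sample table from the answer
-- and that [Time] is unique within the rows being grouped.
;WITH d AS
(
    SELECT [Time], Latitude, Longitude,
        NormLat  = CONVERT(DECIMAL(7,4), Latitude),
        NormLong = CONVERT(DECIMAL(7,4), Longitude)
    FROM dbo.GPSEntries
),
g AS
(
    -- The overall row number minus the row number within one rounded
    -- coordinate stays constant while consecutive rows share that
    -- coordinate, and changes once other rows come in between.
    SELECT [Time], Latitude, Longitude, NormLat, NormLong,
        Island = ROW_NUMBER() OVER (ORDER BY [Time])
               - ROW_NUMBER() OVER (PARTITION BY NormLat, NormLong
                                    ORDER BY [Time])
    FROM d
)
SELECT Latitude = AVG(Latitude), Longitude = AVG(Longitude),
       MinTime = MIN([Time]), MaxTime = MAX([Time]),
       [RowCount] = COUNT(*)
FROM g
GROUP BY NormLat, NormLong, Island
ORDER BY MIN([Time]);
```

On the 15 sample rows this produces 11 groups, with the two stops at the same coordinate kept apart (2 rows and 4 rows). Velocity and Heading are left out of the sketch; they could be joined back on MinTime the way the query above does.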