Retrieve the first and last record of each group

Time: 2017-03-13 13:37:26

Tags: sql google-bigquery


I have a set of GPS points for the trips made by each vehicle. I am trying to retrieve the first and last record of each trip.

Data:

  VehicleId       TripId          Latitude            Longitude
    121             131             33.645              -84.424
    121             131             33.452              -84.409
    121             131             33.635              -84.424
    121             131             35.717              -85.121
    121             131             35.111              -85.111

From the dataset above, I need a result set containing the first and last point of each trip.

  VehicleId       TripId          StartLat            StartLong          EndLat          EndLong
    121             131             33.645              -84.424         35.111          -85.111

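I tried the following query, but I get the error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN". Any help would be greatly appreciated.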

    SELECT
      a.VehicleId,
      a.Tripid,
      a.Latitude AS StartLat,
      a.Longitude AS StartLong,
      b.Latitude AS EndLat,
      b.Longitude AS EndLong,
      a.DateTime
    FROM
      `Vehicles` AS a
    JOIN
      `Vehicles` AS b
    ON
      a.VehicleId = b.VehicleId
      AND a.Tripid = b.Tripid
    WHERE
      a.DateTime IN (
      SELECT
        MIN(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = a.VehicleId
        AND Tripid = a.Tripid)
      AND b.DateTime IN (
      SELECT
        MAX(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = a.VehicleId
        AND Tripid = a.Tripid)
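
For reference, the rewrite that the error message hints at can be sketched as a join against a per-trip aggregate instead of a correlated subquery. This is only a minimal sketch: it assumes `DateTime` is unique within each trip, and the `bounds` CTE name and the sample timestamps are made up for illustration (the data shown above has no `DateTime` column).

```sql
#standardSQL
WITH Vehicles AS (
  -- Sample rows; the DateTime values are assumed for illustration only.
  SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
  SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
  SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
  SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
  SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
),
bounds AS (
  -- One row per trip with its earliest and latest timestamp.
  SELECT
    VehicleId,
    TripId,
    MIN(DateTime) AS first_ts,
    MAX(DateTime) AS last_ts
  FROM Vehicles
  GROUP BY VehicleId, TripId
)
SELECT
  t.VehicleId,
  t.TripId,
  a.Latitude AS StartLat,
  a.Longitude AS StartLong,
  b.Latitude AS EndLat,
  b.Longitude AS EndLong
FROM bounds AS t
JOIN Vehicles AS a  -- row at the earliest timestamp of the trip
  ON a.VehicleId = t.VehicleId
  AND a.TripId = t.TripId
  AND a.DateTime = t.first_ts
JOIN Vehicles AS b  -- row at the latest timestamp of the trip
  ON b.VehicleId = t.VehicleId
  AND b.TripId = t.TripId
  AND b.DateTime = t.last_ts;
```

If several rows share a trip's earliest or latest timestamp, this join returns more than one row for that trip, so the uniqueness assumption matters.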

3 Answers:

Answer 0: (score: 1)

The first thing that comes to mind is row_number():

select v.*
from (select v.*,
             row_number() over (partition by vehicleid, tripid order by datetime asc) as seqnum_asc,
             row_number() over (partition by vehicleid, tripid order by datetime desc) as seqnum_desc
      from vehicles v
     ) v
where seqnum_asc = 1 or seqnum_desc = 1;

If you want them in the same row:

select vehicleid, tripid,
       min(datetime) as start_datetime, max(datetime) as end_datetime,
       min(case when seqnum_asc = 1 then latitude end) as start_latitude,
       min(case when seqnum_asc = 1 then longitude end) as start_longitude,
       min(case when seqnum_desc = 1 then latitude end) as end_latitude,
       min(case when seqnum_desc = 1 then longitude end) as end_longitude
from (select v.*,
             row_number() over (partition by vehicleid, tripid order by datetime asc) as seqnum_asc,
             row_number() over (partition by vehicleid, tripid order by datetime desc) as seqnum_desc
      from vehicles v
     ) v
where seqnum_asc = 1 or seqnum_desc = 1
group by vehicleid, tripid;

Answer 1: (score: 1)

Here is another option that uses aggregate functions:

#standardSQL
WITH Vehicles AS (
 SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
 SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
 SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
 SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
 SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
)
SELECT
  VehicleId,
  TripId,
  ARRAY_AGG(STRUCT(Latitude, Longitude)
            ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
  ARRAY_AGG(STRUCT(Latitude, Longitude)
            ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
FROM Vehicles
GROUP BY
  VehicleId,
  TripId;
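
If the four flat columns from the question are preferred over two STRUCT columns, the struct fields can be unpacked in an outer query. The following is a sketch under the same sample data; the CTE name `trips` and the field aliases `lat` and `lng` are illustrative choices, not anything required by BigQuery.

```sql
#standardSQL
WITH Vehicles AS (
  SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
  SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
  SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
  SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
  SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
),
trips AS (
  -- Same aggregation as above, with explicit field names on the structs.
  SELECT
    VehicleId,
    TripId,
    ARRAY_AGG(STRUCT(Latitude AS lat, Longitude AS lng)
              ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
    ARRAY_AGG(STRUCT(Latitude AS lat, Longitude AS lng)
              ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
  FROM Vehicles
  GROUP BY VehicleId, TripId
)
SELECT
  VehicleId,
  TripId,
  start_location.lat AS StartLat,
  start_location.lng AS StartLong,
  end_location.lat AS EndLat,
  end_location.lng AS EndLong
FROM trips;
```

Because `LIMIT 1` keeps only one element per array, this returns exactly one row per trip even when timestamps tie, although which of the tied rows is kept is then not determined.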

Answer 2: (score: 1)

With SQL Server 2012, you can also use:

SELECT DISTINCT VehicleId, TripId,
    FIRST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime]) AS StartLatitude,
    LAST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS EndLatitude,
    FIRST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime]) AS StartLongitude,
    LAST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS EndLongitude
FROM    dbo.Vehicles
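
Since the question is tagged google-bigquery, the same idea might look roughly like this in BigQuery standard SQL, which also provides FIRST_VALUE and LAST_VALUE but does not use square brackets for quoting identifiers. This is a sketch only: the window name `w` and the sample timestamps are assumptions for illustration. The frame is spelled out because, with an ORDER BY and no explicit frame, the default window ends at the current row, so LAST_VALUE would simply return the current row's value.

```sql
#standardSQL
WITH Vehicles AS (
  -- Sample rows; the DateTime values are assumed for illustration only.
  SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
  SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
  SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
  SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
  SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
)
SELECT DISTINCT
  VehicleId,
  TripId,
  FIRST_VALUE(Latitude) OVER w AS StartLat,
  FIRST_VALUE(Longitude) OVER w AS StartLong,
  LAST_VALUE(Latitude) OVER w AS EndLat,
  LAST_VALUE(Longitude) OVER w AS EndLong
FROM Vehicles
WINDOW w AS (
  PARTITION BY VehicleId, TripId
  ORDER BY DateTime
  -- Cover the whole partition so every row sees the same first and last values.
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
```

Every row of a trip produces the same values over this window, and DISTINCT then collapses them to one row per trip.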