Reverse geocoding: How to determine the closest city to a point (latitude and longitude) with BigQuery SQL?

Time: 2018-12-08 00:10:23

Tags: sql google-bigquery gis geocoding reverse-geocoding

I have a large collection of points, and I want to find the closest city to each one. How can I do this with BigQuery?

2 answers:

Answer 0: (score: 3)

This is the best-performing query we have come up with so far:

WITH a AS (
  # a table with points around the world
  SELECT * FROM UNNEST([ST_GEOGPOINT(-70, -33), ST_GEOGPOINT(-122,37), ST_GEOGPOINT(151,-33)]) my_point
), b AS (
  # any table with cities world locations
  SELECT *, ST_GEOGPOINT(lon,lat) latlon_geo
  FROM `fh-bigquery.geocode.201806_geolite2_latlon_redux` 
)

SELECT my_point, city_name, subdivision_1_name, country_name, continent_name
FROM (
  SELECT loc.*, my_point
  FROM (
    SELECT ST_ASTEXT(my_point) my_point, ANY_VALUE(my_point) geop
      , ARRAY_AGG( # get the closest city
           STRUCT(city_name, subdivision_1_name, country_name, continent_name) 
           ORDER BY ST_DISTANCE(my_point, b.latlon_geo) LIMIT 1
        )[SAFE_OFFSET(0)] loc
    FROM a, b 
    WHERE ST_DWITHIN(my_point, b.latlon_geo, 100000)  # filter to only close cities
    GROUP BY my_point
  )
)
GROUP BY 1,2,3,4,5
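
To see what the ST_DWITHIN filter implies, here is a minimal sketch that uses inline data instead of the public table. The `pts` and `cities` names and the single city row with its approximate coordinates are assumptions for illustration only. The distance argument of ST_DWITHIN is in meters, so a point with no city within 100,000 m produces no output row at all, rather than a row with empty city columns.

```sql
#standardSQL
WITH pts AS (
  # one point near a known city, one in the open Pacific
  SELECT * FROM UNNEST([ST_GEOGPOINT(-122, 37), ST_GEOGPOINT(-140, 0)]) my_point
), cities AS (
  # hypothetical one-row cities table; coordinates are approximate
  SELECT 'San Jose' AS city_name, ST_GEOGPOINT(-121.89, 37.34) AS geo
)
SELECT
  ST_ASTEXT(my_point) AS my_point,
  city_name,
  ROUND(ST_DISTANCE(my_point, geo) / 1000, 1) AS distance_km  # ST_DISTANCE returns meters
FROM pts, cities
WHERE ST_DWITHIN(my_point, geo, 100000)  # same 100 km cutoff as the query above
```

Under these assumptions only the first point should come back, because the mid-ocean point has no city inside the radius. Widening the radius keeps more points in the result but makes the cross join more expensive.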


Answer 1: (score: 0)

I have a lot of points ...

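Felipe's solution is perfect in many ways, but I found that when you really have more than just a few points to find the closest city for, and you cannot limit the search to a distance of about 60 miles (the 100,000 m radius in its ST_DWITHIN filter), the solution below works better: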

#standardSQL
WITH a AS (
  # a table with points around the world
  SELECT ST_GEOGPOINT(lon,lat) my_point
  FROM `fh-bigquery.geocode.201806_geolite2_latlon_redux`  
), b AS (
  # any table with cities world locations
  SELECT *, ST_GEOGPOINT(lon,lat) latlon_geo, ST_ASTEXT(ST_GEOGPOINT(lon,lat)) hsh 
  FROM `fh-bigquery.geocode.201806_geolite2_latlon_redux` 
)
SELECT AS VALUE 
  ARRAY_AGG(
    STRUCT(my_point, city_name, subdivision_1_name, country_name, continent_name) 
    LIMIT 1
  )[OFFSET(0)]
FROM (
  SELECT my_point, ST_ASTEXT(closest) hsh 
  FROM a, (SELECT ST_UNION_AGG(latlon_geo) arr FROM b),
  UNNEST([ST_CLOSESTPOINT(arr, my_point)]) closest
)
JOIN b 
USING(hsh)
GROUP BY ST_ASTEXT(my_point)

Notes:

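  • I am using the ST_CLOSESTPOINT function.
  • To mimic the "not just a few points" case, I use the same table for the points as for b, so there are 100K points to find the closest city for, and there is no limit on how near or far the matched city can be. In this scenario the query from the original answer ends with the well-known "Query exceeded resource limits" error. Outside this scenario, that query shows much better, if not the best, performance, as that answer rightly states.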
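
The ST_CLOSESTPOINT step can be tried in isolation with a small sketch. The three inline cities and their approximate coordinates, as well as the `cities`, `target` and `closest` names, are assumptions standing in for the real tables. As in the query above, the closest point is matched back to its city row by comparing WKT text.

```sql
#standardSQL
WITH cities AS (
  # hypothetical stand-in for the cities table; coordinates are approximate
  SELECT 'Santiago' AS city_name, ST_GEOGPOINT(-70.65, -33.45) AS geo UNION ALL
  SELECT 'San Francisco', ST_GEOGPOINT(-122.42, 37.77) UNION ALL
  SELECT 'Sydney', ST_GEOGPOINT(151.21, -33.87)
), target AS (
  # the point to reverse-geocode
  SELECT ST_GEOGPOINT(-122, 37) AS my_point
), closest AS (
  # merge all cities into one geography, then take the point on it nearest to my_point
  SELECT my_point, ST_CLOSESTPOINT(arr, my_point) AS closest_geo
  FROM target, (SELECT ST_UNION_AGG(geo) AS arr FROM cities)
)
SELECT ST_ASTEXT(my_point) AS my_point, city_name
FROM closest
JOIN cities
ON ST_ASTEXT(closest_geo) = ST_ASTEXT(geo)  # recover the city attributes via WKT text
```

With this made-up data the query should return San Francisco for the target point. Unlike the radius-filtered approach, every input point gets a match no matter how far away the nearest city is.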