How to calculate the nearest-neighbor distance for 10000 points in a table

Time: 2018-06-22 15:57:17

Tags: postgresql postgis spatial knn spatial-query

I am using PostgreSQL with the PostGIS extension.

I can compute the distance from every point to one fixed point with this query:

SELECT st_distance(geom, 'SRID=4326;POINT(12.601828337172 50.5173393068512)'::geometry) as d
FROM pointst1
ORDER BY d 

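But I don't want to compare against a single fixed point; I want to compare against a whole column of points. I would also like to do this with some kind of index so that it is computationally cheap, rather than a 10000x10000 cross join of the table with itself.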

Table definition:

create table pointst1
(
  id   integer not null
    constraint pointst1_id_pk
    primary key,
  geom geometry(Point, 4326)
);

create unique index pointst1_id_uindex
  on pointst1 (id);

create index geomidx
  on pointst1 (geom);

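Edit: improved query (it compares each of the 10000 points with its nearest neighbor, but the result is the point itself at distance 0 rather than the next nearest point):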

select points.*,
  p1.id as p1_id,
  ST_Distance(geography(p1.geom), geography(points.geom)) as distance
from
  (select distinct on(p2.geom)*
  from pointst1 p2
  where p2.id is not null) as points
cross join lateral
  (select id, geom
  from pointst1
  order  by points.geom <-> geom
           limit 1) as p1;

1 answer:

Answer 0: (score: 1)

Your query already computes the distance from the given geometry to every record in the table pointst1.

Consider these values:

INSERT INTO pointst1 VALUES (1,'SRID=4326;POINT(16.19 48.21)'),
                            (2,'SRID=4326;POINT(18.96 47.50)'),
                            (3,'SRID=4326;POINT(13.47 52.52)'),
                            (4,'SRID=4326;POINT(-3.70 40.39)');

If you run your query, it already computes the distance to every point in the table:

SELECT ST_Distance(geom, 'SRID=4326;POINT(12.6018 50.5173)'::geometry) as d
FROM pointst1
ORDER BY d

        d         
------------------
  2.1827914536208
 4.26600662563949
 7.03781262396208
 19.1914274750473
(4 rows)
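
Note that these values are in degrees, because ST_Distance on geometry works in the units of the coordinate system. If you want meters, one option is to cast both sides to geography, as your improved query already does. A minimal sketch, assuming geom really holds lon/lat coordinates in SRID 4326 (the alias d_meters is just an illustrative name):

```sql
-- Sketch: same query as above, but cast to geography so the
-- result is in meters instead of degrees.
-- Assumes geom stores lon/lat coordinates in SRID 4326.
SELECT id,
       ST_Distance(geom::geography,
                   'SRID=4326;POINT(12.6018 50.5173)'::geography) AS d_meters
FROM pointst1
ORDER BY d_meters;
```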

Change the index to GiST, which is the index type best suited to geometry data:

-- the existing index named geomidx has to be dropped first
drop index if exists geomidx;
create index geomidx on pointst1 using GIST (geom);

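Note that the index will not speed up your query, because you are doing a full scan of the table. However, once you start filtering in the WHERE clause, you may see some improvement.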
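
A k-nearest-neighbor lookup is another case where the GiST index can help: ordering by the <-> distance operator together with a LIMIT lets the planner walk the index instead of sorting every row. A rough sketch, reusing the fixed point from above (the LIMIT of 3 is an arbitrary choice, and on a table this small the planner may still pick a sequential scan, so check the plan on your real data):

```sql
-- Sketch: ask the planner how it would fetch the 3 points closest
-- to a fixed location. With a GiST index on geom, the ORDER BY ... <->
-- expression can be answered by an index scan.
EXPLAIN
SELECT id,
       geom <-> 'SRID=4326;POINT(12.6018 50.5173)'::geometry AS d
FROM pointst1
ORDER BY geom <-> 'SRID=4326;POINT(12.6018 50.5173)'::geometry
LIMIT 3;
```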

Edit:

WITH j AS (SELECT id AS id2, geom AS geom2 FROM pointst1) 
SELECT id,j.id2,ST_Distance(geom, j.geom2) AS d
FROM pointst1,j
WHERE id <> j.id2
ORDER BY id,id2  

 id | id2 |        d         
----+-----+------------------
  1 |   2 | 2.85954541841881
  1 |   3 |  5.0965184194703
  1 |   4 | 21.3720495039666
  2 |   1 | 2.85954541841881
  2 |   3 | 7.43911957156222
  2 |   4 | 23.7492673571207
  3 |   1 |  5.0965184194703
  3 |   2 | 7.43911957156222
  3 |   4 | 21.0225069865609
  4 |   1 | 21.3720495039666
  4 |   2 | 23.7492673571207
  4 |   3 | 21.0225069865609
(12 rows)
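
If you only need the nearest neighbor of each point rather than all pairs, one possible approach is to keep the lateral join from your improved query and exclude the point itself inside the subquery. This is a sketch, not a tuned solution; the aliases a, b and nn are just illustrative names, and it assumes the GiST index on geom exists so that the <-> ordering can use it:

```sql
-- Sketch: nearest neighbor per point, skipping the point itself.
-- For every row a, the lateral subquery orders the other rows by
-- distance to a.geom and keeps only the closest one.
SELECT a.id,
       nn.id AS nearest_id,
       ST_Distance(a.geom, nn.geom) AS d
FROM pointst1 AS a
CROSS JOIN LATERAL (
  SELECT b.id, b.geom
  FROM pointst1 AS b
  WHERE b.id <> a.id          -- exclude the point itself (distance 0)
  ORDER BY a.geom <-> b.geom  -- KNN ordering
  LIMIT 1
) AS nn
ORDER BY a.id;
```

Reading the smallest distance per id off the pairwise result above, this should pair 1 with 2, 2 with 1, 3 with 1 and 4 with 3 for the four sample rows.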

Removing the duplicate distances:

SELECT DISTINCT ON(d) * FROM (
WITH j AS (SELECT id AS id2, geom AS geom2 FROM pointst1) 
SELECT id,j.id2,ST_Distance(geom, j.geom2) AS d
FROM pointst1,j
WHERE id <> j.id2
ORDER BY id,id2) AS j

 id | id2 |        d         
----+-----+------------------
  1 |   2 | 2.85954541841881
  3 |   1 |  5.0965184194703
  3 |   2 | 7.43911957156222
  4 |   3 | 21.0225069865609
  4 |   1 | 21.3720495039666
  2 |   4 | 23.7492673571207
(6 rows)
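
Be aware that DISTINCT ON (d) would also drop a pair that happens to have exactly the same distance as a different pair. An alternative sketch that keeps each unordered pair exactly once is to join the table with itself and compare the ids with < instead of <> (the aliases a and b are just illustrative names):

```sql
-- Sketch: one row per unordered pair of points.
-- a.id < b.id keeps (1,2) but not (2,1), and never pairs a point
-- with itself, so no DISTINCT is needed.
SELECT a.id,
       b.id AS id2,
       ST_Distance(a.geom, b.geom) AS d
FROM pointst1 AS a
JOIN pointst1 AS b ON a.id < b.id
ORDER BY d;
```

For the four sample rows this returns six rows, one per pair. It is still a quadratic self-join, so it does not scale well to 10000 points.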