PostGIS: an optimized way to find intersections between polygons and circles

Date: 2018-09-14 05:23:01

Tags: postgis

I am trying to use PostGIS to find intersections between incidents (polygons) and watch zones (circles: a point plus a radius). The baseline data will be over 10,000 polygons and 500,000 circles. Also, I am quite new to PostGIS.

I have tried a few things, but execution takes a long time. Can anyone suggest optimizations or a better approach using only PostGIS? Here is what I have tried:

1. Using the geometry data type: I stored the incidents and watch zones as geometry, created GiST indexes on them, and used ST_DWithin to find the intersections.

With 1 incident and 500,000 watch zones, the query took about 6.750 s. This is the best time of my attempts, but the problem is that my radius is in meters, while ST_DWithin on the geometry type expects the distance in SRID units (degrees for SRID 4326). I could not figure out this conversion.

CREATE TABLE incident (
 incident_id SERIAL NOT NULL, 
 incident_name VARCHAR(20), 
 incident_span GEOMETRY(POLYGON, 4326), 
 CONSTRAINT incident_id PRIMARY KEY (incident_id)
);
CREATE TABLE watchzones (
 id SERIAL NOT NULL, 
 date_created timestamp with time zone DEFAULT now(), 
 latitude NUMERIC(10, 7) DEFAULT NULL, 
 Longitude NUMERIC(10, 7) DEFAULT NULL, 
 radius integer, 
 position GEOMETRY(POINT, 4326), 
 CONSTRAINT id PRIMARY KEY (id)
);

CREATE INDEX ix_spatial_geom on watchzones using gist(position);
CREATE INDEX ix_spatial_geom_1 on incident using gist(incident_span);


Insert into incident values (
   1, 
   'test', 
   ST_GeomFromText('POLYGON((152.945470916 -29.212227933,152.942130026 -29.213431145,152.939345911 -29.2125423759999,152.935144791 -29.21454003,152.933185494 -29.2135838469999,152.929481762 -29.216065516,152.929698621 -29.217402937,152.927245999 
-29.219576,152.921539 -29.217676,152.918487996 -29.2113786959999,152.919254355 -29.206029929,152.919692387 -29.2027824419999,152.936020197 -29.207567346,152.944901258 -29.207729953,152.945470916 
-29.212227933))', 
     4326
     )
     );

insert into watchzones  
  SELECT generate_series(1, 500000) AS id, 
         now(), 
         -29.21073, 
         152.93322, 
         '50', 
         ST_GeomFromText('POINT( 152.93322 -29.21073)', 4326);


explain analyze SELECT wz.id, 
       i.incident_id 
FROM watchzones wz, 
     incident i 
WHERE ST_DWithin(i.incident_span, wz.position, wz.radius);

Nested Loop  (cost=0.14..227467.00 rows=42 width=8) (actual time=0.142..1506.476 rows=500000 loops=1)
  ->  Seq Scan on watchzones wz  (cost=0.00..11173.00 rows=500000 width=40) (actual time=0.109..47.822 rows=500000 loops=1)
  ->  Index Scan using ix_spatial_geom_1 on incident i  (cost=0.14..0.42 rows=1 width=284) (actual time=0.002..0.002 rows=1 loops=500000)
        Index Cond: (incident_span && st_expand(wz."position", (wz.radius)::double precision))
        Filter: ((wz."position" && st_expand(incident_span, (wz.radius)::double precision)) AND _st_dwithin(incident_span, wz."position", (wz.radius)::double precision))
Planning time: 0.150 ms
Execution time: 1523.312 ms
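For the meters-vs-degrees problem, two sketches (not from the original post) that keep the input in meters: cast to geography so ST_DWithin interprets the third argument as meters, or transform to a meter-based projected SRID. EPSG:3577 (GDA94 / Australian Albers) is an assumed choice that suits these coordinates; any local metric SRID would do.

```sql
-- Option A: cast to geography; ST_DWithin then takes the radius in meters
SELECT wz.id, i.incident_id
FROM watchzones wz, incident i
WHERE ST_DWithin(i.incident_span::geography, wz.position::geography, wz.radius);

-- Option B: transform to a projected SRID measured in meters
-- (3577 = GDA94 / Australian Albers, an assumption for this region)
SELECT wz.id, i.incident_id
FROM watchzones wz, incident i
WHERE ST_DWithin(ST_Transform(i.incident_span, 3577),
                 ST_Transform(wz.position, 3577),
                 wz.radius);
```

Note that a cast or transform inside the WHERE clause cannot use the plain geometry GiST index; to keep index support you would need a functional index (e.g., on `ST_Transform(incident_span, 3577)`) or a stored column in the target SRID.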

2. Using the geography data type:

Here, with 1 incident and 500,000 watch zones, the query took about 29.987 s. Note that I tried both GiST and BRIN indexes and also ran VACUUM ANALYZE on the tables.

CREATE TABLE watchzones_geog 
         ( 
            id SERIAL PRIMARY KEY, 
            date_created TIMESTAMP with time zone DEFAULT now(), 
            latitude NUMERIC(10, 7) DEFAULT NULL, 
            longitude NUMERIC(10, 7) DEFAULT NULL, 
            radius INTEGER, 
            position geography(point) 
         );


CREATE INDEX watchzones_geog_gix ON watchzones_geog USING GIST (position);

insert into watchzones_geog
SELECT generate_series(1,500000) AS id, now(),-29.21073,152.93322,'50',ST_GeogFromText('POINT(152.93322 -29.21073)');

 CREATE TABLE incident_geog (
    incident_id    SERIAL PRIMARY KEY,
    incident_name   VARCHAR(20),
    incident_span      GEOGRAPHY(POLYGON)
);

CREATE INDEX incident_geog_gix ON incident_geog USING GIST (incident_span);

Insert into incident_geog values (1,'test', ST_GeogFromText
('POLYGON((152.945470916 -29.212227933,152.942130026 -29.213431145,152.939345911 -29.2125423759999,152.935144791 -29.21454003,152.933185494 -29.2135838469999,152.929481762 -29.216065516,152.929698621 -29.217402937,152.927245999 
-29.219576,152.921539 -29.217676,152.918487996 -29.2113786959999,152.919254355 -29.206029929,152.919692387 -29.2027824419999,152.936020197 -29.207567346,152.944901258 -29.207729953,152.945470916 
-29.212227933))'));

explain analyze SELECT i.incident_id, 
       wz.id 
FROM   watchzones_geog wz, 
       incident_geog i 
WHERE  ST_DWithin(wz.position, i.incident_span, wz.radius);

Nested Loop  (cost=0.27..348717.00 rows=17 width=8) (actual time=0.277..18551.844 rows=500000 loops=1)
  ->  Seq Scan on watchzones_geog wz  (cost=0.00..11173.00 rows=500000 width=40) (actual time=0.102..50.052 rows=500000 loops=1)
  ->  Index Scan using incident_geog_gix on incident_geog i  (cost=0.27..0.67 rows=1 width=711) (actual time=0.036..0.036 rows=1 loops=500000)
        Index Cond: (incident_span && _st_expand(wz."position", (wz.radius)::double precision))
        Filter: ((wz."position" && _st_expand(incident_span, (wz.radius)::double precision)) AND _st_dwithin(wz."position", incident_span, (wz.radius)::double precision, true))
Planning time: 0.155 ms
Execution time: 18587.041 ms

3. I also tried creating a circle with ST_Buffer(position, radius, 'quad_segs=8') and then using ST_Intersects. With this approach, both the geometry and geography types take more than a minute.
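For reference, the buffer-based variant presumably looked something like the sketch below (the exact query was not given in the post; table names follow the geography setup above, since ST_Buffer on geography takes the radius in meters):

```sql
-- materialize an explicit circle around each point, then test for intersection;
-- buffering 500,000 points per query is what makes this approach slow
SELECT wz.id, i.incident_id
FROM watchzones_geog wz, incident_geog i
WHERE ST_Intersects(ST_Buffer(wz.position, wz.radius, 'quad_segs=8'),
                    i.incident_span);
```

Unlike ST_DWithin, this form gives the planner no index-friendly distance predicate, so every circle has to be built and tested.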

It would be great if someone could suggest a better approach or an optimization to speed up execution.

Thanks

1 Answer:

Answer 0 (score: 0)

The query is fine, but your sample data is wrong. First, note that a query optimized for 1 polygon may differ from one optimized for thousands of polygons.

The main issue is with the sample points. As written, you have 500,000 points at exactly the same location, so depending on the polygon being intersected, the query will return either 0 or 500,000 results. PostGIS first uses the index to intersect points and polygons by their bounding boxes, and then refines the result by computing the true distance. With your sample, it has to compute that distance 500,000 times, which is slow.

Using a point layer with random locations (within 1 degree), the query takes under a second, since it only has to compute the distance for 20 locations.

INSERT INTO watchzones_geog
SELECT generate_series(1,500000) AS id, now(),0,0,'50',
       ST_makePoint(152.93322+random(),-29.21073+random())::geography;


explain analyze SELECT i.incident_id, 
       wz.id 
FROM   watchzones_geog wz, 
       incident_geog i 
WHERE  St_dwithin(position, incident_span, radius); 
Nested Loop  (cost=0.00..272424.01 rows=1 width=8) (actual time=25.956..921.846 rows=20 loops=1)
  Join Filter: ((wz."position" && _st_expand(i.incident_span, (wz.radius)::double precision)) AND (i.incident_span && _st_expand(wz."position", (wz.radius)::double precision)) AND _st_dwithin(wz."position", i.incident_span, (wz.radius)::double precision, true))
  Rows Removed by Join Filter: 499980
  ->  Seq Scan on incident_geog i  (cost=0.00..1.01 rows=1 width=36) (actual time=0.009..0.009 rows=1 loops=1)
  ->  Seq Scan on watchzones_geog wz  (cost=0.00..11173.00 rows=500000 width=40) (actual time=0.006..65.625 rows=500000 loops=1)
Planning time: 1.887 ms
Execution time: 921.895 ms
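Along the same lines, the "1 polygon vs. thousands" caveat can be checked by spreading out the incidents as well. A hypothetical setup (not from the answer) using small square boxes as stand-ins for real incident polygons:

```sql
-- hypothetical benchmark data: 10,000 small square "incidents" at random
-- locations, so the planner is exercised against thousands of polygons
INSERT INTO incident_geog (incident_name, incident_span)
SELECT 'test_' || g,
       ST_SetSRID(
         ST_Expand(ST_MakePoint(152.0 + random(), -29.5 + random()), 0.01),
         4326)::geography
FROM generate_series(1, 10000) AS g;
```

With both sides spread out, the bounding-box prefilter discards most pairs cheaply and the exact distance is computed only for genuine candidates.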