Is there a way to better index geo queries with PostGIS to improve query performance?

Time: 2019-12-02 13:58:30

Tags: postgresql postgis

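I have a fairly large table (2 million records) named places. I use two numeric(9,6) columns named latitude and longitude to store the geographic location.

Now I regularly have to ask: "How many places are within x kilometres (radius) of a given point?"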

I do this with a query like this:

SELECT COUNT(*) AS active_count 
FROM de."places" 
WHERE "places"."state" = 'active'
AND (extensions.ST_DWithin( extensions.ST_GeographyFromText( 'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')' ), extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'), 15000 ));

My index looks like this:

CREATE INDEX index_places_location
    ON de.places USING gist
    (extensions.st_geographyfromtext(((('SRID=4326;POINT('::text || longitude) || ' '::text) || latitude) || ')'::text))
    TABLESPACE pg_default
    WHERE state::text = 'active'::text
;

My hardware is quite powerful (64 cores, 192 GB RAM, 8x enterprise SSDs in a hardware RAID array, etc.). Now, if I run an explain on the query, I get something like this:

"Finalize Aggregate  (cost=512320.91..512320.92 rows=1 width=8) (actual time=1677.327..1677.327 rows=1 loops=1)"
"  ->  Gather  (cost=512320.28..512320.89 rows=6 width=8) (actual time=1675.946..1732.657 rows=7 loops=1)"
"        Workers Planned: 6"
"        Workers Launched: 6"
"        ->  Partial Aggregate  (cost=511320.28..511320.29 rows=1 width=8) (actual time=1655.383..1655.384 rows=1 loops=7)"
"              ->  Parallel Bitmap Heap Scan on places  (cost=125298.79..511310.07 rows=4085 width=0) (actual time=1506.195..1655.008 rows=3781 loops=7)"
"                    Recheck Cond: ((extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography) AND ((state)::text = 'active'::text))"
"                    Filter: (('0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography OPERATOR(extensions.&&) extensions._st_expand(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '15000'::double precision)) AND extensions._st_dwithin(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography, '15000'::double precision, true))"
"                    Rows Removed by Filter: 1380"
"                    Heap Blocks: exact=12774"
"                    ->  Bitmap Index Scan on index_places_location  (cost=0.00..125292.67 rows=367634 width=0) (actual time=1501.179..1501.179 rows=89886 loops=1)"
"                          Index Cond: (extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography)"
"Planning Time: 0.786 ms"
"Execution Time: 1732.762 ms"

I don't know about you, but I somehow expected this to be faster. Is there some super-smart PostGIS index feature I'm overlooking that I should be using for this instead of what I'm doing at the moment?

PS: PostgreSQL 11 and PostGIS 2.5

Update

SET enable_bitmapscan = off;
explain (analyze, buffers) SELECT COUNT(*) AS active_count 
FROM de."places" WHERE "places"."state" = 'active' 
AND (extensions.ST_DWithin( extensions.ST_GeographyFromText( 'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')' ), extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'), 15000 ));

Output:

That is faster. Why?

1 answer:

Answer 0 (score: 0)

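Your bitmap scan is a bit odd. The index scan found 89886 rows and the filter removed 1380 rows, yet only 3781 were left. I have to conclude that your table is woefully under-vacuumed. I'm not sure which of these numbers are reported across all parallel workers and which for only one worker, but I don't think that is enough to explain the discrepancy.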
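One way to test the vacuum hypothesis is sketched below. It assumes the table is de.places as in the question and that your role is allowed to vacuum it. The first statement only does arithmetic on the plan's numbers. It relies on the assumption that EXPLAIN ANALYZE reports row counts as per-loop averages, which is how I understand it, so the heap-scan figures are multiplied by loops=7.

```sql
-- Scale the per-loop figures of the Parallel Bitmap Heap Scan by loops = 7
-- and put them next to the 89886 rows returned by the Bitmap Index Scan.
SELECT 3781 * 7          AS rows_kept_total,
       1380 * 7          AS rows_removed_total,
       (3781 + 1380) * 7 AS rows_accounted_for,
       89886             AS rows_from_index_scan;

-- Dead-tuple counters and the last (auto)vacuum / (auto)analyze times.
SELECT n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'de'
  AND relname = 'places';

-- Vacuum by hand and refresh the planner statistics.
VACUUM (VERBOSE, ANALYZE) de.places;
```

If rows_accounted_for stays far below rows_from_index_scan, and n_dead_tup is large or the last vacuum is long ago, that would be consistent with the index still pointing at many dead rows. Re-running the EXPLAIN after the VACUUM should show whether the gap shrinks.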

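Did you run the two queries alternately, back and forth, to make sure the result is not just a fluke or caused by caching effects? (Also, follow Laurenz's advice and turn on track_io_timing first if at all possible.)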
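A minimal sketch of such a back-and-forth comparison, reusing the query from the question. It assumes a session that is allowed to change track_io_timing (on PostgreSQL 11 this needs superuser rights; otherwise it can be set in postgresql.conf).

```sql
-- Include I/O timings in the BUFFERS output.
SET track_io_timing = on;

-- Run 1: default planner settings, bitmap scan allowed.
RESET enable_bitmapscan;
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
  AND extensions.ST_DWithin(
        extensions.ST_GeographyFromText(
          'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'),
        extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
        15000);

-- Run 2: same query with bitmap scans disabled.
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
  AND extensions.ST_DWithin(
        extensions.ST_GeographyFromText(
          'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'),
        extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
        15000);

-- Restore the default before the next pair of runs.
RESET enable_bitmapscan;
```

Repeating the pair several times in the same session, and comparing the shared hit/read counts and I/O timings of each run, should show whether the difference is stable or only reflects which run happened to find its pages in cache.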