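I have a fairly large table (2 million records) called places. I use two numeric(9,6) columns named latitude and longitude to store the geolocation.
I often need to ask: "How many places are within x km (radius) of a given point?"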
I use a query like this to do that:
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
AND (extensions.ST_DWithin( extensions.ST_GeographyFromText( 'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')' ), extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'), 15000 ))
My index looks like this:
CREATE INDEX index_places_location
ON de.places USING gist
(extensions.st_geographyfromtext(((('SRID=4326;POINT('::text || longitude) || ' '::text) || latitude) || ')'::text))
TABLESPACE pg_default WHERE state::text = 'active'::text
;
My hardware is quite powerful (64 cores, 192 GB RAM, 8x enterprise SSDs in a hardware RAID array, etc.). Now, if I run an EXPLAIN, I get something like this:
"Finalize Aggregate (cost=512320.91..512320.92 rows=1 width=8) (actual time=1677.327..1677.327 rows=1 loops=1)"
" -> Gather (cost=512320.28..512320.89 rows=6 width=8) (actual time=1675.946..1732.657 rows=7 loops=1)"
" Workers Planned: 6"
" Workers Launched: 6"
" -> Partial Aggregate (cost=511320.28..511320.29 rows=1 width=8) (actual time=1655.383..1655.384 rows=1 loops=7)"
" -> Parallel Bitmap Heap Scan on places (cost=125298.79..511310.07 rows=4085 width=0) (actual time=1506.195..1655.008 rows=3781 loops=7)"
" Recheck Cond: ((extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography) AND ((state)::text = 'active'::text))"
" Filter: (('0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography OPERATOR(extensions.&&) extensions._st_expand(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '15000'::double precision)) AND extensions._st_dwithin(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography, '15000'::double precision, true))"
" Rows Removed by Filter: 1380"
" Heap Blocks: exact=12774"
" -> Bitmap Index Scan on index_places_location (cost=0.00..125292.67 rows=367634 width=0) (actual time=1501.179..1501.179 rows=89886 loops=1)"
" Index Cond: (extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography)"
"Planning Time: 0.786 ms"
"Execution Time: 1732.762 ms"
I don't know about you, but I somehow expected this to be faster. Am I overlooking some super-smart PostGIS index feature that I should be using for this instead of what I am doing at the moment?
PS: PostgreSQL 11 and PostGIS 2.5
Update
SET enable_bitmapscan = off;
explain (analyze, buffers) SELECT COUNT(*) AS active_count
FROM de."places" WHERE "places"."state" = 'active'
AND (extensions.ST_DWithin( extensions.ST_GeographyFromText( 'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')' ), extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'), 15000 ))
Output:
That is faster. Why?
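Answer 0 (score: 0)
Your bitmap scan looks a bit odd. The index scan found 89886 rows and the filter removed 1380, yet only 3781 remain. I have to conclude that your table is badly under-vacuumed. I am not sure which of these numbers are totals across all parallel workers and which are reported for just one worker, but I don't think that would be enough to explain the discrepancy.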
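One way to check the vacuum hypothesis is to look at the table's statistics before vacuuming by hand. The following is only a sketch: it assumes the statistics collector is enabled, that you have the privileges to vacuum de.places, and that the counters in pg_stat_user_tables are reasonably current (they are estimates, not exact counts).

```sql
-- Live vs. dead tuple estimates and the last vacuum times
-- for the table from the question.
SELECT n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
WHERE schemaname = 'de'
  AND relname = 'places';

-- If n_dead_tup is large, or both timestamps are NULL or very old,
-- vacuum manually and then rerun the EXPLAIN (ANALYZE, BUFFERS)
-- from the question to see whether the row counts line up better.
VACUUM (VERBOSE, ANALYZE) de.places;
```

If the gap between the rows found by the index scan and the rows that survive the heap scan shrinks after the vacuum, dead tuples were a likely contributor.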
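Did you run the two queries alternately, several times, to make sure the result is not just a fluke or a caching effect? (Also, follow Laurenz's advice and turn on track_io_timing first, if you can.)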
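A session along the following lines would make the comparison fairer. It is a sketch rather than a prescribed procedure: the query is copied from the question, the number of repetitions is up to you, and on PostgreSQL 11 changing track_io_timing requires superuser rights.

```sql
-- Report time spent on I/O in the BUFFERS output (superuser-only setting).
SET track_io_timing = on;

-- Run 1: planner free to use the bitmap scan.
SET enable_bitmapscan = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
  AND extensions.ST_DWithin(
        extensions.ST_GeographyFromText(
          'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'),
        extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
        15000);

-- Run 2: bitmap scans disabled, same query.
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
  AND extensions.ST_DWithin(
        extensions.ST_GeographyFromText(
          'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'),
        extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
        15000);

-- Repeat runs 1 and 2 a few times in alternation, then restore the default.
RESET enable_bitmapscan;
```

Comparing the shared hit/read counts and the I/O timings across the alternating runs shows whether one plan is really faster or whether the second run simply benefited from pages cached by the first.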