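I'm trying to improve search performance when someone queries for very common terms. I have a database of 5.3 million records with their mailing addresses, and a large share of them contain common words such as "road", "rd", "st", and so on, so searches for those terms take a long time.
As shown below, I tried searching for an uncommon term ("arrowhead"):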
pulsar_dev=# EXPLAIN ANALYZE SELECT
property->>'rollNumber',
property->>'municipalAddress',
property->>'municipalityDescription'
FROM
properties_cmv
WHERE
to_tsvector('simple', property->>'municipalAddress') ||
to_tsvector('simple', property->>'municipalityDescription') ||
to_tsvector('simple', property->>'countyDescription') @@ plainto_tsquery('arrowhead')
ORDER BY ts_rank(to_tsvector('simple', property->>'municipalAddress') ||
to_tsvector('simple', property->>'municipalityDescription') ||
to_tsvector('simple', property->>'countyDescription'), plainto_tsquery('arrowhead')) DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4420.99..4424.11 rows=1248 width=23) (actual time=136.957..137.047 rows=490 loops=1)
Sort Key: (ts_rank(((to_tsvector('simple'::regconfig, (property ->> 'municipalAddress'::text)) || to_tsvector('simple'::regconfig, (property ->> 'municipalityDescription'::text))) || to_tsvector('simple'::regconfig, (property ->> 'countyDescription'::text))), plainto_tsquery('arrowhead'::text)))
Sort Method: quicksort Memory: 93kB
-> Bitmap Heap Scan on properties_cmv (cost=25.69..4356.81 rows=1248 width=23) (actual time=0.350..136.566 rows=490 loops=1)
Recheck Cond: (((to_tsvector('simple'::regconfig, (property ->> 'municipalAddress'::text)) || to_tsvector('simple'::regconfig, (property ->> 'municipalityDescription'::text))) || to_tsvector('simple'::regconfig, (property ->> 'countyDescription'::text))) @@ plainto_tsquery('arrowhead'::text))
Heap Blocks: exact=39
-> Bitmap Index Scan on prop_address_idx (cost=0.00..25.38 rows=1248 width=0) (actual time=0.072..0.072 rows=490 loops=1)
Index Cond: (((to_tsvector('simple'::regconfig, (property ->> 'municipalAddress'::text)) || to_tsvector('simple'::regconfig, (property ->> 'municipalityDescription'::text))) || to_tsvector('simple'::regconfig, (property ->> 'countyDescription'::text))) @@ plainto_tsquery('arrowhead'::text))
Planning time: 0.213 ms
Execution time: 137.184 ms
(10 rows)
That is quite fast, but when I search for "road", it is not:
pulsar_dev=# EXPLAIN ANALYZE SELECT
property->>'rollNumber',
property->>'municipalAddress',
property->>'municipalityDescription'
FROM
properties_cmv
WHERE
to_tsvector('simple', property->>'municipalAddress') ||
to_tsvector('simple', property->>'municipalityDescription') ||
to_tsvector('simple', property->>'countyDescription') @@ plainto_tsquery('road')
ORDER BY ts_rank(to_tsvector('simple', property->>'municipalAddress') ||
to_tsvector('simple', property->>'municipalityDescription') ||
to_tsvector('simple', property->>'countyDescription'), plainto_tsquery('road')) DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=25533.10..25560.73 rows=11051 width=23) (actual time=11065.051..11066.883 rows=10356 loops=1)
Sort Key: (ts_rank(((to_tsvector('simple'::regconfig, (property ->> 'municipalAddress'::text)) || to_tsvector('simple'::regconfig, (property ->> 'municipalityDescription'::text))) || to_tsvector('simple'::regconfig, (property ->> 'countyDescription'::text))), plainto_tsquery('road'::text)))
Sort Method: quicksort Memory: 1841kB
-> Bitmap Heap Scan on properties_cmv (cost=117.67..24790.93 rows=11051 width=23) (actual time=1.911..11052.683 rows=10356 loops=1)
Recheck Cond: (((to_tsvector('simple'::regconfig, (property ->> 'municipalAddress'::text)) || to_tsvector('simple'::regconfig, (property ->> 'municipalityDescription'::text))) || to_tsvector('simple'::regconfig, (property ->> 'countyDescription'::text))) @@ plainto_tsquery('road'::text))
Heap Blocks: exact=1408
-> Bitmap Index Scan on prop_address_idx (cost=0.00..114.91 rows=11051 width=0) (actual time=1.432..1.432 rows=10356 loops=1)
Index Cond: (((to_tsvector('simple'::regconfig, (property ->> 'municipalAddress'::text)) || to_tsvector('simple'::regconfig, (property ->> 'municipalityDescription'::text))) || to_tsvector('simple'::regconfig, (property ->> 'countyDescription'::text))) @@ plainto_tsquery('road'::text))
Planning time: 0.210 ms
Execution time: 11069.142 ms
(10 rows)
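How can I improve the performance of the second query? I also need to rank the results so that the most relevant ones are returned first.
A similar test on Elasticsearch returns in milliseconds.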
Answer 0 (score: 3)
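I created a new table that stores the concatenated tsvector in its own column, with an index on that column, and it seems to improve the speed.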
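A minimal sketch of that approach is below. The table name `properties_search`, the column names, and the index name are hypothetical. Only `properties_cmv`, the `property` column, and its JSON keys come from the question. The sketch assumes the index is a GIN index and adds `coalesce` so that a missing key does not make the whole concatenated vector NULL. It also passes the `'simple'` configuration to `plainto_tsquery` explicitly, whereas the original query relied on the default configuration.

```sql
-- Hypothetical side table: the tsvector is computed once, at load time,
-- instead of being rebuilt from the JSON for every matching row.
CREATE TABLE properties_search AS
SELECT
    property->>'rollNumber'              AS roll_number,
    property->>'municipalAddress'        AS municipal_address,
    property->>'municipalityDescription' AS municipality_description,
    to_tsvector('simple', coalesce(property->>'municipalAddress', '')) ||
    to_tsvector('simple', coalesce(property->>'municipalityDescription', '')) ||
    to_tsvector('simple', coalesce(property->>'countyDescription', '')) AS search_vector
FROM properties_cmv;

-- Index the stored column directly (assumed GIN; the answer does not say which type).
CREATE INDEX properties_search_vector_idx
    ON properties_search USING gin (search_vector);

-- Refresh planner statistics for the new table.
ANALYZE properties_search;

-- Same search as in the question, now reading the precomputed vector
-- for both the match and the ranking.
SELECT roll_number, municipal_address, municipality_description
FROM properties_search
WHERE search_vector @@ plainto_tsquery('simple', 'road')
ORDER BY ts_rank(search_vector, plainto_tsquery('simple', 'road')) DESC;
```

In the slow plan, almost all of the time is spent in the Bitmap Heap Scan rather than in the index scan, so avoiding the per-row `to_tsvector` calls during the recheck and in `ts_rank` is a plausible reason for the speedup. Because the table is a copy, it would need to be rebuilt or kept in sync whenever `properties_cmv` changes.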