Improve my psql query performance

Asked: 2015-09-18 10:54:05

Tags: postgresql optimization full-text-search

I have a psql query that performs oddly. I have defined a GIN index on the 2 columns I now use for searching:

Indexes:
"pk_products" PRIMARY KEY, btree (id)
"fk_affiliate_affiliate_product_id" UNIQUE, btree (affiliate_id, affiliate_product_id)
"idx_products" btree (merchant_id)
"idx_products_affiliates" btree (affiliate_id)
"idx_products_brand_id" btree (brand_id)
"idx_products_ts" gin (to_tsvector('english'::regconfig, (COALESCE(title, ''::character varying)::text || ' '::text) || COALESCE(description, ''::text)))

If I search for a short word, say 4 characters, I get a fast query:

EXPLAIN ANALYZE SELECT p.id, p.price, p.currency, p.images, p.merchant_id
FROM products AS p
WHERE deleted=false AND to_tsvector('english', p.title || coalesce(p.description, '')) @@ to_tsquery('blue:*')
LIMIT 30 OFFSET 0;

Result:

QUERY PLAN

 Limit  (cost=0.00..219.47 rows=30 width=49) (actual time=3.138..40.914 rows=30 loops=1)
   ->  Seq Scan on products p  (cost=0.00..41120.86 rows=5621 width=49) (actual time=2.740..40.478 rows=30 loops=1)
         Filter: ((NOT deleted) AND (to_tsvector('english'::regconfig, ((title)::text || COALESCE(description, ''::text))) @@ to_tsquery('blue:*'::text)))
         Rows Removed by Filter: 153
 Total runtime: 40.986 ms
(5 rows)

If I use a longer word:

EXPLAIN ANALYZE SELECT p.id, p.price, p.currency, p.images, p.merchant_id
FROM products AS p
WHERE deleted=false AND to_tsvector('english', p.title || coalesce(p.description, '')) @@ to_tsquery('turquoise:*')
LIMIT 30 OFFSET 0;

The time increases:

QUERY PLAN

 Limit  (cost=0.00..219.47 rows=30 width=49) (actual time=1.097..1579.187 rows=30 loops=1)
   ->  Seq Scan on products p  (cost=0.00..41120.86 rows=5621 width=49) (actual time=1.093..1579.129 rows=30 loops=1)
         Filter: ((NOT deleted) AND (to_tsvector('english'::regconfig, ((title)::text || COALESCE(description, ''::text))) @@ to_tsquery('turquoise:*'::text)))
         Rows Removed by Filter: 12697
 Total runtime: 1579.287 ms
(5 rows)

If I use a "-" in the word, the time to get a result is huge:

EXPLAIN ANALYZE SELECT p.id, p.price, p.currency, p.images, p.merchant_id
FROM products AS p
WHERE deleted=false AND to_tsvector('english', p.title || coalesce(p.description, '')) @@ to_tsquery('turquoise-blue:*')
LIMIT 30 OFFSET 0;

Result:

QUERY PLAN

 Limit  (cost=0.00..41120.86 rows=2 width=49) (actual time=31400.164..31400.164 rows=0 loops=1)
   ->  Seq Scan on products p  (cost=0.00..41120.86 rows=2 width=49) (actual time=31400.158..31400.158 rows=0 loops=1)
         Filter: ((NOT deleted) AND (to_tsvector('english'::regconfig, ((title)::text || COALESCE(description, ''::text))) @@ to_tsquery('turquoise-blue:*'::text)))
         Rows Removed by Filter: 281510
 Total runtime: 31400.247 ms
(5 rows)
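
To see why the hyphenated term behaves differently, it can help to inspect how the text search parser breaks it up. This is a minimal sketch, assuming the 'english' configuration that the index uses; the exact output depends on the PostgreSQL version, so none is shown here.

```sql
-- Show the tsquery that the hyphenated prefix search expands to.
SELECT to_tsquery('english', 'turquoise-blue:*');

-- List the tokens the parser produces for the hyphenated word
-- and the lexemes each token is normalized to.
SELECT alias, token, lexemes
FROM ts_debug('english', 'turquoise-blue');
```

A hyphenated word is typically split into the compound plus its parts, all of which must match. In the plan above no row qualifies, so the sequential scan cannot stop early at LIMIT 30 and reads the whole table.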

Any ideas are much appreciated! Thanks!

Edit:

I think it is related to the number of results: do queries with no results take longer?
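
One way to check this hunch is to divide each scan's time by the number of rows it actually examined (rows returned plus "Rows Removed by Filter"). The sketch below only does that arithmetic on the figures from the three plans above; the column aliases are made up for illustration.

```sql
-- Time per examined row for each sequential scan, using the
-- "actual time" and "Rows Removed by Filter" values from the plans.
SELECT
    round(40.478    / (30 + 153),   3) AS ms_per_row_blue,
    round(1579.129  / (30 + 12697), 3) AS ms_per_row_turquoise,
    round(31400.158 / (0 + 281510), 3) AS ms_per_row_turquoise_blue;
```

The per-row cost lands in the same range for all three, roughly 0.1 to 0.2 ms. That suggests the runtime tracks how many rows the scan must read before LIMIT 30 is satisfied, not the length of the search word.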

1 answer:

Answer 0: (score: 1)

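Found the problem. The query builds its tsvector from p.title || coalesce(p.description, ''), which is not the expression the GIN index was defined on, so the planner could not use idx_products_ts and fell back to a Seq Scan.

I changed the index like this, so that it can match the expression used in the query: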

"idx_products_ts" gin (to_tsvector('english'::regconfig, title::text || description))

Now the queries are really fast!
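
For reference, here is a sketch of the DDL that would produce the index shown above. It assumes the table is named products, as in the queries, and that title is a character varying column; the ANALYZE step is a suggestion, not something the original post mentions. One side effect of dropping COALESCE is worth noting: when description is NULL the whole concatenation is NULL, so such rows will not match any search.

```sql
-- Replace the old expression index with one whose expression is the
-- same as the one used in the query's WHERE clause.
DROP INDEX idx_products_ts;

CREATE INDEX idx_products_ts
    ON products
    USING gin (to_tsvector('english', title::text || description));

-- Refresh planner statistics after the change.
ANALYZE products;
```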

EXPLAIN ANALYZE SELECT p.id, p.price, p.currency, p.images, p.merchant_id
FROM products AS p
WHERE deleted=false AND to_tsvector('english', p.title || p.description) @@ to_tsquery('blue:*')
LIMIT 30 OFFSET 0;

QUERY PLAN

 Limit  (cost=103.64..183.47 rows=30 width=49) (actual time=15.588..15.644 rows=30 loops=1)
   ->  Bitmap Heap Scan on products p  (cost=103.64..15084.42 rows=5630 width=49) (actual time=15.586..15.633 rows=30 loops=1)
         Recheck Cond: (to_tsvector('english'::regconfig, ((title)::text || description)) @@ to_tsquery('blue:*'::text))
         Filter: (NOT deleted)
         ->  Bitmap Index Scan on idx_products_ts  (cost=0.00..102.23 rows=5630 width=0) (actual time=12.955..12.955 rows=26747 loops=1)
               Index Cond: (to_tsvector('english'::regconfig, ((title)::text || description)) @@ to_tsquery('blue:*'::text))
 Total runtime: 15.714 ms
(7 rows)

EXPLAIN ANALYZE SELECT p.id, p.price, p.currency, p.images, p.merchant_id
FROM products AS p
WHERE deleted=false AND to_tsvector('english', p.title || p.description) @@ to_tsquery('turquoise-blue:*')
LIMIT 30 OFFSET 0;

QUERY PLAN

 Limit  (cost=108.02..116.02 rows=2 width=49) (actual time=26.234..26.234 rows=0 loops=1)
   ->  Bitmap Heap Scan on products p  (cost=108.02..116.02 rows=2 width=49) (actual time=26.226..26.226 rows=0 loops=1)
         Recheck Cond: (to_tsvector('english'::regconfig, ((title)::text || description)) @@ to_tsquery('turquoise-blue:*'::text))
         Filter: (NOT deleted)
         ->  Bitmap Index Scan on idx_products_ts  (cost=0.00..108.02 rows=2 width=0) (actual time=26.209..26.209 rows=0 loops=1)
               Index Cond: (to_tsvector('english'::regconfig, ((title)::text || description)) @@ to_tsquery('turquoise-blue:*'::text))
 Total runtime: 26.433 ms
(7 rows)
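
An alternative that would keep the original index is to change the query instead, so that its to_tsvector expression is the same as the one the index was first defined on, including both COALESCE calls and the ' ' separator. This is a sketch, not what the answer did, and it assumes title is a character varying column, as the original index definition suggests.

```sql
EXPLAIN ANALYZE
SELECT p.id, p.price, p.currency, p.images, p.merchant_id
FROM products AS p
WHERE p.deleted = false
  -- Must match the expression idx_products_ts was originally built on.
  AND to_tsvector('english',
                  coalesce(p.title, '') || ' ' || coalesce(p.description, ''))
      @@ to_tsquery('english', 'turquoise-blue:*')
LIMIT 30 OFFSET 0;
```

If the plan shows a Bitmap Index Scan on idx_products_ts, the expressions match; a Seq Scan means they still differ. This form also keeps rows whose description is NULL searchable.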

A huge improvement in total runtime: from 31400.247 ms down to 26.433 ms.