I have a table with nearly 6 billion records in a PostgreSQL database. The table's email_id column is defined as
character varying(64)
I am trying to optimize searches on this column. For example, the query:
select count(1) from my_table where email_id = 'some@email.com';
takes about 190 seconds to complete and return a result. I tried creating an index on the column, like this:
CREATE INDEX my_table_idx_email_id
ON my_table
USING btree
(email_id);
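But it made no noticeable difference at all. I also analyzed the query with EXPLAIN ANALYZE and confirmed that the time is spent filtering on the email column.
Sample EXPLAIN ANALYZE output: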
explain analyze select count(1) from my_table where email_id = 'test@unknown.email';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5211284.25..5211284.26 rows=1 width=0) (actual time=225424.749..225424.749 rows=1 loops=1)
   ->  Seq Scan on my_table  (cost=0.00..5211235.72 rows=19410 width=0) (actual time=225424.744..225424.744 rows=0 loops=1)
         Filter: ((email_id)::text = 'test@unknown.email'::text)
 Total runtime: 225426.646 ms
EXPLAIN ANALYZE output after setting enable_seqscan = off:
SET enable_seqscan = off;
explain analyze select count(1) from my_table where email_id = 'test@unknown.email';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=10005215244.40..10005215244.41 rows=1 width=0) (actual time=282110.404..282110.405 rows=1 loops=1)
   ->  Seq Scan on my_table  (cost=10000000000.00..10005215195.84 rows=19425 width=0) (actual time=282110.393..282110.393 rows=0 loops=1)
         Filter: ((email_id)::text = 'test@unknown.email'::text)
 Total runtime: 282113.296 ms
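Since the plan is still a Seq Scan even with enable_seqscan = off, it seems the planner has no usable index path at all. One check that may help narrow this down is whether the index really exists on this table and is marked valid. This is a minimal sketch, not a confirmed diagnosis. It assumes the table is reachable as my_table on the current search_path. pg_indexes, pg_index and pg_class are standard PostgreSQL system catalogs.

```sql
-- List every index PostgreSQL has recorded for the table, with its definition.
-- Assumption: the table name is unique enough that filtering on tablename alone is fine.
SELECT schemaname, indexname, indexdef
FROM pg_indexes
WHERE tablename = 'my_table';

-- Show whether each index on the table is valid and ready.
-- An index with indisvalid = false is not used for queries.
SELECT c.relname AS index_name,
       i.indisvalid,
       i.indisready
FROM pg_index AS i
JOIN pg_class AS c ON c.oid = i.indexrelid
WHERE i.indrelid = 'my_table'::regclass;
```

If my_table_idx_email_id is missing from the first result, or shows indisvalid = false in the second, that could explain why the index is never chosen.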