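I have a fairly simple database that I use to build JSON documents for Elasticsearch. I am the only one connecting to the database, and I use it only for this task. Even so, querying some of the tables is extremely slow, so slow that building the Elasticsearch index I am working on would take months. I have gone through all the articles on tuning PostgreSQL and so on, and nothing I have changed has fixed the problem. As you can see in the EXPLAIN ANALYZE section, the query I am running is a very simple one, with no joins or anything else, and it is still extremely slow. I just want to figure out what I can do to speed this up, because I see no reason for it to be this slow.
Below is all the relevant information I can think of; I can add more if needed.
System specs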
[root@pgdb ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         61444      39416      22027        458          8      38531
-/+ buffers/cache:        876      60568
Swap:            0          0          0
[root@pgdb ~]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
All tables
cotitsk=# \d+
                          List of relations
 Schema |       Name       | Type  |  Owner   |  Size   | Description
--------+------------------+-------+----------+---------+-------------
 public | company          | table | postgres | 1838 MB |
 public | company_industry | table | postgres | 1621 MB |
 public | company_title    | table | postgres | 3837 MB |
 public | industry         | table | postgres | 224 kB  |
 public | industry_skill   | table | postgres | 446 MB  |
 public | industry_title   | table | postgres | 1229 MB |
 public | interest         | table | postgres | 344 MB  |
 public | skill            | table | postgres | 438 MB  |
 public | skill_skill      | table | postgres | 21 GB   |
 public | title            | table | postgres | 1841 MB |
 public | title_interest   | table | postgres | 2799 MB |
 public | title_skill      | table | postgres | 27 GB   |
 public | title_title      | table | postgres | 11 GB   |
(13 rows)
Schema of the table being queried
cotitsk=# \d+ skill_skill
                      Table "public.skill_skill"
  Column   |  Type  | Modifiers | Storage | Stats target | Description
-----------+--------+-----------+---------+--------------+-------------
 skill1_id | bigint | not null  | plain   |              |
 skill2_id | bigint | not null  | plain   |              |
 count     | bigint | not null  | plain   |              |
Foreign-key constraints:
    "skill_skill_skill1_fk" FOREIGN KEY (skill1_id) REFERENCES skill(skill_id)
    "skill_skill_skill2_fk" FOREIGN KEY (skill2_id) REFERENCES skill(skill_id)
Approximate row count of the table being queried
cotitsk=# SELECT reltuples::bigint AS estimate FROM pg_class where relname='skill_skill';
 estimate
-----------
 435104320
(1 row)
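The `\d+ skill_skill` output lists only foreign-key constraints and has no `Indexes:` section, which suggests the table has no indexes at all (in PostgreSQL a foreign key does not create an index on the referencing column). A minimal check, assuming the built-in `pg_indexes` catalog view and the `public` schema shown in the table list:

```sql
-- List every index defined on skill_skill.
-- pg_indexes is a built-in catalog view; zero rows back would mean
-- any filter on skill1_id has to read the whole table.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename = 'skill_skill';
```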
EXPLAIN ANALYZE
cotitsk=# explain analyze select * from skill_skill where skill1_id = '2701941' order by count desc limit 1000;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=8228458.83..8228461.33 rows=1000 width=24) (actual time=37163.205..37163.600 rows=1000 loops=1)
   ->  Sort  (cost=8228458.83..8229292.78 rows=333580 width=24) (actual time=37163.203..37163.394 rows=1000 loops=1)
         Sort Key: count DESC
         Sort Method: top-N heapsort  Memory: 127kB
         ->  Seq Scan on skill_skill  (cost=0.00..8210169.00 rows=333580 width=24) (actual time=14021.081..37128.913 rows=210902 loops=1)
               Filter: (skill1_id = '2701941'::bigint)
               Rows Removed by Filter: 434893298
 Planning time: 0.062 ms
 Execution time: 37163.747 ms
(9 rows)
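The plan shows a Seq Scan that reads the whole table and throws away 434,893,298 rows to keep 210,902, so almost all of the 37 seconds goes into the scan and very little into the sort. One thing that may be worth trying, assuming no index on `skill1_id` exists yet, is a composite index covering both the filter and the sort order. This is only a sketch: the index name `skill_skill_skill1_count_idx` is made up for illustration, and building it on a 21 GB table will take a while.

```sql
-- Hypothetical index name; the columns come from the skill_skill schema.
-- skill1_id serves the WHERE clause, and "count" DESC matches the ORDER BY,
-- so entries for one skill1_id are already stored in the requested order.
-- CONCURRENTLY avoids blocking writes during the build; it cannot be run
-- inside a transaction block.
CREATE INDEX CONCURRENTLY skill_skill_skill1_count_idx
    ON skill_skill (skill1_id, "count" DESC);

-- Refresh planner statistics, then re-check the plan for the same query.
ANALYZE skill_skill;

EXPLAIN ANALYZE
SELECT *
FROM skill_skill
WHERE skill1_id = 2701941
ORDER BY "count" DESC
LIMIT 1000;
```

If the planner picks the index, the new plan should show an index scan on it in place of the Seq Scan and the Sort node. Whether it does depends on the planner's cost estimates, so the output is worth comparing against the plan above.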
postgresql.conf
max_connections = 1000
shared_buffers = 16GB
work_mem = 100MB
maintenance_work_mem = 1GB
dynamic_shared_memory_type = posix
wal_buffers = 16MB
effective_cache_size = 48GB
logging_collector = on
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_timezone = 'UTC'
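Since none of the tuning changes made a difference, it may be worth confirming that the running server actually picked them up: edits to postgresql.conf need at least a reload, and `shared_buffers` and `max_connections` only change after a full restart. A small check, assuming access to the built-in `pg_settings` view:

```sql
-- Show the values the running server is using for the tuned parameters.
-- "setting" is expressed in the units given by "unit"
-- (for example 8kB pages for shared_buffers).
-- "source" tells where the value came from (configuration file, default, ...).
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem',
               'effective_cache_size', 'max_connections');
```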