PostgreSQL autovacuum on partitioned tables

Asked: 2018-10-09 17:40:26

Tags: postgresql partition autovacuum

PostgreSQL 9.5.2 on AWS RDS

select name,setting from pg_settings 
where name like '%vacuum%' 
order by name;
                name                 |  setting
-------------------------------------+-----------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.05
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 450000000
 autovacuum_max_workers              | 3
 autovacuum_multixact_freeze_max_age | 400000000
 autovacuum_naptime                  | 30
 autovacuum_vacuum_cost_delay        | 20
 autovacuum_vacuum_cost_limit        | -1
 autovacuum_vacuum_scale_factor      | 0.1
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 log_autovacuum_min_duration         | 0
 rds.force_autovacuum_logging_level  | log
 vacuum_cost_delay                   | 0
 vacuum_cost_limit                   | 300
 vacuum_cost_page_dirty              | 20
 vacuum_cost_page_hit                | 1
 vacuum_cost_page_miss               | 10
 vacuum_defer_cleanup_age            | 0
 vacuum_freeze_min_age               | 50000000
 vacuum_freeze_table_age             | 250000000
 vacuum_multixact_freeze_min_age     | 5000000
 vacuum_multixact_freeze_table_age   | 150000000

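I have been trying to work out how autovacuum is behaving in two Postgres databases. The databases are identical in size, parameters, and structure. (They are two data warehouses for the same application, in different locations and with different data patterns.)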

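We use partitioning for some very large tables. I have noticed that the older (static) partitions are regularly autovacuumed. I understand that XIDs get frozen, but that the relation still has to be vacuumed periodically to look for any new XIDs.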

I have been using this query to find relations that will need vacuuming to avoid XID wraparound:

SELECT 'Relation Name',age(c.relfrozenxid) c_age, age(t.relfrozenxid) t_age,
       greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm')
order by age desc limit 5;

   ?column?    |   c_age   |   t_age   |    age
---------------+-----------+-----------+-----------
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 |           | 461544753
 Relation Name | 461544753 | 310800517 | 461544753

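All of the relations listed are old, stable partitions. The relfrozenxid column is defined as: "All transaction IDs before this one have been replaced with a permanent ("frozen") transaction ID in this table. This is used to track whether the table needs to be vacuumed in order to prevent transaction ID wraparound or to allow pg_clog to be shrunk."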
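As a rough sketch, the ages above can be put side by side with the autovacuum_freeze_max_age setting, which is the point at which autovacuum is forced on a table to prevent wraparound. The column aliases are illustrative, and the query assumes no per-table storage parameters override the global setting:

```sql
-- Sketch: compare each relation's XID age with autovacuum_freeze_max_age.
-- Assumption: per-table overrides (reloptions) are not taken into account.
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age,
       age(c.relfrozenxid) > current_setting('autovacuum_freeze_max_age')::int
           AS past_freeze_max_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm')
ORDER BY xid_age DESC
LIMIT 5;
```

With the settings shown at the top, the ages listed above (461544753) are already past autovacuum_freeze_max_age (450000000).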

Out of curiosity, I looked at relfrozenxid for all the partitions of one particular table:

SELECT c.oid::regclass as table_name,age(c.relfrozenxid) as age , c.reltuples::int, n_live_tup, n_dead_tup,
         date_trunc('day',last_autovacuum)
FROM pg_class c
JOIN pg_stat_user_tables u on c.relname = u.relname
WHERE c.relkind IN ('r', 'm')
and  c.relname like 'tablename%'

      table_name                     |    age    | reltuples | n_live_tup | n_dead_tup |       date_trunc
-------------------------------------+-----------+-----------+------------+------------+------------------------
 schema_partition.tablename_201202   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201306   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201204   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201110   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201111   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201112   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201201   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201203   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201109   | 460250527 |         0 |          0 |          0 | 2018-09-23 00:00:00+00
 schema_partition.tablename_201801   | 435086084 |  37970232 |   37970230 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201307   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201107   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201312   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201311   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201401   | 433975635 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201310   | 423675180 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201704   | 423222113 |  43842668 |   43842669 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201612   | 423222113 |  65700844 |   65700845 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201705   | 423221655 |  46847336 |   46847338 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201702   | 423171142 |  50701032 |   50701031 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_overflow | 423171142 |       754 |        769 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201106   | 421207271 |         1 |          1 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201309   | 421207271 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201108   | 421207271 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201308   | 421207271 |         0 |          0 |          0 | 2018-09-25 00:00:00+00
 schema_partition.tablename_201806   | 374122782 |  44626756 |   44626757 |          0 | 2018-09-26 00:00:00+00
 schema.tablename                    | 360135561 |         0 |          0 |          0 | 2018-09-27 00:00:00+00

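I am sure I do not fully understand how relfrozenxid works, but it does look as though the partitions are affected by the parent table (which would change the relfrozenxid value of the partitions). I cannot find any documentation on this. I would have expected relfrozenxid to stay the same for a static table until a vacuum occurs.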

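In addition, I have a handful of relations holding static data that apparently have never been autovacuumed (last_autovacuum is null). Could this be the result of a VACUUM FREEZE operation?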
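One way to check this, as a sketch: pg_stat_user_tables records manual and automatic vacuums in separate columns, so a table vacuumed only by hand (including with VACUUM FREEZE) would show last_vacuum set while last_autovacuum stays null. These counters are also cleared whenever statistics are reset, so a null is not proof that no vacuum ever ran.

```sql
-- Sketch: list tables with no recorded autovacuum and show whether a
-- manual VACUUM has been recorded for them instead.
SELECT schemaname,
       relname,
       last_vacuum,        -- set by manual VACUUM, including VACUUM FREEZE
       last_autovacuum,    -- set only by the autovacuum daemon
       vacuum_count,
       autovacuum_count
FROM pg_stat_user_tables
WHERE last_autovacuum IS NULL
ORDER BY schemaname, relname;
```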

I am new to Postgres, and I readily admit that I do not fully understand the autovacuum process.

I am not seeing any performance problems that I can identify.

Edit:

I set up a query to run every 4 hours against one partitioned table:

SELECT c.oid::regclass as table_name,age(c.relfrozenxid) as age , c.reltuples::int, n_live_tup, n_dead_tup,
         date_trunc('day',last_autovacuum)
FROM pg_class c
JOIN pg_stat_user_tables u on c.relname = u.relname
WHERE c.relkind IN ('r', 'm')
and  c.relname like 'sometable%'
order by age desc;

Looking at two different partitions, here is the output from the last 20 hours:

 schemaname.sometable_201812   | 206286536 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 206286537 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 225465100 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 225465162 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 225465342 |         0 |          0 |          0 |
 schemaname.sometable_201812   | 236408374 |         0 |          0 |          0 |
-bash-4.2$  grep 201610 test1.out
 schemaname.sometable_201610   | 449974426 |  31348368 |   31348369 |          0 | 2018-09-22 00:00:00+00
 schemaname.sometable_201610   | 449974427 |  31348368 |   31348369 |          0 | 2018-09-22 00:00:00+00
 schemaname.sometable_201610   | 469152990 |  31348368 |   31348369 |          0 | 2018-09-22 00:00:00+00
 schemaname.sometable_201610   |  50000051 |  31348368 |   31348369 |          0 | 2018-10-10 00:00:00+00
 schemaname.sometable_201610   |  50000231 |  31348368 |   31348369 |          0 | 2018-10-10 00:00:00+00
 schemaname.sometable_201610   |  60943263 |  31348368 |   31348369 |          0 | 2018-10-10 00:00:00+00

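The relfrozenxid of the partitions appears to be modified even though there is no direct DML against the partitions. I assume that inserts into the base table somehow modify the relfrozenxid of the partitions.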
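A quick arithmetic check on the samples above, offered as a sketch rather than a diagnosis: the change between consecutive readings can be computed for both partitions. The `samples` CTE and the reading numbers `n` are made up for illustration; the ages are copied from the output above.

```sql
-- Sketch: difference between consecutive age(relfrozenxid) readings.
WITH samples(part, n, xid_age) AS (
    VALUES
        ('sometable_201812', 1, 206286536),
        ('sometable_201812', 2, 206286537),
        ('sometable_201812', 3, 225465100),
        ('sometable_201812', 4, 225465162),
        ('sometable_201812', 5, 225465342),
        ('sometable_201812', 6, 236408374),
        ('sometable_201610', 1, 449974426),
        ('sometable_201610', 2, 449974427),
        ('sometable_201610', 3, 469152990),
        ('sometable_201610', 4,  50000051),
        ('sometable_201610', 5,  50000231),
        ('sometable_201610', 6,  60943263)
)
SELECT part,
       n,
       xid_age,
       xid_age - lag(xid_age) OVER (PARTITION BY part ORDER BY n) AS delta
FROM samples
ORDER BY part, n;
```

Apart from reading 4 of sometable_201610, where last_autovacuum changes to 2018-10-10, both partitions advance by the same amounts (1, 19178563, 180, 10943032). That is consistent with age() being measured against the cluster-wide current transaction ID, so the reported age grows as transactions are consumed anywhere in the database even if the stored relfrozenxid is untouched. The reading before the reset (469152990) is above autovacuum_freeze_max_age (450000000), and the reading after it (50000051) is close to vacuum_freeze_min_age (50000000).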

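The sometable_201610 partition has 31 million rows but is static. According to the log files, autovacuuming a partition of this type takes 20 to 30 minutes. I do not know whether that counts as a performance problem, but it does look expensive. The autovacuum entries in the log files show that several of these large partitions are typically autovacuumed every night. (Many partitions with zero tuples are autovacuumed too, but those take very little time.)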

0 answers
