I have this pgpool setup: "Host-1" as master and "Host-2" as slave. If "Host-1" goes down, pgpool correctly promotes "Host-2" to master; but when "Host-1" comes back, pgpool does not notice it, and if "Host-2" then fails, pgpool will not promote "Host-1" to master even though "Host-1" is online. I have health_check enabled, but it seems completely useless, because the status of "Host-1" (after it comes back up) is always 3 = "Node is down".
This is the output of "show pool_nodes" during the events:
-> Initial situation: "Host-1" UP (master), "Host-2" UP (slave)
node_id | hostname | port | status | lb_weight | role
---------+----------+------+--------+-----------+--------
0 | Host-1 | 5432 | 2 | nan | master
1 | Host-2 | 5432 | 1 | nan | slave
-> Node 0 goes down: "Host-1" DOWN, "Host-2" UP
node_id | hostname | port | status | lb_weight | role
---------+----------+------+--------+-----------+--------
0 | Host-1 | 5432 | 3 | nan | slave
1 | Host-2 | 5432 | 2 | nan | master
-> Node 0 comes back: "Host-1" UP, "Host-2" UP
node_id | hostname | port | status | lb_weight | role
---------+----------+------+--------+-----------+--------
0 | Host-1 | 5432 | 3 | nan | slave
1 | Host-2 | 5432 | 2 | nan | master
Note that the status of "Host-1" is 3, meaning "Node is down".
-> Node 1 fails: "Host-1" UP, "Host-2" DOWN: at this point I cannot connect to the db, even though node 0 is up and running!
What do I need to do to allow pgpool to promote node 0 to master again? In case it helps, these are the "Backend Connection Settings" and "Health Check" sections of my pgpool.conf:
# - Backend Connection Settings -
backend_hostname0 = 'Host-1'
# Host name or IP address to connect to for backend 0
backend_port0 = 5432
# Port number for backend 0
#backend_weight0 = 1
# Weight for backend 0 (only in load balancing mode)
#backend_data_directory0 = '/data'
# Data directory for backend 0
backend_flag0 = 'ALLOW_TO_FAILOVER'
# Controls various backend behavior
# ALLOW_TO_FAILOVER or DISALLOW_TO_FAILOVER
backend_hostname1 = 'Host-2'
# Host name or IP address to connect to for backend 1
backend_port1 = 5432
# Port number for backend 1
#backend_weight1 = 1
# Weight for backend 1 (only in load balancing mode)
#backend_data_directory1 = '/data'
# Data directory for backend 1
backend_flag1 = 'ALLOW_TO_FAILOVER'
# Controls various backend behavior
# ALLOW_TO_FAILOVER or DISALLOW_TO_FAILOVER
#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------
health_check_period = 10
# Health check period
# Disabled (0) by default
health_check_timeout = 20
# Health check timeout
# 0 means no timeout
health_check_user = 'admin'
# Health check user
health_check_password = '12345'
# Password for health check user
health_check_max_retries = 10
# Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
# Amount of time to wait (in seconds) between retries.
Answer 0 (score: 3)
Once your slave node is up and replication is running, you need to reattach the node to pgpool:
$ pcp_attach_node 10 pgpool_host 9898 admin _pcp_passwd_ 0
The last parameter is the node ID; in your case it is 0.
See http://www.pgpool.net/docs/latest/pgpool-en.html#pcp_attach_node for details.
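To confirm that the reattach worked, you can query the node state again. A minimal sketch, assuming the same old-style PCP argument order as above (timeout, host, port, user, password, node id; newer pgpool releases use flag-style options instead) and assuming pgpool itself listens on its default port 9999:
$ pcp_node_info 10 pgpool_host 9898 admin _pcp_passwd_ 0
$ psql -h pgpool_host -p 9999 -U admin -c 'show pool_nodes'
The status column for node 0 should now show 1 or 2 (node is up) instead of 3.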
Answer 1 (score: 0)
You have to bring the slave node up before you can promote it. In your case that means a full failover with Slony and rebuilding the ex-master as the new slave.
The basic problem is that the writes made to the new master have to be replicated to the old master before you can fail back. This is a Slony problem first. Once you have verified that Slony is running and everything has been replicated, you can troubleshoot the pgpool side, but not until then (and at that point you will probably need to reattach the node to pgpool). When you use pgpool in master/slave mode, pgpool is subordinate to whatever other replication system you are using.
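Putting the two answers together, the whole failback sequence might look roughly like this. It is only a sketch: the slonik script in step 2 depends entirely on your Slony set/node definitions, and pgpool_host, the PCP port 9898 and the credentials are simply carried over from the example above:
# 1. Restart PostgreSQL on Host-1 (the ex-master).
# 2. Rebuild Host-1 as a Slony subscriber of Host-2 and wait until it
#    has fully caught up (site-specific slonik script).
# 3. Only then reattach node 0 to pgpool:
$ pcp_attach_node 10 pgpool_host 9898 admin _pcp_passwd_ 0
# 4. Verify with "show pool_nodes" that node 0 is no longer status 3.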