pgpool-II: is it possible to promote a node to master more than once?

Time: 2013-10-05 14:03:51

Tags: database postgresql fallback pgpool

I have this pgpool configuration: "Host-1" as master and "Host-2" as slave. If "Host-1" goes down, pgpool correctly promotes "Host-2" to master; but when "Host-1" comes back, pgpool does not notice it, and if "Host-2" then fails, pgpool does not promote "Host-1" to master even though "Host-1" is online. I have health_check enabled, but it seems completely useless, because the status of "Host-1" (after it comes back up) is always 3 = "Node is down".

This is the output of the command "show pool_nodes" during these events:

-> Initial situation: "Host-1" UP (master), "Host-2" UP (slave)

 node_id | hostname | port | status | lb_weight |  role
---------+----------+------+--------+-----------+--------
 0       | Host-1   | 5432 | 2      | nan       | master
 1       | Host-2    | 5432 | 1      | nan       | slave

-> Node 0 goes down: "Host-1" DOWN, "Host-2" UP

 node_id | hostname | port | status | lb_weight |  role
---------+----------+------+--------+-----------+--------
 0       | Host-1   | 5432 | 3      | nan       | slave
 1       | Host-2    | 5432 | 2      | nan       | master

-> Node 0 comes back: "Host-1" UP, "Host-2" UP

 node_id | hostname | port | status | lb_weight |  role
---------+----------+------+--------+-----------+--------
 0       | Host-1   | 5432 | 3      | nan       | slave
 1       | Host-2    | 5432 | 2      | nan       | master

Note that "Host-1" still has status 3, meaning "Node is down".

-> Node 1 goes down: "Host-1" UP, "Host-2" DOWN: at this point I cannot connect to the database at all, even though node 0 is up and running!
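
For reference, the values in the "status" column of "show pool_nodes" map as follows (per the pgpool-II documentation):

 0 : used only during initialization; never displayed
 1 : node is up, no connections have been made yet
 2 : node is up, connections are pooled
 3 : node is down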

What do I have to do to allow pgpool to promote node 0 to master again? In case it is useful, these are the "Backend Connection Settings" and "Health Check" sections of my pgpool.conf:

# - Backend Connection Settings -

backend_hostname0 = 'Host-1'
                                   # Host name or IP address to connect to for backend 0
backend_port0 = 5432
                                   # Port number for backend 0
#backend_weight0 = 1
                                   # Weight for backend 0 (only in load balancing mode)
#backend_data_directory0 = '/data'
                                   # Data directory for backend 0
backend_flag0 = 'ALLOW_TO_FAILOVER'
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER or DISALLOW_TO_FAILOVER

backend_hostname1 = 'Host-2'
                                   # Host name or IP address to connect to for backend 1
backend_port1 = 5432
                                   # Port number for backend 1
#backend_weight1 = 1
                                   # Weight for backend 1 (only in load balancing mode)
#backend_data_directory1 = '/data'
                                   # Data directory for backend 1
backend_flag1 = 'ALLOW_TO_FAILOVER'
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER or DISALLOW_TO_FAILOVER

#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------

health_check_period = 10
                                   # Health check period
                                   # Disabled (0) by default
health_check_timeout = 20
                                   # Health check timeout
                                   # 0 means no timeout
health_check_user = 'admin'
                                   # Health check user
health_check_password = '12345'
                                   # Password for health check user
health_check_max_retries = 10
                                   # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
                                   # Amount of time to wait (in seconds) between retries.
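
Not shown in this excerpt are pgpool's failover hooks. Note that health_check only detects state changes; it never re-attaches a recovered node by itself. For context, a sketch of the related settings from the "FAILOVER AND FAILBACK" section of pgpool.conf (the script paths here are hypothetical placeholders):

failover_command = '/etc/pgpool-II/failover.sh %d %h %m %H'
                                   # Command to run when a node is detached
                                   # %d = failed node id, %h = failed host,
                                   # %m = new master node id, %H = new master host
failback_command = '/etc/pgpool-II/failback.sh %d %h'
                                   # Command to run when a node is (re)attached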

2 Answers:

Answer 0 (score: 3)

Once your slave node is up and replication is running, you need to re-attach the node to pgpool:

$ pcp_attach_node 10 pgpool_host 9898 admin _pcp_passwd_ 0

The last parameter is the node ID; in your case it is 0.

See http://www.pgpool.net/docs/latest/pgpool-en.html#pcp_attach_node for details.
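
A minimal sketch of automating this re-attach, assuming the same pcp endpoint as in the command above (pgpool_host, pcp port 9898, user admin) and that the recovered backend is Host-1 / node 0; all connection details are placeholders:

#!/bin/sh
# Re-attach node 0 once its PostgreSQL instance answers queries again.
# Hosts, ports and credentials below are placeholders, not real values.
if psql -h Host-1 -p 5432 -U admin -c 'SELECT 1' postgres >/dev/null 2>&1; then
    # pcp_attach_node TIMEOUT PGPOOL_HOST PCP_PORT PCP_USER PCP_PASSWD NODE_ID
    pcp_attach_node 10 pgpool_host 9898 admin _pcp_passwd_ 0
fi

Only run this after replication has caught up; see the next answer for why.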

Answer 1 (score: 0)

You have to bring the slave node up to date before it can be promoted. In your case that means failing over fully with Slony and rebuilding the former master as a new slave.

The basic problem is that writes made to the new master must be replicated over to the old master before you can fail back. That is a Slony issue first and foremost. Once you have verified that Slony is running and everything has been replicated, you can troubleshoot the pgpool side, but not until then (and at that point you will probably need to re-attach the node to pgpool). When running in master/slave mode, pgpool is subordinate to whatever other replication system you are using.
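
Put together, a hedged outline of the full failback cycle might look like this (the Slony steps depend entirely on your slonik setup and are only sketched as comments; the pcp arguments repeat the placeholders from the first answer):

# 1. Rebuild the former master (Host-1) as a Slony subscriber of the
#    current master (Host-2) and wait for replication to catch up.
# 2. Verify on the Slony side that all events are confirmed.
# 3. Only then tell pgpool that the node is usable again:
pcp_attach_node 10 pgpool_host 9898 admin _pcp_passwd_ 0
# 4. To make Host-1 master again, fail over deliberately in the other
#    direction and rebuild Host-2 as the new slave.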