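What happens? When I start keepalived, everything works fine. When node01 fails and can no longer start postgresql, it keeps trying to force an election, even though postgresql cannot start. An election now happens every second.
What I want to achieve: while node02 is master, it should check whether postgresql can be started on node01, but it must not keep forcing elections. Can someone help me get this right?
Here is my code.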
stop-pgsql:
#!/usr/bin/python
import sys
import subprocess

sys.exit(
    subprocess.call(['/usr/bin/systemctl', 'stop', 'postgresql.service'])
)
notify:
#!/usr/bin/python
import sys
import subprocess

# keepalived passes the new VRRP state as the third argument.
state = sys.argv[3]

# Record the state so the check script can read it.
with open('/var/run/keepalived.pgsql.state', 'w+') as f:
    f.write(state)

if state == 'MASTER':
    sys.exit(
        subprocess.call(['/usr/bin/systemctl', 'start', 'postgresql.service'])
    )
if state == 'BACKUP':
    sys.exit(
        subprocess.call(['/usr/bin/systemctl', 'stop', 'postgresql.service'])
    )
if state == 'FAULT':
    sys.exit(
        subprocess.call(['/usr/bin/systemctl', 'stop', 'postgresql.service'])
    )
check-pgsql:
#!/usr/bin/python
import sys
import subprocess
from time import sleep

sleep(1)

with open('/var/run/keepalived.pgsql.state', 'r') as f:
    state = f.read().strip().strip("\n")

# status 0: Postgresql is running
# status 3: Postgresql has been stopped
status = subprocess.call(['/usr/bin/systemctl', 'status', 'postgresql.service'])

if status == 0 and state == 'MASTER':
    sys.exit(0)
if status == 0 and state == 'BACKUP':
    sys.exit(3)
if status == 3 and state == 'MASTER':
    sys.exit(3)
if status == 3 and state == 'BACKUP':
    sys.exit(0)
keepalived config:
vrrp_script chk_pgsql {
    script "/etc/keepalived/check-pgsql"
    interval 1
    fall 3
    rise 3
    weight -4
}
vrrp_instance pgsql_vip {
    state EQUAL
    interface eth0
    virtual_router_id 4
    priority 100(node01)|99(node02)
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_pgsql
    }
    virtual_ipaddress {
        192.168.1.20
    }
    notify "/etc/keepalived/notify"
    notify_stop "/etc/keepalived/stop"
}
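Answer 0 (score: 0)
After node01 dies, node02 gets elected master. With weight -4, the failed check drops node01's priority from 100 to 96, which is below node02's 99. The check script then keeps running on node01. It sees that node01 is now in the BACKUP state and postgresql is stopped, so it returns 0. After the check script has returned 0 three times (rise 3 in your VRRP config), node01 considers itself healthy and its priority goes back to 100. Since node01 now has a higher priority than node02 again, it takes over through the election process. The check script then fails, because node01 is in the MASTER state and postgresql is stopped. This causes keepalived to flap between the nodes.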
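The cycle described above can be traced with a small simulation. This is only a rough sketch of the priority arithmetic, not keepalived's actual implementation: it assumes one check per tick, that node02 is always healthy and advertises priority 99, and that postgresql never starts on node01. The function names check_exit_code and simulate_node01 are made up for illustration.

```python
# Simplified model of the flapping cycle; not keepalived's real logic.
BASE_PRIORITY = {"node01": 100, "node02": 99}  # from the question's config
WEIGHT = -4  # vrrp_script weight
RISE = 3     # successes needed before the script counts as OK
FALL = 3     # failures needed before the script counts as failed


def check_exit_code(state, postgres_running):
    """Mirror the decision table of the question's check-pgsql script."""
    if postgres_running:
        return 0 if state == "MASTER" else 3
    return 3 if state == "MASTER" else 0


def simulate_node01(ticks):
    """Trace node01's state when postgresql can never start on it."""
    state = "MASTER"   # node01 holds the VIP at the moment postgresql dies
    script_ok = True   # the tracked script's current status
    streak = 0         # consecutive results that contradict script_ok
    history = []
    for _ in range(ticks):
        passed = check_exit_code(state, postgres_running=False) == 0
        if passed == script_ok:
            streak = 0
        else:
            streak += 1
            if streak >= (RISE if passed else FALL):
                script_ok = passed
                streak = 0
        priority = BASE_PRIORITY["node01"] + (0 if script_ok else WEIGHT)
        # Assumption: node02 is healthy and always advertises its base priority.
        state = "MASTER" if priority > BASE_PRIORITY["node02"] else "BACKUP"
        history.append((state, priority))
    return history


if __name__ == "__main__":
    for tick, (state, priority) in enumerate(simulate_node01(12), start=1):
        print(tick, state, priority)
```

In this model node01 keeps alternating between MASTER and BACKUP, because the check reports success in the BACKUP state whether or not postgresql could actually be started.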
I think you can fix this in one of two ways: