Redis multi-Sentinel setup does not elect a new Redis master after failure

Date: 2020-01-15 05:54:17

Tags: redis sentinel redis-cluster redis-sentinel

I have three Redis nodes and three Sentinels. Everything works: all masters and replicas have been verified, and the Sentinel configuration files are updated on every Redis and Sentinel node. The problem is that when the Redis master goes down, the Sentinels keep trying to restore the failed master and never elect a new master from the remaining replicas. Here are my configuration files and logs.

vm1: Redis master and sentinel1, 192.168.1.48

vm2: Redis replica and sentinel2, 192.168.1.51

vm3: Redis replica and sentinel3, 192.168.1.52

Redis master configuration file (vm1):

bind 192.168.1.48 127.0.0.1 
protected-mode yes
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 60
daemonize no
supervised systemd 
pidfile /var/run/redis_6379.pid
loglevel notice
logfile ""
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis 
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
requirepass 123456789
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes
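
Only the master's redis.conf is shown above. For a replica to be promotable in this setup, its own redis.conf must at minimum point at the master and carry the master's password. A minimal sketch of the replica-side lines for vm2, assuming the same requirepass on every node:

replicaof 192.168.1.48 6379
masterauth 123456789
requirepass 123456789
replica-priority 100

Without masterauth, a replica of a password-protected master never reaches master_link_status:up, and Sentinel will refuse to promote such a replica.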

One of the Sentinel configuration files (vm1):

#1
bind 0.0.0.0
port 26379
#2
sentinel myid 7e09f70bc68cdc0afee3d8cd9bdf3fe6f320a3d5
sentinel deny-scripts-reconfig yes
sentinel monitor redis-cluster 192.168.1.48 6379 3
sentinel down-after-milliseconds redis-cluster 5000
#3
sentinel failover-timeout redis-cluster 1000
sentinel parallel-syncs redis-cluster 1

#misc
daemonize yes
pidfile "/var/run/redis_26379.pid"
logfile "/var/log/redis_26379.log"
dir "/var/lib/redis"

##############
# Generated by CONFIG REWRITE
protected-mode no
sentinel auth-pass redis-cluster 123456789
sentinel config-epoch redis-cluster 0
sentinel leader-epoch redis-cluster 52
sentinel known-replica redis-cluster 192.168.1.51 6379
sentinel known-replica redis-cluster 192.168.1.52 6379
sentinel known-sentinel redis-cluster 192.168.1.52 26379 0b37aa7287e89ad38a90a97cdff16c22793678a6
sentinel known-sentinel redis-cluster 192.168.1.51 26379 9d097bb22ffdf87c7f8a403a8dc82c989790cf3b
sentinel current-epoch 52

The other Sentinel configuration file (vm2):

#1
bind 0.0.0.0
port 26379
#2
sentinel myid 9d097bb22ffdf87c7f8a403a8dc82c989790cf3b

daemonize yes
pidfile "/var/run/redis_26379.pid"
sentinel deny-scripts-reconfig yes
sentinel monitor redis-cluster 192.168.1.48 6379 2
sentinel down-after-milliseconds redis-cluster 5000
sentinel failover-timeout redis-cluster 10000
#3
sentinel auth-pass redis-cluster 123456789
#misc
logfile "/var/log/redis_26379.log"
dir "/var/lib/redis"
# Generated by CONFIG REWRITE
protected-mode no
sentinel config-epoch redis-cluster 0
sentinel leader-epoch redis-cluster 28811
sentinel known-replica redis-cluster 192.168.1.51 6379
sentinel known-replica redis-cluster 192.168.1.52 6379
sentinel known-sentinel redis-cluster 192.168.1.52 26379 0b37aa7287e89ad38a90a97cdff16c22793678a6
sentinel known-sentinel redis-cluster 192.168.1.48 26379 7e09f70bc68cdc0afee3d8cd9bdf3fe6f320a3d5
sentinel current-epoch 28811
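
Note that the two Sentinel files disagree: vm1 monitors with quorum 3 and failover-timeout 1000, while vm2 uses quorum 2 and failover-timeout 10000. Whether the Sentinels as deployed can still reach the quorum needed to authorize a failover can be checked with (a sketch, assuming the default Sentinel port):

redis-cli -h 192.168.1.51 -p 26379 sentinel ckquorum redis-cluster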

Sentinel log after the master failure (vm2):

2692:X 15 Jan 2020 09:19:42.576 # +vote-for-leader 0b37aa7287e89ad38a90a97cdff16c22793678a6 28804
2692:X 15 Jan 2020 09:19:42.582 # Next failover delay: I will not start a failover before Wed Jan 15 09:20:02 2020
2692:X 15 Jan 2020 09:20:02.659 # +new-epoch 28805
2692:X 15 Jan 2020 09:20:02.660 # +try-failover master redis-cluster 192.168.1.48 6379
2692:X 15 Jan 2020 09:20:02.662 # +vote-for-leader 9d097bb22ffdf87c7f8a403a8dc82c989790cf3b 28805
2692:X 15 Jan 2020 09:20:02.674 # 0b37aa7287e89ad38a90a97cdff16c22793678a6 voted for 9d097bb22ffdf87c7f8a403a8dc82c989790cf3b 28805
2692:X 15 Jan 2020 09:20:02.745 # +elected-leader master redis-cluster 192.168.1.48 6379
2692:X 15 Jan 2020 09:20:02.745 # +failover-state-select-slave master redis-cluster 192.168.1.48 6379
2692:X 15 Jan 2020 09:20:02.846 # -failover-abort-no-good-slave master redis-cluster 192.168.1.48 6379
2692:X 15 Jan 2020 09:20:02.902 # Next failover delay: I will not start a failover before Wed Jan 15 09:20:22 2020
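
The decisive event here is -failover-abort-no-good-slave: the Sentinel that won the leader election found no replica it considered eligible for promotion (reachable, recently in sync with the failed master, and with a non-zero priority). How Sentinel currently rates each replica can be listed with (a sketch; SENTINEL SLAVES on versions before Redis 5):

redis-cli -h 192.168.1.51 -p 26379 sentinel replicas redis-cluster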

2 Answers:

Answer 0 (score: 1)

I know this answer is a bit late, but based on my experience, here are some insights.

The problem lies in the replica priority.

Several conditions determine whether a master election can take place:

  • REPLICA PRIORITY should be set for every Redis node in redis.conf, and it must be non-zero for that node to be eligible to become master (the lower the number, the higher the priority; a priority of 0 makes a node permanently ineligible). See the runtime sketch after this list.
  • The number of nodes in the cluster should always be odd.
  • The total number of live nodes must form a MAJORITY.
  • The quorum (the threshold set in the Sentinel configuration, indicating how many Sentinels must agree) must concur that the master has failed.
  • There should be at least one good replica (one without a bad history of repeated failures, etc.) that everyone can agree to promote to master.

Once all of the above conditions are met, the eligible node with the best priority becomes the new master.
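
A minimal sketch of checking and adjusting a replica's priority at runtime (host and password taken from the question; CONFIG SET takes effect immediately, but the value should also be written to redis.conf to survive a restart):

redis-cli -h 192.168.1.51 -p 6379 -a 123456789 config get replica-priority
redis-cli -h 192.168.1.51 -p 6379 -a 123456789 config set replica-priority 50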

PS: Beyond the above, Redis applies some additional logic to make the decision more robust and reliable. More details can be found in the official Redis Sentinel documentation.

Answer 1 (score: 0)

When this happens, you can connect to the replicas, run the INFO command, and verify their configuration.
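
A sketch of that check, with the addresses and password taken from the question:

redis-cli -h 192.168.1.51 -p 6379 -a 123456789 info replication

Fields worth verifying in the output: role (the node should still report slave), master_link_status (up means it is actually syncing from the master), and slave_priority (must be non-zero for the node to be promotable).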