redis-ha in Kubernetes cannot fail back to master

Time: 2018-12-12 12:27:05

Tags: redis kubernetes redis-ha

I am trying to create a simple Redis high-availability setup with 1 master, 1 slave, and 2 sentinels.

The setup works perfectly when failing over from redis-master to redis-slave. When redis-master recovers, it correctly registers itself as a slave of the new master, redis-slave.

However, when redis-slave, now acting as master, goes down, redis-master cannot return to the master role. The redis-master log goes into a loop showing:

1:S 12 Dec 11:12:35.073 * MASTER <-> SLAVE sync started
1:S 12 Dec 11:12:35.073 * Non blocking connect for SYNC fired the event.
1:S 12 Dec 11:12:35.074 * Master replied to PING, replication can continue...
1:S 12 Dec 11:12:35.075 * Trying a partial resynchronization (request 684581a36d134a6d50f1cea32820004a5ccf3b2d:285273).
1:S 12 Dec 11:12:35.076 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 12 Dec 11:12:36.081 * Connecting to MASTER 10.102.1.92:6379
1:S 12 Dec 11:12:36.081 * MASTER <-> SLAVE sync started
1:S 12 Dec 11:12:36.082 * Non blocking connect for SYNC fired the event.
1:S 12 Dec 11:12:36.082 * Master replied to PING, replication can continue...
1:S 12 Dec 11:12:36.083 * Trying a partial resynchronization (request 684581a36d134a6d50f1cea32820004a5ccf3b2d:285273).
1:S 12 Dec 11:12:36.084 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 12 Dec 11:12:37.087 * Connecting to MASTER 10.102.1.92:6379
1:S 12 Dec 11:12:37.088 * MASTER <-> SLAVE sync started
...
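
As far as I know, `-NOMASTERLINK` is the reply a Redis instance gives when it is itself configured as a replica and its own link to its master is down. So the instance answering at 10.102.1.92:6379 appears to be in the replica role too. That address is also the ClusterIP of the redis-master Service listed in this question, which would be consistent with the instance trying to sync from itself, though the log alone does not prove it. A minimal sketch for pulling the refusal reason out of such log lines follows. The helper name `psync_refusals` and the regular expression are illustrative assumptions, not part of Redis.

```python
import re

# Matches the replica-side log line that reports why PSYNC was refused.
# The pattern is an assumption based on the log excerpt in this question.
PSYNC_REFUSED = re.compile(
    r"unable to PSYNC but should be in the future: -(\w+) (.*)$"
)


def psync_refusals(log_lines):
    """Return (error_code, message) for every refused PSYNC attempt."""
    refusals = []
    for line in log_lines:
        match = PSYNC_REFUSED.search(line)
        if match:
            refusals.append((match.group(1), match.group(2).strip()))
    return refusals


# Sample lines copied from the log excerpt above.
sample = [
    "1:S 12 Dec 11:12:35.075 * Trying a partial resynchronization "
    "(request 684581a36d134a6d50f1cea32820004a5ccf3b2d:285273).",
    "1:S 12 Dec 11:12:35.076 * Master is currently unable to PSYNC but "
    "should be in the future: -NOMASTERLINK Can't SYNC while not connected "
    "with my master",
    "1:S 12 Dec 11:12:36.084 * Master is currently unable to PSYNC but "
    "should be in the future: -NOMASTERLINK Can't SYNC while not connected "
    "with my master",
]

refusals = psync_refusals(sample)
print(len(refusals), refusals[0][0])  # prints "2 NOMASTERLINK"
```

A steadily growing count with the same error code suggests the target never regains a master link, which is different from a transient refusal during a failover.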

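The Replication doc states that:

Since Redis 4.0, when an instance is promoted to master after a failover, it will still be able to perform a partial resynchronization with the slaves of the old master.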

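But the log seems to show otherwise. A more detailed version of the log, showing both the first failover from redis-master to redis-slave and the subsequent failover from redis-slave back to redis-master, is available here.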

Any idea what is going on? What do I have to do to let redis-master return to the master role? Configuration details are below:

Services

NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
redis-master     ClusterIP   10.102.1.92     <none>        6379/TCP    11m
redis-slave      ClusterIP   10.107.0.73     <none>        6379/TCP    11m
redis-sentinel   ClusterIP   10.110.128.95   <none>        26379/TCP   11m

redis-master config

requirepass test1234
masterauth test1234
dir /data

tcp-keepalive 60
maxmemory-policy noeviction
appendonly no
bind 0.0.0.0
save 900 1
save 300 10
save 60 10000

slave-announce-ip redis-master.fp8-cache
slave-announce-port 6379

redis-slave config

requirepass test1234
slaveof redis-master.fp8-cache 6379
masterauth test1234
dir /data

tcp-keepalive 60
maxmemory-policy noeviction
appendonly no
bind 0.0.0.0
save 900 1
save 300 10
save 60 10000

slave-announce-ip redis-slave.fp8-cache
slave-announce-port 6379

1 answer:

Answer 0 (score: 0)

It turns out the problem is related to the use of hostnames instead of IPs:

slaveof redis-master.fp8-cache 6379
...
slave-announce-ip redis-slave.fp8-cache

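So, when the master comes back as a slave, Sentinel shows that there are now 2 slaves: one listed by IP address and another by hostname. I am not sure exactly how these 2 slave entries (which point to the same Redis server) cause the problem above. Now that I have changed the config to use IP addresses instead of hostnames, Redis HA works flawlessly.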
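
The duplicate-entry symptom described above can be checked mechanically. The sketch below assumes the slave list has already been gathered from Sentinel (for example, from the output of `SENTINEL slaves <master-name>`) as `(host, port)` pairs. It groups entries by resolved address and reports any address that appears under more than one name. The function names, the sample data, and the stub DNS table are hypothetical and only for illustration. In particular, the address a hostname resolves to in a real cluster may differ from the one used here.

```python
import socket
from collections import defaultdict


def resolve(host):
    """Resolve a hostname to an IPv4 address (IP literals pass through)."""
    return socket.gethostbyname(host)


def find_duplicate_replicas(replicas, resolver=resolve):
    """Group (host, port) entries by resolved address.

    Returns only the groups that contain more than one entry, i.e. the
    same server listed under several names.
    """
    groups = defaultdict(list)
    for host, port in replicas:
        groups[(resolver(host), port)].append(host)
    return {addr: hosts for addr, hosts in groups.items() if len(hosts) > 1}


# Hypothetical data: a stub DNS table stands in for the cluster resolver,
# and the two entries mimic one server listed by IP and by hostname.
fake_dns = {"redis-master.fp8-cache": "10.102.1.92"}
replicas = [("10.102.1.92", 6379), ("redis-master.fp8-cache", 6379)]

duplicates = find_duplicate_replicas(
    replicas, resolver=lambda host: fake_dns.get(host, host)
)
print(duplicates)
# prints {('10.102.1.92', 6379): ['10.102.1.92', 'redis-master.fp8-cache']}
```

Passing the resolver as a parameter keeps the check testable outside the cluster. With the default `resolve`, it would have to run somewhere that can resolve the same names the Redis pods see.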