redis master没有看到奴隶

时间:2018-01-30 23:01:55

标签: docker redis redis-sentinel

我的redis sentinel设置有问题,它有4个节点(1个主节点,3个从节点)。我修补了第一个从节点(docker版本从17.03.1-ce更改为17.12.0-ce)。我的问题是master不再将slave node1带到成员池。

Slave(node1)info(它识别主节点):

$ docker exec -it redis-sentinel redis-cli info replication
 # Replication
 role:slave
 master_host:<master_ip>
 master_port:6379
 master_link_status:down

主信息:

$ docker exec -it redis-sentinel redis-cli info replication
# Replication
role:master
connected_slaves:2    
slave0:ip=<slave_2_ip>,port=6379,state=online,offset=191580670534,lag=0   
slave1:ip=<slave_3_ip>,port=6379,state=online,offset=191580666435,lag=0
master_repl_offset:191580672343

师父必须有3个奴隶。 node1上的主IP是正确的(修补了什么)。节点2,3,4泊坞窗版本为17.03.1-ce。当我在开发中测试相同的情况时 - 一切正常。您能否建议我需要做些什么来启用主节点1和从节点1之间的复制?

Docker重启后(@ node1)我看到类似的东西(msg =“未知容器”):

Jan 31 08:16:12 node1 dockerd[17288]: time="2018-01-31T08:16:12.150892519+02:00" level=warning msg="unknown container" container=23e48b7846bd325ba5af772217085b60708660f5f5d8bb6fefd23094235ac01f module=libcontainerd namespace=plugins.moby
Jan 31 08:16:12 node1 dockerd[17288]: time="2018-01-31T08:16:12.177513187+02:00" level=warning msg="unknown container" container=23e48b7846bd325ba5af772217085b60708660f5f5d8bb6fefd23094235ac01f module=libcontainerd namespace=plugins.moby

当我检查node4主日志时,我看到node1已转换为slave:

1:X 30 Jan 21:35:09.301 # +sdown sentinel 66f6a8950a72952ac7df18f6a653718445fad5db node1_slave 26379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:35:10.276 # +sdown slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:58:10.388 * +reboot slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:58:10.473 # -sdown slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:58:10.473 # -sdown sentinel 66f6a8950a72952ac7df18f6a653718445fad5db node1_slave 26379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:58:20.436 * +convert-to-slave slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:58:30.516 * +convert-to-slave slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 21:58:40.529 * +convert-to-slave slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 22:39:48.284 * +reboot slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 22:39:58.391 * +convert-to-slave slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379
1:X 30 Jan 22:40:08.447 * +convert-to-slave slave node1_slave:6379 node1_slave 6379 @ sentinel-xx node4_master 6379

另一方面,redis-client日志显示它无法将数据库保存在磁盘上。

$ docker logs --follow redis-client
1:M 31 Jan 07:47:09.451 * Slave node3_slave:6379 asks for synchronization
1:M 31 Jan 07:47:09.451 * Full resync requested by slave node3_slave:6379
1:M 31 Jan 07:47:09.451 * Starting BGSAVE for SYNC with target: disk
1:M 31 Jan 07:47:09.452 # Can't save in background: fork: Out of memory
1:M 31 Jan 07:47:09.452 # BGSAVE for replication failed
1:M 31 Jan 07:47:24.628 * Slave node1_slave:6379 asks for synchronization
1:M 31 Jan 07:47:24.628 * Full resync requested by slave node1_slave:6379
1:M 31 Jan 07:47:24.628 * Starting BGSAVE for SYNC with target: disk
1:M 31 Jan 07:47:24.628 # Can't save in background: fork: Out of memory
1:M 31 Jan 07:47:24.628 # BGSAVE for replication failed
1:M 31 Jan 07:48:10.560 * Slave node3_slave:6379 asks for synchronization
1:M 31 Jan 07:48:10.560 * Full resync requested by slave node3_slave:6379
1:M 31 Jan 07:48:10.560 * Starting BGSAVE for SYNC with target: disk

1 个答案:

答案 0 :(得分:0)

通过将vm.overcommit_memory切换为1来解决问题。

sysctl vm.overcommit_memory=1

感谢yanhan comment

现在日志就像那样:

1:M 31 Jan 07:48:10.560 * Slave node2_slave:6379 asks for synchronization
1:M 31 Jan 07:48:10.560 * Full resync requested by slave node2_slave:6379
1:M 31 Jan 07:48:10.560 * Starting BGSAVE for SYNC with target: disk
1:M 31 Jan 07:48:10.569 * Background saving started by pid 16
1:M 31 Jan 07:49:15.773 # Connection with slave client id #388090 lost.
1:M 31 Jan 07:49:16.219 # Connection with slave node2_slave:6379 lost.
1:M 31 Jan 07:49:25.394 * Slave node1_slave:6379 asks for synchronization
1:M 31 Jan 07:49:25.395 * Full resync requested by slave node1_slave:6379
1:M 31 Jan 07:49:25.395 * Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC
1:S 31 Jan 07:49:35.421 # Connection with slave node1_slave:6379 lost.
1:S 31 Jan 07:49:35.518 * SLAVE OF node2_slave:6379 enabled (user request from 'id=395598 addr=node2_slave:33026 fd=7 name=sentinel-52caa67d-cmd age=10 idle=0 flags=x db=0     sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:S 31 Jan 07:49:36.121 * Connecting to MASTER node2_slave:6379
1:S 31 Jan 07:49:36.122 * MASTER <-> SLAVE sync started
1:S 31 Jan 07:49:36.135 * Non blocking connect for SYNC fired the event.
1:S 31 Jan 07:49:36.138 * Master replied to PING, replication can continue...
1:S 31 Jan 07:49:36.147 * Partial resynchronization not possible (no cached master)
1:S 31 Jan 07:49:36.153 * Full resync from master: f15e28b26604bda49ad515b38cba2639ee8e13bc:191935552685
1:S 31 Jan 07:49:46.523 * MASTER <-> SLAVE sync: receiving 1351833877 bytes from master
1:S 31 Jan 07:49:57.888 * MASTER <-> SLAVE sync: Flushing old data
16:C 31 Jan 07:50:17.083 * DB saved on disk
16:C 31 Jan 07:50:17.114 * RDB: 3465 MB of memory used by copy-on-write
1:S 31 Jan 07:51:22.749 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 31 Jan 07:51:46.609 * MASTER <-> SLAVE sync: Finished with success
1:S 31 Jan 07:51:46.609 * Background saving terminated with success