Logs

Date: 2018-04-23 12:04:57

Tags: corosync

We have a 2-node cluster with IP addresses 10.150.5.179 (trpventus01) and 10.150.5.180 (trpventus02), and a virtual IP (ventusproxyVIP) at 10.150.5.181. The nodes run Red Hat 7.2 with Corosync 2.3.4.
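For reference, a minimal corosync.conf sketch matching this layout (node IDs 1 and 2 match the quorum messages in the log below; the cluster name and transport are assumptions, not copied from our real config):

    totem {
        version: 2
        cluster_name: ventuscluster     # assumed name
        transport: udpu                 # assumed unicast transport
    }

    nodelist {
        node {
            ring0_addr: 10.150.5.179    # trpventus01
            nodeid: 1
        }
        node {
            ring0_addr: 10.150.5.180    # trpventus02
            nodeid: 2
        }
    }

    quorum {
        provider: corosync_votequorum
        two_node: 1                     # typical for 2-node clusters
    }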

We were getting 'martian source' messages in the logs. This happened because the netmask of the virtual IP was wrong: it was /24 and should have been /16. That caused network problems, which have since been resolved.
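The fix was along these lines (assuming the VIP is a pcs-managed IPaddr2 resource, which is what the IPaddr2(ventusproxyVIP) lines in the log below suggest):

    # correct the netmask on the cluster-managed VIP
    pcs resource update ventusproxyVIP cidr_netmask=16
    # verify the change (pcs 0.9 syntax on RHEL 7)
    pcs resource show ventusproxyVIP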

But we still see these messages in the logs. Everything works fine, but usually around 13:31 (not every day) these lines show up in our logs. We have no process scheduled at that time, so we are not sure why this happens.
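For completeness, these are the standard places to check for a job scheduled at that time on RHEL 7:

    # systemd timers with their last/next trigger times
    systemctl list-timers --all
    # system-wide and per-user cron entries
    grep -r "" /etc/crontab /etc/cron.d/ /var/spool/cron/ 2>/dev/null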

At that moment, at 13:31:11, traffic moves from trpventus01 to trpventus02, and at 13:31:13 it moves back to the trpventus01 node.

Apr 17 13:31:06 trpventus01 corosync[2351]: [MAIN  ] Corosync main process was not scheduled for 1092.4456 ms (threshold is 800.0000 ms). Consider token timeout increase.
Apr 17 13:31:06 trpventus01 corosync[2351]: [TOTEM ] A processor failed, forming new configuration.
Apr 17 13:31:06 trpventus01 corosync[2351]: [TOTEM ] A new membership (10.150.5.179:2012) was formed. Members
Apr 17 13:31:06 trpventus01 corosync[2351]: [QUORUM] Members[2]: 1 2
Apr 17 13:31:06 trpventus01 corosync[2351]: [MAIN  ] Completed service synchronization, ready to provide service.
Apr 17 13:31:13 trpventus01 corosync[2351]: [MAIN  ] Corosync main process was not scheduled for 5102.9751 ms (threshold is 800.0000 ms). Consider token timeout increase.
Apr 17 13:31:13 trpventus01 corosync[2351]: [TOTEM ] A processor failed, forming new configuration.
Apr 17 13:31:13 trpventus01 corosync[2351]: [TOTEM ] A new membership (10.150.5.179:2020) was formed. Members joined: 2 left: 2
Apr 17 13:31:13 trpventus01 corosync[2351]: [TOTEM ] Failed to receive the leave message. failed: 2
Apr 17 13:31:13 trpventus01 cib[3029]:  notice: crm_update_peer_proc: Node trpventus02[2] - state is now lost (was member)
Apr 17 13:31:13 trpventus01 crmd[3035]:  notice: Our peer on the DC (trpventus02) is dead
Apr 17 13:31:13 trpventus01 cib[3029]:  notice: Removing trpventus02/2 from the membership list
Apr 17 13:31:13 trpventus01 cib[3029]:  notice: Purged 1 peers with id=2 and/or uname=trpventus02 from the membership cache
Apr 17 13:31:13 trpventus01 cib[3029]:  notice: crm_update_peer_proc: Node trpventus02[2] - state is now member (was (null))
Apr 17 13:31:13 trpventus01 corosync[2351]: [QUORUM] Members[2]: 1 2
Apr 17 13:31:13 trpventus01 corosync[2351]: [MAIN  ] Completed service synchronization, ready to provide service.
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: crm_update_peer_proc: Node trpventus02[2] - state is now lost (was member)
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Removing all trpventus02 attributes for attrd_peer_change_cb
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Lost attribute writer trpventus02
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Removing trpventus02/2 from the membership list
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Purged 1 peers with id=2 and/or uname=trpventus02 from the membership cache
Apr 17 13:31:13 trpventus01 stonith-ng[3030]:  notice: crm_update_peer_proc: Node trpventus02[2] - state is now lost (was member)
Apr 17 13:31:13 trpventus01 stonith-ng[3030]:  notice: Removing trpventus02/2 from the membership list
Apr 17 13:31:13 trpventus01 stonith-ng[3030]:  notice: Purged 1 peers with id=2 and/or uname=trpventus02 from the membership cache
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: crm_update_peer_proc: Node trpventus02[2] - state is now member (was (null))
Apr 17 13:31:13 trpventus01 stonith-ng[3030]:  notice: crm_update_peer_proc: Node trpventus02[2] - state is now member (was (null))
Apr 17 13:31:13 trpventus01 crmd[3035]:  notice: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_CRMD_STATUS_CALLBACK origin=peer_update_callback ]
Apr 17 13:31:13 trpventus01 crmd[3035]:  notice: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Updating all attributes after cib_refresh_notify event
Apr 17 13:31:13 trpventus01 crmd[3035]:  notice: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Recorded attribute writer: trpventus02
Apr 17 13:31:13 trpventus01 attrd[3032]:  notice: Processing sync-response from trpventus02
Apr 17 13:31:13 trpventus01 IPaddr2(ventusproxyVIP)[2302]: INFO: IP status = ok, IP_CIP=
Apr 17 13:31:13 trpventus01 crmd[3035]:  notice: Operation ventusproxyVIP_stop_0: ok (node=trpventus01, call=102, rc=0, cib-update=595, confirmed=true)
Apr 17 13:31:13 trpventus01 IPaddr2(ventusproxyVIP)[2359]: INFO: Adding inet address 10.150.5.181/16 with broadcast address 10.150.255.255 to device eno16780032
Apr 17 13:31:13 trpventus01 IPaddr2(ventusproxyVIP)[2359]: INFO: Bringing device eno16780032 up
Apr 17 13:31:13 trpventus01 IPaddr2(ventusproxyVIP)[2359]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.150.5.181 eno16780032 10.150.5.181 auto not_used not_used
Apr 17 13:31:13 trpventus01 crmd[3035]:  notice: Operation ventusproxyVIP_start_0: ok (node=trpventus01, call=103, rc=0, cib-update=596, confirmed=true)
Apr 17 13:31:14 trpventus01 kernel: IPv4: martian source 10.150.5.181 from 10.150.5.181, on dev eno16780032
Apr 17 13:31:14 trpventus01 kernel: ll header: 00000000: ff ff ff ff ff ff 00 50 56 83 4e 49 08 06        .......PV.NI..
Apr 17 13:31:15 trpventus01 kernel: IPv4: martian source 10.150.5.181 from 10.150.5.181, on dev eno16780032
Apr 17 13:31:15 trpventus01 kernel: ll header: 00000000: ff ff ff ff ff ff 00 50 56 83 4e 49 08 06        .......PV.NI..
Apr 17 13:33:01 trpventus01 chronyd[1236]: Source 213.251.52.234 replaced with 193.145.15.15

It seems this is triggered by "Corosync main process was not scheduled for 1092.4456 ms (threshold is 800.0000 ms). Consider token timeout increase." at 13:31:06, followed by "Corosync main process was not scheduled for 5102.9751 ms (threshold is 800.0000 ms). Consider token timeout increase." at 13:31:13.
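The 800 ms threshold reported there is 80% of the token timeout, so we appear to be running the corosync 2.x default of token: 1000. If raising it is the right mitigation, the change would look roughly like this (the 5000 ms value is an illustration, not a tested recommendation):

    # /etc/corosync/corosync.conf, edited on both nodes
    totem {
        version: 2
        token: 5000    # ms; the default 1000 yields the 800 ms threshold above
    }

    # restart the stack to apply
    pcs cluster stop --all
    pcs cluster start --all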

What could be causing this? Any help would be greatly appreciated.
