Rancher keeps restarting MongoDB

Time: 2017-01-22 12:34:52

Tags: mongodb rancher

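After upgrading Rancher from 1.1 to 1.3, we started having problems running our MongoDB cluster. For no apparent reason, Rancher keeps restarting at least one node of the MongoDB cluster, claiming the service is incomplete. Below is a snippet of the Rancher log (it is in reverse order, so start reading from the end):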

01:19:17 PM INFO    service.trigger.info    Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0    
01:19:17 PM INFO    service.trigger.info    Service already reconciled   
01:19:16 PM INFO    service.trigger Re-evaluating state  
01:19:16 PM INFO    service.trigger (1 sec) Re-evaluating state  
01:19:16 PM INFO    service.trigger.info    Service reconciled: Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0    
01:19:16 PM INFO    service.update.info Service already reconciled   
01:19:16 PM INFO    service.update  Updating service     
01:19:16 PM INFO    service.update.info Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0    
01:19:16 PM INFO    service.trigger.exception   Busy processing [SERVICE.280] will try later     
01:19:03 PM INFO    service.update  Updating service     
01:19:03 PM INFO    service.update.exception    Busy processing [SERVICE.280] will try later     
01:19:02 PM INFO    service.trigger.wait (14 sec)   Waiting for instances to start   
01:19:02 PM INFO    service.instance.create Creating extra service instance  
01:19:02 PM INFO    service.instance.create Creating extra service instance  
01:19:01 PM INFO    service.trigger (15 sec)    Re-evaluating state  
01:19:01 PM INFO    service.trigger.info    Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 1

The problem always starts with Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 1
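To find every such event in a longer Rancher log, something like the following Python sketch could help. It assumes the status lines look exactly like the ones pasted above; the function name `find_incomplete` and the sample list are only for illustration and are not part of Rancher.

```python
import re

# Matches the counters in a service status line, in the format shown in the log above.
STATUS_RE = re.compile(
    r"Requested: (\d+), Created: (\d+), Unhealthy: (\d+), Bad: (\d+), Incomplete: (\d+)"
)
NAMES = ("requested", "created", "unhealthy", "bad", "incomplete")


def find_incomplete(lines):
    """Return (line, counters) pairs for status lines where Incomplete is non-zero."""
    hits = []
    for line in lines:
        m = STATUS_RE.search(line)
        if not m:
            continue  # not a status line
        counters = dict(zip(NAMES, map(int, m.groups())))
        if counters["incomplete"] > 0:
            hits.append((line.strip(), counters))
    return hits


# Two lines copied from the log above (hypothetical input; normally read from a file).
sample = [
    "01:19:17 PM INFO    service.trigger.info    Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0",
    "01:19:01 PM INFO    service.trigger.info    Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 1",
]

for line, counters in find_incomplete(sample):
    print(line)
```

The timestamps of the matching lines can then be compared with the shutdown times in the mongod log.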

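However, nothing interesting happens in mongo at that time. It is suddenly restarted by something external, i.e. Rancher (this log is in natural order):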

2017-01-22T13:06:11.957+0000 I NETWORK  [conn2362] end connection 10.42.191.72:55615 (24 connections now open)
2017-01-22T13:06:14.848+0000 I NETWORK  [initandlisten] connection accepted from 10.42.191.72:55635 #2363 (25 connections now open)
2017-01-22T13:06:14.849+0000 I NETWORK  [conn2363] end connection 10.42.191.72:55635 (24 connections now open)
(nothing unusual until here, look here)->
2017-01-22T13:06:15.243+0000 I CONTROL  [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
2017-01-22T13:06:15.244+0000 I FTDC     [signalProcessingThread] Shutting down full-time diagnostic data capture
2017-01-22T13:06:15.253+0000 I REPL     [signalProcessingThread] Stopping replication applier threads
2017-01-22T13:06:15.556+0000 I STORAGE  [conn105] got request after shutdown()
2017-01-22T13:06:15.871+0000 I STORAGE  [conn91] got request after shutdown()
2017-01-22T13:06:15.874+0000 I STORAGE  [conn86] got request after shutdown()
2017-01-22T13:06:15.887+0000 I STORAGE  [conn82] got request after shutdown()
2017-01-22T13:06:15.941+0000 I STORAGE  [conn83] got request after shutdown()
2017-01-22T13:06:16.009+0000 I STORAGE  [conn85] got request after shutdown()
2017-01-22T13:06:16.020+0000 I STORAGE  [conn84] got request after shutdown()
2017-01-22T13:06:16.108+0000 I STORAGE  [conn75] got request after shutdown()
2017-01-22T13:06:16.133+0000 I STORAGE  [conn87] got request after shutdown()
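Signal 15 is SIGTERM, so mongod is being asked to stop from outside rather than crashing on its own. A rough Python sketch for pulling the shutdown timestamps out of a mongod log, assuming the line format shown above (the name `find_signals` is made up for this example):

```python
import re

# mongod log line as pasted above: timestamp, severity, component, [context], message.
SIGNAL_RE = re.compile(
    r"^(\S+) \w\s+CONTROL\s+\[signalProcessingThread\] got signal (\d+) \((\w+)\)"
)


def find_signals(lines):
    """Return (timestamp, signal_number, signal_name) for each 'got signal' line."""
    found = []
    for line in lines:
        m = SIGNAL_RE.match(line)
        if m:
            found.append((m.group(1), int(m.group(2)), m.group(3)))
    return found


# Lines copied from the mongod log above (normally read from the log file).
mongo_sample = [
    "2017-01-22T13:06:14.849+0000 I NETWORK  [conn2363] end connection 10.42.191.72:55635 (24 connections now open)",
    "2017-01-22T13:06:15.243+0000 I CONTROL  [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends",
]

for ts, num, name in find_signals(mongo_sample):
    print(ts, num, name)
```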

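Any idea what the problem with Rancher might be? I even tried creating a clean MongoDB with no clients, and it is the same story: Rancher restarts it about twice an hour, sometimes more often. Is there any workaround?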

0 answers:

No answers yet