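After upgrading from Rancher 1.1 to 1.3, we ran into a problem running a MongoDB cluster. For no apparent reason, Rancher keeps restarting at least one node of the MongoDB cluster, claiming it is incomplete. Below is a snippet of the Rancher log (it is in reverse order, so start reading from the end):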
01:19:17 PM INFO service.trigger.info Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0
01:19:17 PM INFO service.trigger.info Service already reconciled
01:19:16 PM INFO service.trigger Re-evaluating state
01:19:16 PM INFO service.trigger (1 sec) Re-evaluating state
01:19:16 PM INFO service.trigger.info Service reconciled: Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0
01:19:16 PM INFO service.update.info Service already reconciled
01:19:16 PM INFO service.update Updating service
01:19:16 PM INFO service.update.info Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 0
01:19:16 PM INFO service.trigger.exception Busy processing [SERVICE.280] will try later
01:19:03 PM INFO service.update Updating service
01:19:03 PM INFO service.update.exception Busy processing [SERVICE.280] will try later
01:19:02 PM INFO service.trigger.wait (14 sec) Waiting for instances to start
01:19:02 PM INFO service.instance.create Creating extra service instance
01:19:02 PM INFO service.instance.create Creating extra service instance
01:19:01 PM INFO service.trigger (15 sec) Re-evaluating state
01:19:01 PM INFO service.trigger.info Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 1
The problem always starts with Requested: 3, Created: 3, Unhealthy: 0, Bad: 0, Incomplete: 1
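To spot every occurrence in a longer export of the service log, here is a minimal Python sketch. It assumes the log has been saved as plain text with one entry per line, in the same format as the snippet above. The function name `find_incomplete` is my own and not part of Rancher.

```python
import re

# Matches the counter summary printed in the service log lines shown above.
COUNTERS = re.compile(
    r"Requested: (\d+), Created: (\d+), Unhealthy: (\d+), "
    r"Bad: (\d+), Incomplete: ?(\d+)"
)


def find_incomplete(lines):
    """Return (line, incomplete_count) pairs where Incomplete is non-zero."""
    hits = []
    for line in lines:
        match = COUNTERS.search(line)
        if match is None:
            continue  # not a counter summary line
        incomplete = int(match.group(5))
        if incomplete > 0:
            hits.append((line.strip(), incomplete))
    return hits
```

Feeding it the lines of the exported log (for example `find_incomplete(open(path))`, where `path` is wherever the log was saved) lists only the entries where Rancher considered an instance incomplete.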
However, nothing interesting happens in mongo at the same time. It is suddenly restarted by something external, i.e. Rancher (log in natural order):
2017-01-22T13:06:11.957+0000 I NETWORK [conn2362] end connection 10.42.191.72:55615 (24 connections now open)
2017-01-22T13:06:14.848+0000 I NETWORK [initandlisten] connection accepted from 10.42.191.72:55635 #2363 (25 connections now open)
2017-01-22T13:06:14.849+0000 I NETWORK [conn2363] end connection 10.42.191.72:55635 (24 connections now open)
(nothing unusual until here, look here)->
2017-01-22T13:06:15.243+0000 I CONTROL [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
2017-01-22T13:06:15.244+0000 I FTDC [signalProcessingThread] Shutting down full-time diagnostic data capture
2017-01-22T13:06:15.253+0000 I REPL [signalProcessingThread] Stopping replication applier threads
2017-01-22T13:06:15.556+0000 I STORAGE [conn105] got request after shutdown()
2017-01-22T13:06:15.871+0000 I STORAGE [conn91] got request after shutdown()
2017-01-22T13:06:15.874+0000 I STORAGE [conn86] got request after shutdown()
2017-01-22T13:06:15.887+0000 I STORAGE [conn82] got request after shutdown()
2017-01-22T13:06:15.941+0000 I STORAGE [conn83] got request after shutdown()
2017-01-22T13:06:16.009+0000 I STORAGE [conn85] got request after shutdown()
2017-01-22T13:06:16.020+0000 I STORAGE [conn84] got request after shutdown()
2017-01-22T13:06:16.108+0000 I STORAGE [conn75] got request after shutdown()
2017-01-22T13:06:16.133+0000 I STORAGE [conn87] got request after shutdown()
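To collect the exact moments mongod received the termination signal, so they can be compared with the Rancher service log, here is a small Python sketch. It assumes the mongod log is plain text with the ISO-style timestamp as the first field, as in the lines above. The function name `sigterm_times` is made up for illustration.

```python
from datetime import datetime


def sigterm_times(lines):
    """Return timezone-aware timestamps of mongod lines reporting signal 15."""
    times = []
    for line in lines:
        if "got signal 15" not in line:
            continue
        # The timestamp is the first whitespace-separated field of the line.
        stamp = line.split(" ", 1)[0]
        times.append(datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z"))
    return times
```

Since the Rancher log shows local 12-hour times and mongod logs with an explicit UTC offset, the two need to be converted to the same time zone before comparing them.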
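Any idea what could be wrong with Rancher? I even tried creating a clean MongoDB with no clients, and it is the same story: Rancher restarts it about twice an hour, sometimes more often. Is there any workaround?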