Question

安装AEM-author之后，Mongo副本集似乎运行良好。我安装的AEM版本是6.2

所以我尝试通过以下方法检查自动故障转移功能。 1.停止当前主要的mongod实例 2.通过发出rs.status（）mongo命令检查Secondary是否成为Primary 3.并检查AEM-author

的logs / error.log

Mongo副本集似乎正确地进行了故障转移。但AEM作者因显示以下错误而被打破。

/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error__5.log:01.11.2016 12:36:06.386 *ERROR* [pool-44-thread-1] org.apache.sling.serviceusermapping.impl.ServiceUserMapperImpl cannot unregister ServiceUserMapped Mapping [serviceName=com.adobe.cq.social.cq-social-messaging, subServiceName=utility-reader, userName=communities-utility-reader]
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error__5.log:01.11.2016 12:36:06.386 *ERROR* [pool-44-thread-1] org.apache.sling.serviceusermapping.impl.ServiceUserMapperImpl cannot unregister ServiceUserMapped Mapping [serviceName=com.adobe.cq.social.cq-social-messaging, subServiceName=acl-manager, userName=communities-acl-manager]
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error__5.log:01.11.2016 12:36:06.964 *ERROR* [FelixDispatchQueue] org.apache.felix.http.jetty FrameworkEvent ERROR (org.osgi.framework.BundleException: Activator stop error in bundle org.apache.felix.http.jetty [36].)
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:27:59.516 *ERROR* [DocumentDiscoveryLiteService-BackgroundWorker-[2]] org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteService doRun: got an exception: com.mongodb.MongoTimeoutException: Timed out after 10000 ms while waiting for a server that matches {serverSelectors=[ReadPreferenceServerSelector{readPreference=primary}, LatencyMinimizingServerSelector{acceptableLatencyDifference=15 ms}]}. Client view of cluster state is {type=ReplicaSet, servers=[{address=172.18.8.248:27017, type=ReplicaSetArbiter, averageLatency=1.0 ms, state=Connected}, {address=SERVW0014:27017, type=Unknown, state=Connecting, exception={com.mongodb.MongoException$Network: Exception opening the socket}, caused by {java.net.SocketException: Connection reset}}, {address=SERVW0015:27017, type=ReplicaSetSecondary, averageLatency=1.3 ms, state=Connected}]
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.343 *ERROR* [DocumentNodeStore background read thread (2)] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore. (leaseEndTime: 1477974601170, leaseTime: 120000, leaseFailureMargin: 20000, lease check end time (leaseEndTime-leaseFailureMargin): 1477974581170, now: 1477974585328, remaining: -4158) Need to stop oak-core/DocumentNodeStoreService.
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.343 *ERROR* [LeaseFailureHandler-Thread] org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService handleLeaseFailure: stopping oak-core...
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.422 *ERROR* [LeaseFailureHandler-Thread] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.422 *ERROR* [LeaseFailureHandler-Thread] org.apache.sling.discovery.oak [org.apache.sling.discovery.oak.OakDiscoveryService(256)] The updatedPropertyProvider method has thrown an exception (com.google.common.util.concurrent.ExecutionError: java.lang.AssertionError: This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.)
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.453 *ERROR* [LeaseFailureHandler-Thread] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.453 *ERROR* [LeaseFailureHandler-Thread] com.adobe.cq.social.cq-social-scf-impl [com.adobe.cq.social.scf.impl.SocialComponentFactoryManagerImpl(2527)] The unbindFactories method has thrown an exception (com.google.common.util.concurrent.ExecutionError: java.lang.AssertionError: This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.)
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.500 *ERROR* [LeaseFailureHandler-Thread] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.500 *ERROR* [LeaseFailureHandler-Thread] com.adobe.cq.dtm.impl.DTMJobsInitializer Could not obtain a resource resolver.
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.625 *ERROR* [LeaseFailureHandler-Thread] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.
/home/vagrant/mounts/author1/aem/22_crx-quickstart/logs/error_6.log:01.11.2016 13:29:46.625 *ERROR* [LeaseFailureHandler-Thread] org.apache.sling.discovery.oak [org.apache.sling.discovery.oak.OakDiscoveryService(256)] The updatedPropertyProvider method has thrown an exception (com.google.common.util.concurrent.ExecutionError: java.lang.AssertionError: This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.)

我试图根据adobe论坛解决问题，但我无法解决问题。

http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manager.topic.html/forum__r93i-hi_friends_icam.html

有人可以帮助我，为什么会导致这个问题，让我知道如何解决这个问题？

此致

Answer 1

你的问题是你无法连接到新的主mongodb实例（至少不是在所需的时间内）..我建议为你的问题添加mongodb的标签，因为这个问题与mongodb有关有更多的用户了解mongodb而不是jackrabbit-oak .. 回到问题：您可以从运行jackrabbit oak实例的计算机上ping新的主节点吗？您的副本集需要多长时间才能选出新的主节点？如果长度超过10秒，则需要更改一些mongo db配置设置。你能发表rs.status（）的结果吗？

Answer 2

感谢您的评论和建议。

我自己解决了这个问题。我接近的方式可能是正确的。

这个问题是写在AEM中的MongoDBDriver的WriteConcern参数。我把mongi.uri改为跟随，所以这个问题就解决了。

-Doak.mongo.uri=mongodb://PrimaryHost:27017,SecondoryHost:27017/?replicaSet=rs0&readPreference=nearest
↓
-Doak.mongo.uri=mongodb://PrimaryHost:27017,SecondoryHost:27017/?replicaSet=rs0&readPreference=nearest&w=1&j=1

我忘了关于我的副本集成员的帖子。我们的副本集包括Primary，Secondory和Arbiter。

当我检查oak.jackrabit API时，MongoDiver的默认WriteConcern是＆＃34;多数＆＃34; https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/document/util/MongoConnection.html#getDefaultWriteConcern(com.mongodb.DB)

当副本集（排除Arbiter）的一个成员关闭时，AEM不能确认写操作，因为写操作不能传播给大多数成员。

当我将WriteConcern改为w = 1时，写入操作被确认，AEM仍能正常工作。

你觉得这样吗？你有什么顾虑吗？

AEM 6.2 Mongo副本集自动故障转移不起作用

2 个答案: