Communicating Infinispan remote exceptions generates excessive network traffic

Time: 2019-12-08 15:08:55

Tags: keycloak infinispan

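When an exception occurs in our Infinispan cluster (version 9.4.8.Final), the node that hit the exception sends this information to the other nodes in the cluster. This appears to be by design.

This activity can generate a large amount of traffic, which leads to timeout exceptions, which in turn makes nodes want to communicate their timeout exceptions to the other nodes. In production, our 3-node Infinispan cluster completely saturated a 20 Gb/s link.

For example, in a 2-node QA cluster we observed the following:

Node 1: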

ISPN000476: Timed out waiting for responses for request 7861 from node2

Node 2:

ISPN000217: Received exception from node1, see cause for remote stack trace

Further down in the stack trace printed on node 2, we see:

Timed out waiting for responses for request 7861 from node2

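There are a lot of these. During this period we took a packet capture and could see 50 KB packets containing a list of remote errors along with their entire Java stack traces.

When this happens, it is a "perfect storm". Each timeout generates an error that is sent over the network. That increases congestion and causes more timeouts. From there, things deteriorate very quickly.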
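To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes that "50 KB" means 50,000 bytes, that every packet on the wire is one of these error payloads, and that protocol overhead can be ignored; the class and method names are made up for illustration.

```java
// Back-of-the-envelope estimate. The two inputs are the figures quoted in the
// question; everything else is a simplifying assumption.
final class ErrorStormEstimate {

    /** Packets per second needed to fill a link, ignoring protocol overhead. */
    static long packetsPerSecondToSaturate(long linkBitsPerSecond, long packetBytes) {
        long linkBytesPerSecond = linkBitsPerSecond / 8;
        return linkBytesPerSecond / packetBytes;
    }

    public static void main(String[] args) {
        long linkBitsPerSecond = 20_000_000_000L; // 20 Gb/s production link
        long packetBytes = 50_000L;               // 50 KB error packet (assumed decimal KB)
        System.out.println(packetsPerSecondToSaturate(linkBitsPerSecond, packetBytes)
                + " error packets/s would fill the link");
    }
}
```

Under these assumptions the link fills at about 50,000 such packets per second.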

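I understand that I need to address the timeouts themselves (looking for GC pauses and so on) and perhaps consider increasing the timeout. However, I would like to know whether there is a way to stop this behavior from kicking in when these events occur. When you think about it, it seems odd that node 1 times out talking to node 2 and then sends a copy of the error over the network to node 2 to tell it "I timed out talking to you".

Is there any way to avoid transmitting these remote stack traces? Any insights or advice would be greatly appreciated.

Edit

Example stack trace: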

2019-12-06 11:37:01,587 ERROR [org.keycloak.services.error.KeycloakErrorHandler] (default task-26) Uncaught server error: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from ********, see cause for remote stack trace
        at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:28)
        at org.infinispan.remoting.transport.ValidSingleResponseCollector.withException(ValidSingleResponseCollector.java:37)
        at org.infinispan.remoting.transport.ValidSingleResponseCollector.addResponse(ValidSingleResponseCollector.java:21)
        at org.infinispan.remoting.transport.impl.SingleTargetRequest.receiveResponse(SingleTargetRequest.java:52)
        at org.infinispan.remoting.transport.impl.SingleTargetRequest.onResponse(SingleTargetRequest.java:35)
        at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1372)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1275)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:126)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1420)
        at org.jgroups.JChannel.up(JChannel.java:816)
        at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:133)
        at org.jgroups.stack.Protocol.up(Protocol.java:340)
        at org.jgroups.protocols.FORK.up(FORK.java:141)
        at org.jgroups.protocols.FRAG3.up(FRAG3.java:171)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:872)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
        at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1008)
        at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:734)
        at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:389)
        at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:590)
        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:131)
        at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:203)
        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:253)
        at org.jgroups.protocols.MERGE3.up(MERGE3.java:280)
        at org.jgroups.protocols.Discovery.up(Discovery.java:295)
        at org.jgroups.protocols.TP.passMessageUp(TP.java:1249)
        at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.jboss.as.clustering.jgroups.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:52)
        at java.lang.Thread.run(Thread.java:745)
        Suppressed: org.infinispan.util.logging.TraceException
                at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:41)
                at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
                at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1918)
                at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1433)
                at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:685)
                at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:240)
                at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
                at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
                at org.infinispan.cache.impl.EncoderCache.put(EncoderCache.java:195)
                at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
                at org.keycloak.cluster.infinispan.InfinispanNotificationsManager.notify(InfinispanNotificationsManager.java:155)
                at org.keycloak.cluster.infinispan.InfinispanClusterProvider.notify(InfinispanClusterProvider.java:130)
                at org.keycloak.models.cache.infinispan.CacheManager.sendInvalidationEvents(CacheManager.java:206)
                at org.keycloak.models.cache.infinispan.UserCacheSession.runInvalidations(UserCacheSession.java:140)
                at org.keycloak.models.cache.infinispan.UserCacheSession$1.commit(UserCacheSession.java:152)
                at org.keycloak.services.DefaultKeycloakTransactionManager.commit(DefaultKeycloakTransactionManager.java:146)
                at org.keycloak.services.resources.admin.UsersResource.createUser(UsersResource.java:125)
                at sun.reflect.GeneratedMethodAccessor487.invoke(Unknown Source)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                at java.lang.reflect.Method.invoke(Method.java:498)
                at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:139)
                at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:510)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:400)
                at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$0(ResourceMethodInvoker.java:364)
                at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:366)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:338)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:137)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:100)
                at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:439)
                at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:229)
                at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:135)
                at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
                at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:138)
                at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:215)
                at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:227)
                at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
                at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
                at javax.servlet.http.HttpServlet.service(HttpServlet.java:791)
                at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
                at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:129)
                at org.keycloak.services.filters.KeycloakSessionServletFilter.doFilter(KeycloakSessionServletFilter.java:90)
                at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
                at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
                at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
                at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
                at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
                at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
                at org.wildfly.extension.undertow.security.SecurityContextAssociationHandler.handleRequest(SecurityContextAssociationHandler.java:78)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
                at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
                at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
                at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
                at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
                at io.undertow.security.handlers.NotificationReceiverHandler.handleRequest(NotificationReceiverHandler.java:50)
                at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at org.wildfly.extension.undertow.security.jacc.JACCContextIdHandler.handleRequest(JACCContextIdHandler.java:61)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at org.wildfly.extension.undertow.deployment.GlobalRequestControllerHandler.handleRequest(GlobalRequestControllerHandler.java:68)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
                at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
                at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
                at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
                at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
                at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
                at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
                at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
                at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
                at io.undertow.server.Connectors.executeRootHandler(Connectors.java:364)
                at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
                at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
                at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
                at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
                at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
                ... 1 more
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 7865 from ********
        at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
        at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
        at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 more

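2 answers:

Answer 0: (score: 2)

We were able to resolve the problem that caused the spike in network traffic. Details are below.

tl;dr

We switched from the JGroups UDP stack to the TCP stack, which the ISPN documentation says can be more efficient for small clusters that use distributed caches, as ours does.

Reproducing the problem

To reproduce the problem, we did the following: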

  • Configure the Keycloak background job that purges ISPN cache entries to run every 60 seconds, so that we do not have to wait for the job to run every 15 minutes (standalone-ha.xml):

    <scheduled-task-interval>60</scheduled-task-interval>

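  • Generate a large number of user sessions (we used jmeter). In our tests we ended up with about 100,000 sessions.

  • Configure the SSO idle and max TTLs in Keycloak to be very short (60 seconds) so that all sessions expire
  • Keep applying load to the system with jmeter (this part is important)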

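When the cache-purging job runs, we see a flood of network traffic (120 MB/s). When this happens, we see large numbers of the following errors on every node in the cluster:

ISPN000476: Timed out waiting for responses for request 7861 from node2

ISPN000217: Received exception from node1, see cause for remote stack trace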

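Pro tip: configure passivation to a file store to persist your ISPN data. Shut down the cluster, then save the ".dat" files somewhere else. You can use these files to restore the state of the ISPN cluster instantly between tests.

Solving the problem

Using the technique above, we were able to reproduce the problem on demand. We then set out to solve it using the approaches described below.

Change the JGroups stack to use TCP

We changed the JGroups stack from UDP to TCP and also configured TCPPING for discovery. We did this after reading the description of the TCP stack in the following guide: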

https://infinispan.org/docs/stable/titles/configuring/configuring.html#preconfigured_jgroups_stacks-configuring

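In particular:

"Uses TCP for transport and UDP multicast for discovery. Suitable for smaller clusters (under 100 nodes) only if you are using distributed caches because TCP is more efficient than UDP as a point-to-point protocol."

This change completely eliminated our problem.

The Wildfly 16 configuration in standalone-ha.xml is as follows: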

    <subsystem xmlns="urn:jboss:domain:jgroups:6.0">
        <channels default="ee">
            <channel name="ee" stack="tcp" cluster="ejb"/>
        </channels>
        <stacks>
            <stack name="udp">
                <transport type="UDP" socket-binding="jgroups-udp"/>
                <protocol type="PING"/>
                <protocol type="MERGE3"/>
                <protocol type="FD_SOCK"/>
                <protocol type="FD_ALL"/>
                <protocol type="VERIFY_SUSPECT"/>
                <protocol type="pbcast.NAKACK2"/>
                <protocol type="UNICAST3"/>
                <protocol type="pbcast.STABLE"/>
                <protocol type="pbcast.GMS"/>
                <protocol type="UFC"/>
                <protocol type="MFC"/>
                <protocol type="FRAG3"/>
            </stack>
            <stack name="tcp">
                <transport type="TCP" socket-binding="jgroups-tcp"/>
                <socket-protocol type="TCPPING" socket-binding="jgroups-tcp">
                  <property name="initial_hosts">HOST-X[7600],HOST-Y[7600],HOST-Z[7600]</property>
                  <property name="port_range">1</property>
                </socket-protocol>
                <protocol type="MERGE3"/>
                <protocol type="FD_SOCK"/>
                <protocol type="FD_ALL"/>
                <protocol type="VERIFY_SUSPECT"/>
                <protocol type="pbcast.NAKACK2"/>
                <protocol type="UNICAST3"/>
                <protocol type="pbcast.STABLE"/>
                <protocol type="pbcast.GMS"/>
                <protocol type="MFC"/>
                <protocol type="FRAG3"/>
            </stack>
        </stacks>
    </subsystem>
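Since TCPPING relies on a static host list, it can help to sanity-check the initial_hosts value before restarting the cluster. The following is only a sketch: it parses the host[port] format shown above and attempts a plain TCP connection to each entry. The class name, method names, and the 2-second timeout are assumptions for illustration, not part of JGroups or Wildfly.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JGroups: checks an initial_hosts string.
final class InitialHostsCheck {

    // Matches one "host[port]" entry, the format used by initial_hosts above.
    private static final Pattern ENTRY =
            Pattern.compile("\\s*([^\\[\\],\\s]+)\\[(\\d+)\\]\\s*");

    /** Splits a comma-separated initial_hosts value into unresolved addresses. */
    static List<InetSocketAddress> parse(String initialHosts) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : initialHosts.split(",")) {
            Matcher m = ENTRY.matcher(entry);
            if (!m.matches()) {
                throw new IllegalArgumentException("Not in host[port] form: " + entry);
            }
            result.add(InetSocketAddress.createUnresolved(
                    m.group(1), Integer.parseInt(m.group(2))));
        }
        return result;
    }

    /** Returns true if a TCP connection to the address succeeds within the timeout. */
    static boolean reachable(InetSocketAddress address, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Resolve the host name here; unknown hosts surface as an IOException.
            socket.connect(new InetSocketAddress(
                    address.getHostString(), address.getPort()), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String hosts = args.length > 0
                ? args[0]
                : "HOST-X[7600],HOST-Y[7600],HOST-Z[7600]";
        for (InetSocketAddress a : parse(hosts)) {
            System.out.println(a.getHostString() + ":" + a.getPort()
                    + " reachable=" + reachable(a, 2000));
        }
    }
}
```

Run it from each node with the same string you put in standalone-ha.xml. An entry reported as unreachable usually points to a firewall rule, a wrong socket binding, or a node that is not running yet.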

Tune the JVM garbage collector parameters

We followed some of the recommendations in the ISPN tuning guide:

https://infinispan.org/docs/stable/titles/tuning/tuning.html

In particular, we changed from the JDK 8 default GC to the CMS collector. Specifically, our Wildfly servers are now started with the following JVM arguments:

-Xms6144m -Xmx6144m -Xmn1536M -XX:MetaspaceSize=192M -XX:MaxMetaspaceSize=512m -Djava.net.preferIPv4Stack=true -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+DisableExplicitGC
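To confirm which collectors a given set of flags actually selects, and to get a rough idea of how much time they spend collecting, you can query the standard java.lang.management API. This is a minimal sketch; the class name is made up, and on a JDK 8 HotSpot JVM started with the flags above the reported names are typically ParNew and ConcurrentMarkSweep.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the garbage collectors active in this JVM.
final class GcReport {

    /** Names of the collectors the running JVM exposes via JMX. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both counters are cumulative since JVM start.
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", totalMillis=" + bean.getCollectionTime());
        }
    }
}
```

Start it with the same -XX options as the server. The same MXBeans are exposed over JMX by the running Wildfly process, so the counters can also be watched there while the reproduction test is running.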

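Other changes

We also made other changes to our environment:

  • Block IP multicast and UDP at the iptables level (because we wanted to make sure we were using only TCP)
  • Configure a bandwidth cap at the network level to prevent the ISPN cluster from saturating the network and affecting other hosts that use the same link.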

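Answer 1: (score: 1)

No, it is not possible to disable the serialization of stack traces in Infinispan 9.4.x.

Infinispan 10.0.0.Final does not include the stack trace in exception responses, but that is only a side effect of other work, and I have opened ISPN-11022 to add the remote stack traces back.

Please add a comment on the issue with an example of a full stack trace.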