Question

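We have a 2-node clustered JBoss environment that works fine in production. Intermittently, the two caches stop communicating with each other (we see no Infinispan-related log entries, but the application logs lead us to conclude that the caches have stopped communicating). To work around this, we temporarily shut down one node. When we bring that node back up a few hours later, everything starts working again. Sometimes it doesn't, so we keep the node down for a while longer, and it then works the next time we start it. The behavior is very random.
To us this looks like an intermittent network failure, so we need to involve the network team. But I don't know what to tell them.
My question is:
Which configurations need to be checked, and how do we check them, to verify that the caches are able to communicate with each other?

My cache-related settings in standalone.xml are: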

```xml
<property name="ehcache.multicast.address" value="x.x.x.21"/>

<subsystem xmlns="urn:jboss:domain:jgroups:1.1" default-stack="udp">
    <stack name="udp">
        <transport type="UDP" socket-binding="jgroups-udp"/>
        <protocol type="PING"/>
        <protocol type="MERGE3"/>
        <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
        <protocol type="FD"/>
        <protocol type="VERIFY_SUSPECT"/>
        <protocol type="pbcast.NAKACK"/>
        <protocol type="UNICAST2"/>
        <protocol type="pbcast.STABLE"/>
        <protocol type="pbcast.GMS"/>
        <protocol type="UFC"/>
        <protocol type="MFC"/>
        <protocol type="FRAG2"/>
        <protocol type="RSVP"/>
    </stack>
    <stack name="tcp">
        <transport type="TCP" socket-binding="jgroups-tcp"/>
        <protocol type="MPING" socket-binding="jgroups-mping"/>
        <protocol type="MERGE2"/>
        <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
        <protocol type="FD"/>
        <protocol type="VERIFY_SUSPECT"/>
        <protocol type="pbcast.NAKACK"/>
        <protocol type="UNICAST2"/>
        <protocol type="pbcast.STABLE"/>
        <protocol type="pbcast.GMS"/>
        <protocol type="UFC"/>
        <protocol type="MFC"/>
        <protocol type="FRAG2"/>
        <protocol type="RSVP"/>
    </stack>
</subsystem>

<socket-binding name="jgroups-mping" port="0" multicast-address="x.x.x.23" multicast-port="45700"/>
<socket-binding name="jgroups-tcp" port="7600"/>
<socket-binding name="jgroups-tcp-fd" port="57600"/>
<socket-binding name="jgroups-udp" port="55200" multicast-address="x.x.x.24" multicast-port="45688"/>
<socket-binding name="jgroups-udp-fd" port="54200"/>
<socket-binding name="messaging" port="5445"/>
<socket-binding name="messaging-group" port="0" multicast-address="x.x.x.22" multicast-port="${jboss.messaging.group.port:9876}"/>
<socket-binding name="messaging-throughput" port="5455"/>
<socket-binding name="modcluster" port="0" multicast-address="y.y.y.105" multicast-port="23364"/>
```

Please let me know if more information is needed to clarify the question. Thanks.

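Update, May 8, 2020: Enabled DEBUG logging on org.infinispan and org.jgroups. Found this line in the logs: `07:17:13,928 FINE [STABLE] (OOB-20,shared=udp) my-host-52/ejb: received digest from my-host-51/ejb (digest=my-host-51/ejb: [4 (4)]) which does not match my own digest (my-host-52/ejb: [0 (0)]): ignoring digest and re-initializing own digest`. Is this relevant to the issue? I can see similar log lines for all the caches: hibernate / ejb / singleton.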

Answer 1

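Without logs, it is hard to understand what is happening. Admittedly, lowering org.jgroups to DEBUG may generate a lot of logging, but it will provide some basic information.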
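
As a rough sketch of what that could look like, assuming the stock logging subsystem of an EAP 6 standalone.xml: the two logger elements below would go inside the existing logging subsystem element. The categories are the ones mentioned in this thread. Note that a handler with its own level (the default console handler is typically set to INFO) will still filter out DEBUG messages, so the handler levels are worth checking as well.

```xml
<!-- Sketch only: place inside the existing logging subsystem in standalone.xml -->
<logger category="org.jgroups">
    <level name="DEBUG"/>
</logger>
<logger category="org.infinispan">
    <level name="DEBUG"/>
</logger>
```

Because of the log volume, it is probably best to enable this on both nodes only long enough to capture one occurrence of the problem, then return the categories to their previous level.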

JBoss EAP 6.4 Infinispan clustered cache network issue
