Unable to detect a network failure on the JNDI client

Date: 2019-01-23 07:04:33

Tags: java ejb wildfly jndi wildfly-10

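I am running two wildfly-10.1.0.Final instances on different servers (both in the same cluster), plus a client Java process that performs remote JNDI lookups. When the network of one server goes down, failover does not work properly. I need a way to detect the failure on the JNDI client side.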

Configuration of Server 1 and Server 2:

<subsystem xmlns="urn:jboss:domain:jgroups:4.0">
<channels default="ee">
    <channel name="ee" stack="udp"/>
</channels>
<stacks>
    <stack name="udp">
        <transport type="UDP" socket-binding="jgroups-udp"/>
        <protocol type="PING"/>
        <protocol type="MERGE3"/>
        <protocol type="FD_ALL">
            <property name="timeout">
                6000
            </property>
            <property name="interval">
                2000
            </property>
            <property name="timeout_check_interval">
                1000
            </property>
        </protocol>
        <protocol type="VERIFY_SUSPECT"/>
        <protocol type="pbcast.NAKACK2"/>
        <protocol type="UNICAST3"/>
        <protocol type="pbcast.STABLE"/>
        <protocol type="pbcast.GMS"/>
        <protocol type="UFC"/>
        <protocol type="MFC"/>
        <protocol type="FRAG2"/>
    </stack>
    <stack name="tcp">
        .......
        .......
    </stack>
</stacks>
</subsystem>
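
For reference, the FD_ALL values above give a rough window for how long the servers take to suspect a dead member. The sketch below is plain arithmetic, not JGroups code. It assumes the usual reading of FD_ALL (a member is suspected once nothing has been heard from it for `timeout` ms, with the check running every `timeout_check_interval` ms) and ignores the extra delay added by VERIFY_SUSPECT and by installing the new view. The class and method names are made up for illustration.

```java
/** Rough bounds on the FD_ALL suspicion delay; an illustrative sketch, not JGroups code. */
final class FdAllDelay {

    /** Earliest suspicion after a failure: the last heartbeat may already be up to `interval` ms old. */
    static long minDelayMs(long timeout, long interval) {
        return Math.max(0, timeout - interval);
    }

    /** Latest suspicion: the timeout expires just after a check, so one more check period passes. */
    static long maxDelayMs(long timeout, long checkInterval) {
        return timeout + checkInterval;
    }

    public static void main(String[] args) {
        // Values from the stack above: timeout=6000, interval=2000, timeout_check_interval=1000
        System.out.println(minDelayMs(6000, 2000) + " ms to " + maxDelayMs(6000, 1000) + " ms");
        // prints "4000 ms to 7000 ms"
    }
}
```

This only describes server-to-server detection. It says nothing about when, or whether, the EJB client learns about the change.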

Sample program:

import com.test.*;
import javax.naming.InitialContext;
import java.util.*;
import java.util.concurrent.atomic.*;
import javax.naming.*;
import org.jboss.ejb.client.remoting.*;
import org.jboss.ejb.client.*;
import org.apache.log4j.*;

import org.jboss.ejb.client.EJBClientContext;
import org.jboss.ejb.client.EJBClientInterceptor;
import org.jboss.ejb.client.EJBClientInvocationContext;

import org.jboss.ejb.client.DeploymentNodeSelector;
import org.jboss.ejb.client.ClusterNodeSelector;

public class EJBClientTest  implements DeploymentNodeSelector, ClusterNodeSelector
{

    final static Logger logger = Logger.getLogger(EJBClientTest.class.getName());

    private int serverIndex = 0;
    private AtomicInteger clusterNode;

    public EJBClientTest()
    {
        clusterNode = new AtomicInteger(0);
    }

    @Override
    public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes)
    {
        String selectedNode = null;
        logger.info("[ClusterNodeSelector] " + clusterName + " Connected Nodes=" + Arrays.asList(connectedNodes).toString());
        logger.info("[ClusterNodeSelector] Available Nodes=" + Arrays.asList(availableNodes).toString());
        if (availableNodes.length < 2)
        {
            selectedNode = availableNodes[0];
        }
        else
        {
            selectedNode = availableNodes[clusterNode.getAndIncrement() % availableNodes.length];
        }
        logger.info("[ClusterNodeSelector] Selected Node: " + selectedNode);
        return selectedNode;
    }

    @Override
    public String selectNode(String[] eligibleNodes, String appName, String moduleName, String distinctName)
    {
        logger.info("[DeploymentNodeSelector] EligibleNodes nodes = " + Arrays.toString(eligibleNodes));
        String selectedNode = eligibleNodes[serverIndex++ % eligibleNodes.length];
        logger.info("[DeploymentNodeSelector] Selected node: " + selectedNode);
        return selectedNode;
    }
    public static void main(String[] args)
    {
        long stime = 0;
        String username = "user", password = "pass";

        try
        {
            final Properties appProp = new Properties();
            appProp.put("endpoint.name", "client-endpoint");
            appProp.put("invocation.timeout", "2000");
            appProp.put("remote.connections", "node1,node2");

            appProp.put("remote.connection.node1.host", "131.10.30.121");
            appProp.put("remote.connection.node1.port", "8080");
            appProp.put("remote.connection.node1.username", username);
            appProp.put("remote.connection.node1.password", password);

            appProp.put("remote.connection.node2.host", "131.10.30.122");
            appProp.put("remote.connection.node2.port", "8080");
            appProp.put("remote.connection.node2.username", username);
            appProp.put("remote.connection.node2.password", password);

            appProp.put("remote.clusters", "ejb");
            appProp.put("remote.cluster.ejb.username", username);
            appProp.put("remote.cluster.ejb.password", password);
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SASL_POLICY_NOANONYMOUS", "false");
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SASL_POLICY_NOPLAINTEXT", "false");
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SSL_ENABLED", "false");
            appProp.put("remote.connectionprovider.create.options.org.xnio.Options.SSL_ENABLED", "false");
            appProp.put("remote.cluster.ejb.clusternode.selector", EJBClientTest.class.getName());
            appProp.put("deployment.node.selector", EJBClientTest.class.getName());
            appProp.put("remote.cluster.ejb.channel.options.org.jboss.remoting3.RemotingOptions.HEARTBEAT_INTERVAL", "2000");
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SASL_DISALLOWED_MECHANISMS", "JBOSS-LOCAL-USER");
            appProp.put("org.jboss.ejb.client.scoped.context", true);

            logger.info("Properties="+appProp.toString());

            PropertiesBasedEJBClientConfiguration configuration = new PropertiesBasedEJBClientConfiguration(appProp);
            ContextSelector<EJBClientContext> ejbClientContextSelector = new ConfigBasedEJBClientContextSelector(configuration);
            EJBClientContext.setSelector(ejbClientContextSelector);

            Properties jndiProperties = new Properties();
            jndiProperties.put(Context.URL_PKG_PREFIXES, "org.jboss.ejb.client.naming");
            final InitialContext ic = new InitialContext(jndiProperties);
            while(true)
            {
                new Thread(new Runnable()
                {
                    public void run()
                    {
                        try
                        {
                            long stime = System.currentTimeMillis();
                            TestSF o = (TestSF)ic.lookup("ejb:test/test-ejb/TestSFBean!com.test.TestSFRemote");
                            Vector gws1 = o.retrieveAll("key", "131.10.30.126");
                            logger.info("Result size = " + gws1.size() + " in " + (System.currentTimeMillis() - stime));
                        }
                        catch(Exception e1)
                        {
                            logger.error("Request Failed:===" + e1.getMessage());
                        }
                    }
                }).start();

                Thread.sleep(1000);
            }

        }
        catch(Exception e)
        {
            e.printStackTrace();
            logger.info("Exception:===" + e.getMessage());
        }
    }

}

WildFly log (Node1) showing the cluster being formed:

11:46:28,132 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel server: [Node1|1] (2) [Node1, Node2]
11:46:28,133 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel web: [Node1|1] (2) [Node1, Node2]
11:46:28,135 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel hibernate: [Node1|1] (2) [Node1, Node2]
11:46:28,138 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel ejb: [Node1|1] (2) [Node1, Node2]
11:46:32,037 INFO  [org.infinispan.CLUSTER] (remote-thread--p5-t1) ISPN000310: Starting cluster-wide rebalance for cache client-mappings, topology CacheTopology{id=2, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Node1: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Node1: 132, Node2: 124]}, unionCH=null, actualMembers=[Node1, Node2]}
11:46:32,047 INFO  [org.infinispan.CLUSTER] (remote-thread--p5-t1) [Context=client-mappings][Scope=Node1]ISPN100002: Started local rebalance
11:46:32,066 INFO  [org.infinispan.CLUSTER] (transport-thread--p14-t7) [Context=client-mappings][Scope=Node1]ISPN100003: Finished local rebalance
11:46:32,166 INFO  [org.infinispan.CLUSTER] (remote-thread--p5-t1) [Context=client-mappings][Scope=Node2]ISPN100003: Finished local rebalance
11:46:32,167 INFO  [org.infinispan.CLUSTER] (remote-thread--p5-t1) ISPN000336: Finished cluster-wide rebalance for cache client-mappings, topology id = 2

Java client JNDI lookup log:

2019-01-23 11:47:47,159 INFO  [EJBClientTest] (Thread-15) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:47:47,159 INFO  [EJBClientTest] (Thread-15) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:47:47,159 DEBUG [EJBClientContext] (Thread-15) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:47:47,167 INFO  [EJBClientTest] (Thread-15) Result size = 3 in 9
2019-01-23 11:47:48,159 INFO  [EJBClientTest] (Thread-16) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:47:48,159 INFO  [EJBClientTest] (Thread-16) [DeploymentNodeSelector] Selected node: Node1
2019-01-23 11:47:48,159 DEBUG [EJBClientContext] (Thread-16) EJBClientTest@4699dd4c deployment node selector selected Node1 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:47:48,172 INFO  [EJBClientTest] (Thread-16) Result size = 3 in 13
2019-01-23 11:47:49,160 INFO  [EJBClientTest] (Thread-17) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:47:49,160 INFO  [EJBClientTest] (Thread-17) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:47:49,160 DEBUG [EJBClientContext] (Thread-17) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:47:48,172 INFO  [EJBClientTest] (Thread-16) Result size = 3 in 13

After some time, the Node2 server dropped off the network.

WildFly log (Node1) showing the network failure being detected:

11:47:59,233 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel server: [Node1|2] (1) [Node1]
11:47:59,234 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel web: [Node1|2] (1) [Node1]
11:47:59,235 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel hibernate: [Node1|2] (1) [Node1]
11:47:59,236 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel ejb: [Node1|2] (1) [Node1]
11:47:59,257 WARN  [org.infinispan.CLUSTER] (transport-thread--p14-t12) [Context=client-mappings]ISPN000314: Lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [Node1], lost members are [Node2], stable members are [Node1, Node2]

Even after the failure has been detected on the server side, Node2 is still not removed from the eligible nodes on the client:

2019-01-23 11:48:59,193 INFO  [EJBClientTest] (Thread-87) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:48:59,194 INFO  [EJBClientTest] (Thread-87) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:48:59,194 DEBUG [EJBClientContext] (Thread-87) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:49:00,194 INFO  [EJBClientTest] (Thread-88) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:49:00,194 INFO  [EJBClientTest] (Thread-88) [DeploymentNodeSelector] Selected node: Node1
2019-01-23 11:49:00,194 DEBUG [EJBClientContext] (Thread-88) EJBClientTest@4699dd4c deployment node selector selected Node1 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:49:00,201 INFO  [EJBClientTest] (Thread-88) Result size = 3 in 8
2019-01-23 11:49:01,194 INFO  [EJBClientTest] (Thread-89) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:49:01,194 INFO  [EJBClientTest] (Thread-89) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:49:01,194 DEBUG [EJBClientContext] (Thread-89) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=

2019-01-23 11:49:01,195 ERROR [EJBClientTest] (Thread-87) Request Failed:===java.util.concurrent.TimeoutException: No invocation response received in 2000 milliseconds
2019-01-23 11:49:03,196 ERROR [EJBClientTest] (Thread-89) Request Failed:===java.util.concurrent.TimeoutException: No invocation response received in 2000 milliseconds
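
In the log above the client only notices the dead node through the 2000 ms invocation timeout. One possible way to notice it earlier, independently of the EJB client library, is a plain TCP connect probe against the remoting port (8080 in the properties above). The following is a minimal sketch using only `java.net`; the class name `TcpProbe` is hypothetical, and a successful connect only shows that the port accepts connections, not that the EJB deployment is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Minimal TCP reachability check; an illustrative sketch, not part of the EJB client API. */
final class TcpProbe {

    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, connect timeout, unknown host, no route, ...
            return false;
        }
    }

    public static void main(String[] args) {
        // Addresses taken from the client properties in the sample program
        System.out.println("node1 reachable: " + isReachable("131.10.30.121", 8080, 500));
        System.out.println("node2 reachable: " + isReachable("131.10.30.122", 8080, 500));
    }
}
```

When the network is unplugged, the connect typically fails by timing out rather than being refused, so the probe timeout bounds how quickly the failure is seen.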

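I need help with the following open questions:

  1. How can the network failure be detected on the client?
  2. Even though the cluster is detected on the client, why is the cluster node selector never invoked? (The deployment node selector is invoked every time, even if I remove the deployment.node.selector property.)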
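
As an illustration of what client-side handling for question 1 could look like, here is a sketch of a round-robin selection that temporarily skips nodes marked as failed. It is written as a standalone class with no dependency on the EJB client library; `QuarantiningRoundRobin`, `markFailed` and the quarantine period are hypothetical names and choices, not an existing API. How the name of the failing node reaches `markFailed` (for example from the catch block that logs the `TimeoutException`) is left open, since the sample program does not track which node served a given call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Round-robin selection that skips recently failed nodes; an illustrative sketch only. */
final class QuarantiningRoundRobin {
    private final Map<String, Long> quarantinedUntil = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    private final long quarantineMs;

    QuarantiningRoundRobin(long quarantineMs) {
        this.quarantineMs = quarantineMs;
    }

    /** Records that a call to `node` failed at time nowMs; the node is skipped until the quarantine ends. */
    void markFailed(String node, long nowMs) {
        quarantinedUntil.put(node, nowMs + quarantineMs);
    }

    /** Picks the next node in round-robin order among those not currently quarantined. */
    String select(String[] eligibleNodes, long nowMs) {
        if (eligibleNodes.length == 0) {
            throw new IllegalArgumentException("no eligible nodes");
        }
        List<String> healthy = new ArrayList<>();
        for (String node : eligibleNodes) {
            Long until = quarantinedUntil.get(node);
            if (until == null || until <= nowMs) {
                healthy.add(node);
            }
        }
        if (healthy.isEmpty()) {
            // Every node is quarantined: fall back to the full list rather than failing outright
            healthy = Arrays.asList(eligibleNodes);
        }
        // floorMod keeps the index non-negative even after the counter overflows
        return healthy.get(Math.floorMod(counter.getAndIncrement(), healthy.size()));
    }
}
```

A selector such as the `selectNode(String[] eligibleNodes, ...)` method in the sample program could delegate to `select(eligibleNodes, System.currentTimeMillis())`. Because the EJB client instantiates the selector from its class name, the shared instance would probably have to be held in a static field, which is an assumption about the setup rather than something shown in the question.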

0 answers:

No answers