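I am running two wildfly-10.1.0.Final instances on different servers (both in the same cluster). There is also a client Java process that performs remote JNDI lookups. When the network of one server fails, failover does not work as expected: I need a way for the JNDI client to detect that the node is gone.
Configuration of server 1 and server 2: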
<subsystem xmlns="urn:jboss:domain:jgroups:4.0">
    <channels default="ee">
        <channel name="ee" stack="udp"/>
    </channels>
    <stacks>
        <stack name="udp">
            <transport type="UDP" socket-binding="jgroups-udp"/>
            <protocol type="PING"/>
            <protocol type="MERGE3"/>
            <protocol type="FD_ALL">
                <property name="timeout">6000</property>
                <property name="interval">2000</property>
                <property name="timeout_check_interval">1000</property>
            </protocol>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="UFC"/>
            <protocol type="MFC"/>
            <protocol type="FRAG2"/>
        </stack>
        <stack name="tcp">
            .......
            .......
        </stack>
    </stacks>
</subsystem>
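As a rough guide to what the FD_ALL settings above mean for server-side detection, the sketch below estimates how long a silent node can go unnoticed before it is suspected. It assumes the usual FD_ALL semantics (a member is suspected once no heartbeat has arrived for `timeout`, and timeouts are checked every `timeout_check_interval`); the class and method names are made up for illustration, and VERIFY_SUSPECT and view installation add further delay that is not modelled here. This only covers detection between the servers; the client is not a member of the JGroups channel.

```java
/** Back-of-the-envelope FD_ALL timing; hypothetical helper, not a JGroups API. */
final class FdAllEstimate {

    /**
     * Upper bound on the time until a dead member is suspected: the full
     * timeout has to elapse, and the periodic check may run up to one
     * check interval after that.
     */
    static long worstCaseSuspectMillis(long timeoutMs, long checkIntervalMs) {
        return timeoutMs + checkIntervalMs;
    }

    /** Number of heartbeat periods that fit into the timeout window. */
    static long missedHeartbeats(long timeoutMs, long intervalMs) {
        return timeoutMs / intervalMs;
    }

    public static void main(String[] args) {
        // Values taken from the configuration above.
        System.out.println("suspect after at most ~"
                + worstCaseSuspectMillis(6000, 1000) + " ms");
        System.out.println("heartbeats missed before suspicion: "
                + missedHeartbeats(6000, 2000));
    }
}
```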
Sample program:
import com.test.*;
import javax.naming.InitialContext;
import java.util.*;
import java.util.concurrent.atomic.*;
import javax.naming.*;
import org.jboss.ejb.client.remoting.*;
import org.jboss.ejb.client.*;
import org.apache.log4j.*;
import org.jboss.ejb.client.EJBClientContext;
import org.jboss.ejb.client.EJBClientInterceptor;
import org.jboss.ejb.client.EJBClientInvocationContext;
import org.jboss.ejb.client.DeploymentNodeSelector;
import org.jboss.ejb.client.ClusterNodeSelector;
public class EJBClientTest implements DeploymentNodeSelector, ClusterNodeSelector
{
    final static Logger logger = Logger.getLogger(EJBClientTest.class.getName());

    // Round-robin counters. The selectors are called from many threads,
    // so plain int counters would race; AtomicInteger keeps them consistent.
    private final AtomicInteger serverIndex = new AtomicInteger(0);
    private final AtomicInteger clusterNode = new AtomicInteger(0);

    public EJBClientTest()
    {
    }

    // ClusterNodeSelector: pick a node from the cluster topology.
    @Override
    public String selectNode(String clusterName, String[] connectedNodes, String[] availableNodes)
    {
        String selectedNode = null;
        logger.info("[ClusterNodeSelector] " + clusterName + " Connected Nodes=" + Arrays.asList(connectedNodes).toString());
        logger.info("[ClusterNodeSelector] Available Nodes=" + Arrays.asList(availableNodes).toString());
        if (availableNodes.length < 2)
        {
            selectedNode = availableNodes[0];
        }
        else
        {
            // floorMod keeps the index non-negative even after the counter overflows.
            selectedNode = availableNodes[Math.floorMod(clusterNode.getAndIncrement(), availableNodes.length)];
        }
        logger.info("[ClusterNodeSelector] Selected Node: " + selectedNode);
        return selectedNode;
    }

    // DeploymentNodeSelector: pick a node among those that host the deployment.
    @Override
    public String selectNode(String[] eligibleNodes, String appName, String moduleName, String distinctName)
    {
        logger.info("[DeploymentNodeSelector] EligibleNodes nodes = " + Arrays.toString(eligibleNodes));
        String selectedNode = eligibleNodes[Math.floorMod(serverIndex.getAndIncrement(), eligibleNodes.length)];
        logger.info("[DeploymentNodeSelector] Selected node: " + selectedNode);
        return selectedNode;
    }
    public static void main(String[] args)
    {
        String username = "user", password = "pass";
        try
        {
            // EJB client configuration: two initial connections plus cluster settings.
            final Properties appProp = new Properties();
            appProp.put("endpoint.name", "client-endpoint");
            appProp.put("invocation.timeout", "2000");
            appProp.put("remote.connections", "node1,node2");
            appProp.put("remote.connection.node1.host", "131.10.30.121");
            appProp.put("remote.connection.node1.port", "8080");
            appProp.put("remote.connection.node1.username", username);
            appProp.put("remote.connection.node1.password", password);
            appProp.put("remote.connection.node2.host", "131.10.30.122");
            appProp.put("remote.connection.node2.port", "8080");
            appProp.put("remote.connection.node2.username", username);
            appProp.put("remote.connection.node2.password", password);
            appProp.put("remote.clusters", "ejb");
            appProp.put("remote.cluster.ejb.username", username);
            appProp.put("remote.cluster.ejb.password", password);
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SASL_POLICY_NOANONYMOUS", "false");
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SASL_POLICY_NOPLAINTEXT", "false");
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SSL_ENABLED", "false");
            appProp.put("remote.connectionprovider.create.options.org.xnio.Options.SSL_ENABLED", "false");
            appProp.put("remote.cluster.ejb.clusternode.selector", EJBClientTest.class.getName());
            appProp.put("deployment.node.selector", EJBClientTest.class.getName());
            appProp.put("remote.cluster.ejb.channel.options.org.jboss.remoting3.RemotingOptions.HEARTBEAT_INTERVAL", "2000");
            appProp.put("remote.cluster.ejb.connect.options.org.xnio.Options.SASL_DISALLOWED_MECHANISMS", "JBOSS-LOCAL-USER");
            // Use a String value: Properties.getProperty() returns null for non-String values.
            appProp.put("org.jboss.ejb.client.scoped.context", "true");
            logger.info("Properties=" + appProp.toString());

            // Install the configuration as the global EJB client context.
            PropertiesBasedEJBClientConfiguration configuration = new PropertiesBasedEJBClientConfiguration(appProp);
            ContextSelector<EJBClientContext> ejbClientContextSelector = new ConfigBasedEJBClientContextSelector(configuration);
            EJBClientContext.setSelector(ejbClientContextSelector);

            Properties jndiProperties = new Properties();
            jndiProperties.put(Context.URL_PKG_PREFIXES, "org.jboss.ejb.client.naming");
            final InitialContext ic = new InitialContext(jndiProperties);

            // Start one lookup-and-invoke thread per second, forever.
            while (true)
            {
                new Thread(new Runnable()
                {
                    public void run()
                    {
                        try
                        {
                            long stime = System.currentTimeMillis();
                            TestSF o = (TestSF) ic.lookup("ejb:test/test-ejb/TestSFBean!com.test.TestSFRemote");
                            Vector gws1 = o.retrieveAll("key", "131.10.30.126");
                            logger.info("Result size = " + gws1.size() + " in " + (System.currentTimeMillis() - stime));
                        }
                        catch (Exception e1)
                        {
                            logger.error("Request Failed:===" + e1.getMessage());
                        }
                    }
                }).start();
                Thread.sleep(1000);
            }
        }
        catch (Exception e)
        {
            e.printStackTrace();
            logger.info("Exception:===" + e.getMessage());
        }
    }
}
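Each worker thread in the sample program makes a single attempt and only logs the failure. One possible mitigation, sketched below under assumptions, is to retry a failed call a bounded number of times so that the round-robin selector gets a chance to pick the other node. `Retry` and `withRetries` are hypothetical names, not part of the JBoss EJB client; the sketch retries on any exception, which is only reasonable for idempotent calls such as a read.

```java
import java.util.concurrent.Callable;

/** Hypothetical bounded-retry helper; plain Java, no JBoss classes involved. */
final class Retry {

    /**
     * Runs the call up to maxAttempts times and returns the first successful
     * result. If every attempt fails, the last exception is rethrown.
     */
    static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

In the sample program, the lookup and the `retrieveAll` call would both go inside the `Callable`, so that each attempt goes through node selection again. With a 2000 ms invocation timeout, every failed attempt still costs up to two seconds.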
WildFly log (Node1) showing cluster formation:
11:46:28,132 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel server: [Node1|1] (2) [Node1, Node2]
11:46:28,133 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel web: [Node1|1] (2) [Node1, Node2]
11:46:28,135 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel hibernate: [Node1|1] (2) [Node1, Node2]
11:46:28,138 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-16,ee,Node1) ISPN000094: Received new cluster view for channel ejb: [Node1|1] (2) [Node1, Node2]
11:46:32,037 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t1) ISPN000310: Starting cluster-wide rebalance for cache client-mappings, topology CacheTopology{id=2, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Node1: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Node1: 132, Node2: 124]}, unionCH=null, actualMembers=[Node1, Node2]}
11:46:32,047 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t1) [Context=client-mappings][Scope=Node1]ISPN100002: Started local rebalance
11:46:32,066 INFO [org.infinispan.CLUSTER] (transport-thread--p14-t7) [Context=client-mappings][Scope=Node1]ISPN100003: Finished local rebalance
11:46:32,166 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t1) [Context=client-mappings][Scope=Node2]ISPN100003: Finished local rebalance
11:46:32,167 INFO [org.infinispan.CLUSTER] (remote-thread--p5-t1) ISPN000336: Finished cluster-wide rebalance for cache client-mappings, topology id = 2
Java client JNDI lookup log:
2019-01-23 11:47:47,159 INFO [EJBClientTest] (Thread-15) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:47:47,159 INFO [EJBClientTest] (Thread-15) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:47:47,159 DEBUG [EJBClientContext] (Thread-15) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:47:47,167 INFO [EJBClientTest] (Thread-15) Result size = 3 in 9
2019-01-23 11:47:48,159 INFO [EJBClientTest] (Thread-16) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:47:48,159 INFO [EJBClientTest] (Thread-16) [DeploymentNodeSelector] Selected node: Node1
2019-01-23 11:47:48,159 DEBUG [EJBClientContext] (Thread-16) EJBClientTest@4699dd4c deployment node selector selected Node1 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:47:48,172 INFO [EJBClientTest] (Thread-16) Result size = 3 in 13
2019-01-23 11:47:49,160 INFO [EJBClientTest] (Thread-17) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:47:49,160 INFO [EJBClientTest] (Thread-17) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:47:49,160 DEBUG [EJBClientContext] (Thread-17) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
After some time, the Node2 server drops off the network.
WildFly log (Node1) showing detection of the network failure:
11:47:59,233 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel server: [Node1|2] (1) [Node1]
11:47:59,234 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel web: [Node1|2] (1) [Node1]
11:47:59,235 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel hibernate: [Node1|2] (1) [Node1]
11:47:59,236 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-5,ee,Node1) ISPN000094: Received new cluster view for channel ejb: [Node1|2] (1) [Node1]
11:47:59,257 WARN [org.infinispan.CLUSTER] (transport-thread--p14-t12) [Context=client-mappings]ISPN000314: Lost at least half of the stable members, possible split brain causing data inconsistency. Current members are [Node1], lost members are [Node2], stable members are [Node1, Node2]
Even after the failure has been detected on the server side, Node2 is still not removed from the eligible nodes on the client:
2019-01-23 11:48:59,193 INFO [EJBClientTest] (Thread-87) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:48:59,194 INFO [EJBClientTest] (Thread-87) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:48:59,194 DEBUG [EJBClientContext] (Thread-87) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:49:00,194 INFO [EJBClientTest] (Thread-88) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:49:00,194 INFO [EJBClientTest] (Thread-88) [DeploymentNodeSelector] Selected node: Node1
2019-01-23 11:49:00,194 DEBUG [EJBClientContext] (Thread-88) EJBClientTest@4699dd4c deployment node selector selected Node1 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:49:00,201 INFO [EJBClientTest] (Thread-88) Result size = 3 in 8
2019-01-23 11:49:01,194 INFO [EJBClientTest] (Thread-89) [DeploymentNodeSelector] EligibleNodes nodes = [Node2, Node1]
2019-01-23 11:49:01,194 INFO [EJBClientTest] (Thread-89) [DeploymentNodeSelector] Selected node: Node2
2019-01-23 11:49:01,194 DEBUG [EJBClientContext] (Thread-89) EJBClientTest@4699dd4c deployment node selector selected Node2 node for appname=test,modulename=test-ejb,distinctname=
2019-01-23 11:49:01,195 ERROR [EJBClientTest] (Thread-87) Request Failed:===java.util.concurrent.TimeoutException: No invocation response received in 2000 milliseconds
2019-01-23 11:49:03,196 ERROR [EJBClientTest] (Thread-89) Request Failed:===java.util.concurrent.TimeoutException: No invocation response received in 2000 milliseconds
I do not know what is causing this and would appreciate help resolving it.
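To make the desired behaviour concrete, here is a minimal sketch of a selection policy that skips nodes which recently failed. It is plain Java with hypothetical names (`QuarantiningSelector`, `markFailed`, `select`) and is not part of the JBoss EJB client API. A `DeploymentNodeSelector` like the one in the sample program could delegate to it, but how a timed-out invocation is attributed to a particular node is left open here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin policy that avoids recently failed nodes. */
final class QuarantiningSelector {
    private final Map<String, Long> quarantinedUntil = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    private final long quarantineMillis;

    QuarantiningSelector(long quarantineMillis) {
        this.quarantineMillis = quarantineMillis;
    }

    /** Records a failed call (for example a timeout) against a node. */
    void markFailed(String node, long nowMillis) {
        quarantinedUntil.put(node, nowMillis + quarantineMillis);
    }

    /**
     * Round-robin over the nodes that are not quarantined at nowMillis.
     * If every node is quarantined, fall back to all eligible nodes.
     */
    String select(String[] eligibleNodes, long nowMillis) {
        if (eligibleNodes.length == 0) {
            throw new IllegalArgumentException("no eligible nodes");
        }
        List<String> healthy = new ArrayList<>();
        for (String node : eligibleNodes) {
            Long until = quarantinedUntil.get(node);
            if (until == null || until <= nowMillis) {
                healthy.add(node);
            }
        }
        if (healthy.isEmpty()) {
            healthy.addAll(Arrays.asList(eligibleNodes));
        }
        // floorMod keeps the index valid even after the counter overflows.
        return healthy.get(Math.floorMod(counter.getAndIncrement(), healthy.size()));
    }
}
```

A caller would pass `System.currentTimeMillis()` as `nowMillis`; the time is a parameter only so that the policy can be exercised without waiting. The quarantine period is a trade-off: too short and requests keep hitting the dead node, too long and a recovered node stays unused.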