你能帮助我理解为什么其中一个ehcache节点错误地尝试连接到127.0.0.1吗?
我正在使用ehcache 2.8.3。我的一个节点在NAT模式下在VMWare下运行。因此,主机具有ip 192.168.10.1
(Windows 7),VMWare中的一个为192.168.10.128
(CentOS 6)。
我有以下ehcache config
<cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
properties="peerDiscovery=manual, rmiUrls=//192.168.10.128:51000/myCache1|//192.168.10.1:51000/myCache1"/>
<cacheManagerPeerListenerFactory class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
properties="hostName=0.0.0.0,port=51000,socketTimeoutMillis=2000"/>
<diskStore path="java.io.tmpdir"/>
<defaultCache
maxEntriesLocalHeap="10000"
eternal="false"
timeToIdleSeconds="120"
timeToLiveSeconds="120"
diskSpoolBufferSizeMB="30"
maxEntriesLocalDisk="10000000"
diskExpiryThreadIntervalSeconds="120"
memoryStoreEvictionPolicy="LRU"
statistics="false">
<persistence strategy="localTempSwap"/>
</defaultCache>
<cache name="myCache1"
maxEntriesLocalHeap="10000"
maxEntriesLocalDisk="10000"
eternal="false"
diskSpoolBufferSizeMB="20"
timeToIdleSeconds="300"
timeToLiveSeconds="600"
memoryStoreEvictionPolicy="LFU"
transactionalMode="off">
<persistence strategy="localTempSwap"/>
<cacheEventListenerFactory class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
</cache>
从192.168.10.128
到192.168.10.1
的邮件已成功路由。但相反的方向并不奏效。登录192.168.10.1
2014-07-11 02:02:19.260 +0400 DEBUG Lookup URL //192.168.10.128:51000/myCache1
2014-07-11 02:02:20.262 +0400 DEBUG Lookup URL //192.168.10.1:51000/myCache1
2014-07-11 02:02:21.264 +0400 WARN Unable to send message to remote peer. Message was: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused: connect
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[na:1.7.0_60]
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[na:1.7.0_60]
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[na:1.7.0_60]
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:129) ~[na:1.7.0_60]
at net.sf.ehcache.distribution.RMICachePeer_Stub.send(Unknown Source) ~[services.jar:1.1]
at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.writeReplicationQueue(RMIAsynchronousCacheReplicator.java:314) [services.jar:1.1]
at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.replicationThreadMain(RMIAsynchronousCacheReplicator.java:127) [services.jar:1.1]
at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.access$000(RMIAsynchronousCacheReplicator.java:58) [services.jar:1.1]
at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator$ReplicationThread.run(RMIAsynchronousCacheReplicator.java:389) [services.jar:1.1]
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method) ~[na:1.7.0_60]
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79) ~[na:1.7.0_60]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) ~[na:1.7.0_60]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) ~[na:1.7.0_60]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) ~[na:1.7.0_60]
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172) ~[na:1.7.0_60]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.7.0_60]
at java.net.Socket.connect(Socket.java:579) ~[na:1.7.0_60]
at java.net.Socket.connect(Socket.java:528) ~[na:1.7.0_60]
at java.net.Socket.<init>(Socket.java:425) ~[na:1.7.0_60]
at java.net.Socket.<init>(Socket.java:208) ~[na:1.7.0_60]
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[na:1.7.0_60]
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:147) ~[na:1.7.0_60]
at net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory.createSocket(ConfigurableRMIClientSocketFactory.java:71) ~[services.jar:1.1]
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) ~[na:1.7.0_60]
... 8 common frames omitted
如果我不在配置文件中的任何地方,为什么要尝试连接到127.0.0.1?
我可以从192.168.10.1 telnet到192.168.10.128:51000。
我也尝试启用bootstraping并开始看到以下日志消息
2014-07-11 02:35:30.515 +0400 DEBUG cache peers: [RMICachePeer_Stub[UnicastRef2 [liveRef: [endpoint:[127.0.0.1:18405,net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory@7d0](remote),objID:[-43892557:1472247d06b:-7fff, -5287536613776006259]]]]]
2014-07-11 02:35:30.516 +0400 DEBUG Bootstrapping myCache1 from RMICachePeer_Stub[UnicastRef2 [liveRef: [endpoint:[127.0.0.1:18405,net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory@7d0](remote),objID:[-43892557:1472247d06b:-7fff, -5287536613776006259]]]]
为什么我认为我有同行127.0.0.1:18405
?
答案 0 :(得分:4)
经过JDK和ehcache源代码的多个小时调试后,我发现了它。
我的核心错误假设是我看到错误的Windows节点出了问题。原来是linux节点提供的地址不正确。
官方Ehcache常见问题says:
这是由于2008年对Ubuntu / Debian Linux默认更改造成的 网络配置。基本上,Java调用 InetAddress.getLocalHost();总是返回环回地址 是127.0.0.1。为什么?因为在这些最近的发行版中,系统调用了$ hostname始终返回映射到环回设备的地址, 这导致Ehcache的RMI对等创建逻辑始终分配 环回地址,导致您看到的错误。一切你需要的 要做的就是破解网络配置并确保主机名 机器返回其他人可访问的有效网络地址 网络上的同行。
Linux节点在类java.rmi.registry.LocateRegistry
public static Registry getRegistry(String host, int port, RMIClientSocketFactory csf) throws RemoteException
{
Registry registry = null;
if (port <= 0)
port = Registry.REGISTRY_PORT;
if (host == null || host.length() == 0) {
// If host is blank (as returned by "file:" URL in 1.0.2 used in
// java.rmi.Naming), try to convert to real local host name so
// that the RegistryImpl's checkAccess will not fail.
try {
host = java.net.InetAddress.getLocalHost().getHostAddress();
} catch (Exception e) {
// If that failed, at least try "" (localhost) anyway...
host = "";
}
}
LiveRef liveRef = new LiveRef(new ObjID(ObjID.REGISTRY_ID), new TCPEndpoint(host, port, csf, null), false);
RemoteRef ref = (csf == null) ? new UnicastRef(liveRef) : new UnicastRef2(liveRef);
return (Registry) Util.createProxy(RegistryImpl.class, ref, false);
}
我的Windows节点正在使用以下net.sf.ehcache.distribution.ManualRMICacheManagerPeerProvider
类方法接收它,并调用lookupRemoteCachePeer
public final synchronized List listRemoteCachePeers(Ehcache cache) throws CacheException {
List remoteCachePeers = new ArrayList();
List staleList = new ArrayList();
for (Iterator iterator = peerUrls.keySet().iterator(); iterator.hasNext();) {
String rmiUrl = (String) iterator.next();
String rmiUrlCacheName = extractCacheName(rmiUrl);
if (!rmiUrlCacheName.equals(cache.getName())) {
continue;
}
Date date = (Date) peerUrls.get(rmiUrl);
if (!stale(date)) {
CachePeer cachePeer = null;
try {
cachePeer = lookupRemoteCachePeer(rmiUrl);
remoteCachePeers.add(cachePeer);
} catch (Exception e) {
if (LOG.isDebugEnabled()) {
LOG.debug("Looking up rmiUrl " + rmiUrl + " through exception " + e.getMessage()
+ ". This may be normal if a node has gone offline. Or it may indicate network connectivity"
+ " difficulties", e);
}
}
} else {
LOG.debug("rmiUrl {} should never be stale for a manually configured cluster.", rmiUrl);
staleList.add(rmiUrl);
}
}
//Remove any stale remote peers. Must be done here to avoid concurrent modification exception.
for (int i = 0; i < staleList.size(); i++) {
String rmiUrl = (String) staleList.get(i);
peerUrls.remove(rmiUrl);
}
return remoteCachePeers;
}
Terracotta的官方建议是修改hosts
文件,这对我来说太残酷了。我的结论是,Ops团队在我的服务器的命令行中提供正确的绑定地址会更容易,看起来像这样
java -Djava.rmi.server.hostname=192.168.10.128 -jar services.jar