One of the ehcache nodes wrongly tries to connect to 127.0.0.1

Time: 2014-07-10 22:12:06

Tags: java rmi ehcache

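Can you help me understand why one of my ehcache nodes wrongly tries to connect to 127.0.0.1?

I'm using ehcache 2.8.3. One of my nodes runs under VMWare in NAT mode, so the host has IP 192.168.10.1 (Windows 7) and the VMWare guest has 192.168.10.128 (CentOS 6).

I have the following ehcache config: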

<cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
                                 properties="peerDiscovery=manual, rmiUrls=//192.168.10.128:51000/myCache1|//192.168.10.1:51000/myCache1"/>

<cacheManagerPeerListenerFactory class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
                                 properties="hostName=0.0.0.0,port=51000,socketTimeoutMillis=2000"/>

<diskStore path="java.io.tmpdir"/>

<defaultCache
        maxEntriesLocalHeap="10000"
        eternal="false"
        timeToIdleSeconds="120"
        timeToLiveSeconds="120"
        diskSpoolBufferSizeMB="30"
        maxEntriesLocalDisk="10000000"
        diskExpiryThreadIntervalSeconds="120"
        memoryStoreEvictionPolicy="LRU"
        statistics="false">
    <persistence strategy="localTempSwap"/>
</defaultCache>

<cache name="myCache1"
       maxEntriesLocalHeap="10000"
       maxEntriesLocalDisk="10000"
       eternal="false"
       diskSpoolBufferSizeMB="20"
       timeToIdleSeconds="300"
       timeToLiveSeconds="600"
       memoryStoreEvictionPolicy="LFU"
       transactionalMode="off">
    <persistence strategy="localTempSwap"/>

    <cacheEventListenerFactory class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
</cache>

Messages from 192.168.10.128 to 192.168.10.1 are routed successfully, but the opposite direction doesn't work. The log on 192.168.10.1 shows the following error:
2014-07-11 02:02:19.260 +0400 DEBUG Lookup URL //192.168.10.128:51000/myCache1
2014-07-11 02:02:20.262 +0400 DEBUG Lookup URL //192.168.10.1:51000/myCache1
2014-07-11 02:02:21.264 +0400 WARN  Unable to send message to remote peer.  Message was: Connection refused to host: 127.0.0.1; nested exception is:
        java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
        java.net.ConnectException: Connection refused: connect
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[na:1.7.0_60]
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[na:1.7.0_60]
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[na:1.7.0_60]
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:129) ~[na:1.7.0_60]
        at net.sf.ehcache.distribution.RMICachePeer_Stub.send(Unknown Source) ~[services.jar:1.1]
        at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.writeReplicationQueue(RMIAsynchronousCacheReplicator.java:314) [services.jar:1.1]
        at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.replicationThreadMain(RMIAsynchronousCacheReplicator.java:127) [services.jar:1.1]
        at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.access$000(RMIAsynchronousCacheReplicator.java:58) [services.jar:1.1]
        at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator$ReplicationThread.run(RMIAsynchronousCacheReplicator.java:389) [services.jar:1.1]
Caused by: java.net.ConnectException: Connection refused: connect
        at java.net.DualStackPlainSocketImpl.connect0(Native Method) ~[na:1.7.0_60]
        at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79) ~[na:1.7.0_60]
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) ~[na:1.7.0_60]
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) ~[na:1.7.0_60]
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) ~[na:1.7.0_60]
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172) ~[na:1.7.0_60]
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.7.0_60]
        at java.net.Socket.connect(Socket.java:579) ~[na:1.7.0_60]
        at java.net.Socket.connect(Socket.java:528) ~[na:1.7.0_60]
        at java.net.Socket.<init>(Socket.java:425) ~[na:1.7.0_60]
        at java.net.Socket.<init>(Socket.java:208) ~[na:1.7.0_60]
        at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[na:1.7.0_60]
        at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:147) ~[na:1.7.0_60]
        at net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory.createSocket(ConfigurableRMIClientSocketFactory.java:71) ~[services.jar:1.1]
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) ~[na:1.7.0_60]
        ... 8 common frames omitted

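Why does it try to connect to 127.0.0.1 when that address doesn't appear anywhere in my config files?

I can telnet from 192.168.10.1 to 192.168.10.128:51000.

I also tried enabling bootstrapping and started to see the following log messages: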

2014-07-11 02:35:30.515 +0400 DEBUG cache peers: [RMICachePeer_Stub[UnicastRef2 [liveRef: [endpoint:[127.0.0.1:18405,net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory@7d0](remote),objID:[-43892557:1472247d06b:-7fff, -5287536613776006259]]]]]
2014-07-11 02:35:30.516 +0400 DEBUG Bootstrapping myCache1 from RMICachePeer_Stub[UnicastRef2 [liveRef: [endpoint:[127.0.0.1:18405,net.sf.ehcache.distribution.ConfigurableRMIClientSocketFactory@7d0](remote),objID:[-43892557:1472247d06b:-7fff, -5287536613776006259]]]]

Why does it think I have a peer at 127.0.0.1:18405?

1 Answer:

Answer 0: (score: 4)

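After many hours of debugging through the JDK and ehcache source code, I found it.

My core wrong assumption was that the problem was on the Windows node, because that is where I saw the error. It turned out that the Linux node was advertising an incorrect address.

The official Ehcache FAQ says: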

  

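"This is caused by a 2008 change to the Ubuntu/Debian Linux default network configuration. Essentially, the Java call InetAddress.getLocalHost(); always returns the loopback address, which is 127.0.0.1. Why? Because in these recent distros, a system call of $ hostname always returns an address mapped onto the loopback device, and this causes Ehcache's RMI peer creation logic to always assign the loopback address, which causes the error you are seeing. All you need to do is crack open the network config and make sure that the hostname of the machine returns a valid network address accessible by other peers on the network."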
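To check whether a node is affected, a small standalone program can print what InetAddress.getLocalHost() resolves to on that machine. This is only a sketch and not part of the original answer: the class name LocalHostCheck and the helper isUnusableForPeers are made up for illustration, and it uses only standard JDK calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalHostCheck {

    /** True when other machines could not reach this address (loopback or wildcard). */
    static boolean isUnusableForPeers(InetAddress address) {
        return address.isLoopbackAddress() || address.isAnyLocalAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // The same call the FAQ refers to
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("Host name:    " + local.getHostName());
        System.out.println("Host address: " + local.getHostAddress());

        if (isUnusableForPeers(local)) {
            System.out.println("getLocalHost() resolves to a loopback or wildcard address; "
                    + "RMI stubs exported from this JVM will likely advertise it to peers.");
        }
    }
}
```

Run it on the Linux node with the same JVM and user as the server. If it prints 127.0.0.1, the node's hostname is mapped to the loopback device.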

The Linux node gets "127.0.0.1" in the following method of the class java.rmi.registry.LocateRegistry:
public static Registry getRegistry(String host, int port, RMIClientSocketFactory csf) throws RemoteException
{
    Registry registry = null;

    if (port <= 0)
        port = Registry.REGISTRY_PORT;

    if (host == null || host.length() == 0) {
        // If host is blank (as returned by "file:" URL in 1.0.2 used in
        // java.rmi.Naming), try to convert to real local host name so
        // that the RegistryImpl's checkAccess will not fail.
        try {
            host = java.net.InetAddress.getLocalHost().getHostAddress();
        } catch (Exception e) {
            // If that failed, at least try "" (localhost) anyway...
            host = "";
        }
    }

    LiveRef liveRef = new LiveRef(new ObjID(ObjID.REGISTRY_ID), new TCPEndpoint(host, port, csf, null), false);
    RemoteRef ref = (csf == null) ? new UnicastRef(liveRef) : new UnicastRef2(liveRef);

    return (Registry) Util.createProxy(RegistryImpl.class, ref, false);
}

My Windows node receives it in the following method of the class net.sf.ehcache.distribution.ManualRMICacheManagerPeerProvider, which calls lookupRemoteCachePeer:

public final synchronized List listRemoteCachePeers(Ehcache cache) throws CacheException {
    List remoteCachePeers = new ArrayList();
    List staleList = new ArrayList();
    for (Iterator iterator = peerUrls.keySet().iterator(); iterator.hasNext();) {
        String rmiUrl = (String) iterator.next();
        String rmiUrlCacheName = extractCacheName(rmiUrl);

        if (!rmiUrlCacheName.equals(cache.getName())) {
            continue;
        }
        Date date = (Date) peerUrls.get(rmiUrl);
        if (!stale(date)) {
            CachePeer cachePeer = null;
            try {
                cachePeer = lookupRemoteCachePeer(rmiUrl);
                remoteCachePeers.add(cachePeer);
            } catch (Exception e) {
                if (LOG.isDebugEnabled()) {
                    LOG.debug("Looking up rmiUrl " + rmiUrl + " through exception " + e.getMessage()
                            + ". This may be normal if a node has gone offline. Or it may indicate network connectivity"
                            + " difficulties", e);
                }
            }
        } else {
            LOG.debug("rmiUrl {} should never be stale for a manually configured cluster.", rmiUrl);
            staleList.add(rmiUrl);
        }

    }

    //Remove any stale remote peers. Must be done here to avoid concurrent modification exception.
    for (int i = 0; i < staleList.size(); i++) {
        String rmiUrl = (String) staleList.get(i);
        peerUrls.remove(rmiUrl);
    }
    return remoteCachePeers;
}
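The lookup can also be reproduced outside ehcache to see which endpoint a remote stub actually carries. The sketch below is an illustration, not ehcache code: the class StubEndpointProbe and its helper are hypothetical, it assumes the peer URL is resolved with a plain java.rmi.Naming.lookup, and it relies on the stub's toString() format seen in the bootstrapping log lines from the question (an internal JDK format that may differ between versions and does not cover IPv6 literals).

```java
import java.rmi.Naming;
import java.rmi.Remote;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StubEndpointProbe {

    // Matches the "endpoint:[host:port" fragment printed by the stub's toString()
    private static final Pattern ENDPOINT = Pattern.compile("endpoint:\\[([^:\\]]+):(\\d+)");

    /** Extracts "host:port" from a stub description, or returns null if none is found. */
    static String advertisedEndpoint(String stubDescription) {
        Matcher m = ENDPOINT.matcher(stubDescription);
        return m.find() ? m.group(1) + ":" + m.group(2) : null;
    }

    public static void main(String[] args) throws Exception {
        // Default URL taken from the rmiUrls setting in the question
        String rmiUrl = args.length > 0 ? args[0] : "//192.168.10.128:51000/myCache1";

        // The registry is contacted at the host in the URL, but the returned stub
        // carries whatever address the remote JVM chose to advertise.
        Remote stub = Naming.lookup(rmiUrl);
        System.out.println("Looked up:  " + rmiUrl);
        System.out.println("Stub:       " + stub);
        System.out.println("Advertised: " + advertisedEndpoint(stub.toString()));
    }
}
```

If the advertised endpoint is 127.0.0.1 even though the lookup URL names 192.168.10.128, the remote node is handing out a loopback address. That also explains why telnet to port 51000 succeeds while replication fails.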

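Terracotta's official advice is to modify the hosts file, which feels too heavy-handed to me. I concluded that it would be easier for the Ops team to pass the correct address on my server's command line. The java.rmi.server.hostname property sets the host name that RMI embeds in the stubs it exports, so the command looks like this: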

java -Djava.rmi.server.hostname=192.168.10.128 -jar services.jar
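Where changing the launch command is not practical, the same property could be set in code, provided this happens before the CacheManager is created. The following is a minimal sketch under that assumption, not something from the original answer: the class RmiHostnameSetup and its methods are hypothetical, and picking the first non-loopback IPv4 address is a heuristic that can choose the wrong interface on a multi-homed machine.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

public class RmiHostnameSetup {

    private static final String PROPERTY = "java.rmi.server.hostname";

    /** Returns the first non-loopback IPv4 address of an interface that is up, or null. */
    static String firstNonLoopbackIPv4() throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return null; // older JDKs return null when no interface is found
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (!nic.isUp() || nic.isLoopback()) {
                continue;
            }
            for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                if (address instanceof Inet4Address && !address.isLoopbackAddress()) {
                    return address.getHostAddress();
                }
            }
        }
        return null;
    }

    /** Keeps a value passed with -D; otherwise sets the property to fallbackHost. */
    static String configure(String fallbackHost) {
        String existing = System.getProperty(PROPERTY);
        if (existing != null && !existing.isEmpty()) {
            return existing;
        }
        if (fallbackHost != null) {
            System.setProperty(PROPERTY, fallbackHost);
        }
        return fallbackHost;
    }

    public static void main(String[] args) throws SocketException {
        String host = configure(firstNonLoopbackIPv4());
        System.out.println(PROPERTY + " = " + host);
        // Create the ehcache CacheManager only after this point.
    }
}
```

An explicit -Djava.rmi.server.hostname on the command line still takes precedence, so Ops can override the detected address.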