Ignite cluster: widely varying startup times

Time: 2018-05-11 21:49:38

Tags: java ignite

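I get very different startup/connection times between runs. My cluster has three server nodes. From my client node (which actually runs on one of the three servers) I want to run some tasks and cache operations for testing. However, when I start the client, the actual connection can take up to five minutes. On another start, with the same client and the same configuration, it takes only a few seconds.

In the runs where the client node takes a long time to start, the log differs as follows: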

[13:35:31,649][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Your version is up to date.
[13:37:21,794][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], node=eec8ea18-ded1-42cd-aec7-2af754644008]. Dumping pending objects that might be the cause: 
[13:37:21,794][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Ready affinity version: AffinityTopologyVersion [topVer=-1, minorTopVer=0]
[13:37:21,802][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Last exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=0, lastExchangeTime=1526060111449, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=eec8ea18, msg=null, type=NODE_JOINED, tstamp=1526060121610], crd=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526060116544, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=0, lastExchangeTime=1526060111449, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=eec8ea18, msg=null, type=NODE_JOINED, tstamp=1526060121610], nodeId=eec8ea18, evt=NODE_JOINED], added=true, initFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=true, hash=2024590198], init=true, lastVer=null, partReleaseFut=null, exchActions=ExchangeActions [startCaches=null, stopCaches=null, startGrps=[], stopGrps=[], resetParts=null, stateChangeRequest=null], affChangeMsg=null, initTs=1526060121650, centralizedAff=false, changeGlobalStateE=null, done=false, state=CLIENT, evtLatch=0, remaining=[830bbef7-0344-4955-bdf6-ff90f6d96602, 
b0105fdc-5298-4f80-94ae-2f1bbd8b42e8, c74ff028-1676-4f1a-8c95-563763ea5875], super=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=189344266]]
[13:37:21,803][WARNING][exchange-worker-#157%Test Cluster%][GridCachePartitionExchangeManager] First 10 pending exchange futures [total=0]
[13:37:21,806][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Last 10 exchange futures (total: 1):
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] >>> GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=0, lastExchangeTime=1526060111449, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], done=false]
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending transactions:
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending explicit locks:
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending cache futures:
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending atomic cache futures:
[13:37:21,808][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending data streamer futures:
[13:37:21,808][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending transaction deadlock detection futures:
[13:37:21,840][INFO][sys-#158%Test Cluster%][diagnostic] Exchange future waiting for coordinator response [crd=c74ff028-1676-4f1a-8c95-563763ea5875, topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0]]
Remote node information:
General node info [id=c74ff028-1676-4f1a-8c95-563763ea5875, client=false, discoTopVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], time=13:37:21.812]
Partitions exchange info [readyVer=AffinityTopologyVersion [topVer=14, minorTopVer=0]]
Last initialized exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=15, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], type=NODE_JOINED, tstamp=1526060060363], crd=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526059855998, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=15, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], type=NODE_JOINED, tstamp=1526060060363], nodeId=830bbef7, evt=NODE_JOINED], added=true, initFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=true, hash=1568621067], init=true, 
lastVer=GridCacheVersion [topVer=0, order=1526059954164, nodeOrder=0], partReleaseFut=PartitionReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]], DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]]]], exchActions=null, affChangeMsg=null, initTs=1526060227922, centralizedAff=false, changeGlobalStateE=null, done=false, state=CRD, evtLatch=0, remaining=[830bbef7-0344-4955-bdf6-ff90f6d96602], super=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=410898272]]
Communication SPI statistics [rmtNode=eec8ea18-ded1-42cd-aec7-2af754644008]
Communication SPI recovery descriptors: 
    [key=ConnectionKey [nodeId=eec8ea18-ded1-42cd-aec7-2af754644008, idx=0, connCnt=0], msgsSent=0, msgsAckedByRmt=0, msgsRcvd=2, lastAcked=0, reserveCnt=1, descIdHash=310748176]
Communication SPI clients: 
    [node=eec8ea18-ded1-42cd-aec7-2af754644008, client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=4, bytesRcvd=961, bytesSent=28, bytesRcvd0=853, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-4, igniteInstanceName=Test Cluster, finished=false, hashCode=474105904, interrupted=false, runner=grid-nio-worker-tcp-comm-4-#125%Test Cluster%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=2, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], connected=true, connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=2, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], connected=true, connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], super=GridNioSessionImpl [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:59666, createTime=1526060121775, closeTime=0, bytesSent=28, bytesRcvd=961, bytesSent0=0, bytesRcvd0=853, sndSchedTime=1526060121775, lastSndTime=1526060121786, lastRcvTime=1526060241812, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@3f6752aa, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]], super=GridAbstractCommunicationClient [lastUsed=1526060121786, closed=false, connIdx=0]]]
NIO sessions statistics:
>> Selector info [idx=4, keysCnt=1, bytesRcvd=961, bytesRcvd0=853, bytesSent=28, bytesSent0=0]
    Connection info [in=true, rmtAddr=/127.0.0.1:59666, locAddr=/127.0.0.1:47100, msgsSent=0, msgsAckedByRmt=0, descIdHash=310748176, msgsRcvd=2, lastAcked=0, descIdHash=310748176, bytesRcvd=961, bytesRcvd0=853, bytesSent=28, bytesSent0=0, opQueueSize=0]
Exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], type=NODE_JOINED, tstamp=1526060116548], crd=null, exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], type=NODE_JOINED, tstamp=1526060116548], nodeId=eec8ea18, evt=NODE_JOINED], added=true, initFut=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=1818763044], init=false, lastVer=null, partReleaseFut=null, 
exchActions=null, affChangeMsg=null, initTs=0, centralizedAff=false, changeGlobalStateE=null, done=false, state=null, evtLatch=0, remaining=[], super=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=1650837648]]
Local communication statistics:
Communication SPI statistics [rmtNode=c74ff028-1676-4f1a-8c95-563763ea5875]
Communication SPI recovery descriptors: 
    [key=ConnectionKey [nodeId=c74ff028-1676-4f1a-8c95-563763ea5875, idx=0, connCnt=-1], msgsSent=2, msgsAckedByRmt=0, msgsRcvd=1, lastAcked=0, reserveCnt=1, descIdHash=1306648390]
Communication SPI clients: 
    [node=c74ff028-1676-4f1a-8c95-563763ea5875, client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, bytesRcvd=8421, bytesSent=919, bytesRcvd0=8421, bytesSent0=853, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=Test Cluster, finished=false, hashCode=1972519349, interrupted=false, runner=grid-nio-worker-tcp-comm-0-#121%Test Cluster%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=1, sentCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526060116544, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=1, sentCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526060116544, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, pairedConnections=false], super=GridNioSessionImpl [locAddr=/127.0.0.1:59666, rmtAddr=/127.0.0.1:47100, createTime=1526060121782, closeTime=0, bytesSent=919, bytesRcvd=8421, bytesSent0=853, bytesRcvd0=8421, sndSchedTime=1526060121782, lastSndTime=1526060241815, lastRcvTime=1526060241815, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@619e0deb, directMode=true], GridConnectionBytesVerifyFilter], accepted=false]], super=GridAbstractCommunicationClient [lastUsed=1526060121792, closed=false, connIdx=0]]]
NIO sessions statistics:
>> Selector info [idx=0, keysCnt=1, bytesRcvd=8421, bytesRcvd0=8421, bytesSent=919, bytesSent0=853]
    Connection info [in=false, rmtAddr=/127.0.0.1:47100, locAddr=/127.0.0.1:59666, msgsSent=2, msgsAckedByRmt=0, descIdHash=1306648390, unackedMsgs=[GridDhtPartitionsSingleMessage, IgniteDiagnosticMessage], msgsRcvd=1, lastAcked=0, descIdHash=1306648390, bytesRcvd=8421, bytesRcvd0=8421, bytesSent=919, bytesSent0=853, opQueueSize=0]
[13:39:21,652][WARNING][main][GridCachePartitionExchangeManager] Failed to wait for initial partition map exchange. Possible reasons are: 
  ^-- Transactions in deadlock.
  ^-- Long running transactions (ignore if this is the case).
  ^-- Unreleased explicit locks.
[13:39:21,817][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], node=eec8ea18-ded1-42cd-aec7-2af754644008]. Dumping pending objects that might be the cause: 
[13:40:43,347][INFO][sys-#159%Test Cluster%][GridDhtPartitionsExchangeFuture] Received full message, will finish exchange [node=c74ff028-1676-4f1a-8c95-563763ea5875, resVer=AffinityTopologyVersion [topVer=16, minorTopVer=0]]
[13:40:43,354][INFO][sys-#159%Test Cluster%][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], err=null]
[13:40:43,395][INFO][main][IgniteKernal%Test Cluster] Performance suggestions for grid 'Test Cluster' (fix if possible)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Specify JVM heap max size (add '-Xmx<size>[g|G|m|M|k|K]' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Disable processing of calls to System.gc() (add '-XX:+DisableExplicitGC' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Speed up flushing of dirty pages by OS (alter vm.dirty_expire_centisecs parameter by setting to 500)
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster]   ^-- Reduce pages swapping ratio (set vm.swappiness=10)
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster] Refer to this page for more performance suggestions: https://apacheignite.readme.io/docs/jvm-and-system-tuning
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster] 
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[13:40:43,398][INFO][main][IgniteKernal%Test Cluster] 
[13:40:43,401][INFO][grid-nio-worker-tcp-comm-1-#122%Test Cluster%][TcpCommunicationSpi] Established outgoing communication connection [locAddr=/192.168.0.162:40742, rmtAddr=/192.168.0.161:47100]
[13:40:43,403][INFO][main][IgniteKernal%Test Cluster] 

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.4.0#20180305-sha1:aa342270b13cc1f4713382a8eb23b2eb7edaa3a5
>>> +----------------------------------------------------------------------+
>>> OS name: Linux 3.10.0-693.el7.x86_64 amd64
>>> CPU(s): 56
>>> Heap: 6.9GB
>>> VM name: 78579@centos_node_2
>>> Ignite instance name: Test Cluster
>>> Local node [ID=EEC8EA18-DED1-42CD-AEC7-2AF754644008, order=16, clientMode=true]
>>> Local node addresses: [centos_node_2/0:0:0:0:0:0:0:1%lo, centos_node_2/127.0.0.1, /192.168.0.162, /192.168.122.1]
>>> Local ports: TCP:10801 TCP:47101 

[13:40:43,406][INFO][main][GridDiscoveryManager] Topology snapshot [ver=16, servers=3, clients=1, CPUs=168, offheap=16.0GB, heap=19.0GB]
[13:40:43,406][INFO][main][GridDiscoveryManager] Data Regions Configured:
[13:40:43,406][INFO][main][GridDiscoveryManager]   ^-- default [initSize=4.0 GiB, maxSize=4.0 GiB, persistenceEnabled=false]
[13:40:43,413][INFO][main][GridDeploymentLocalStore] Class locally deployed: class TestCluster$1
[13:40:45,026][INFO][exchange-worker-#157%Test Cluster%][time] Started exchange init [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], crd=false, evt=DISCOVERY_CUSTOM_EVT, evtNode=c74ff028-1676-4f1a-8c95-563763ea5875, customEvt=CacheAffinityChangeMessage [id=c6771405361-ef621a9a-86e4-426a-958d-c53f0d9c0e25, topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], exchId=null, partsMsg=null, exchangeNeeded=true], allowMerge=false]
[13:40:45,028][INFO][exchange-worker-#157%Test Cluster%][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], crd=false]
[13:40:45,037][INFO][sys-#165%Test Cluster%][GridDhtPartitionsExchangeFuture] Received full message, will finish exchange [node=c74ff028-1676-4f1a-8c95-563763ea5875, resVer=AffinityTopologyVersion [topVer=16, minorTopVer=1]]
[13:40:45,039][INFO][sys-#165%Test Cluster%][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], err=null]
[13:40:48,545][INFO][grid-nio-worker-tcp-comm-2-#123%Test Cluster%][TcpCommunicationSpi] Established outgoing communication connection [locAddr=/192.168.0.162:35242, rmtAddr=/192.168.0.4:47100]
[13:40:48,597][INFO][main][GridDeploymentLocalStore] Class locally deployed: class TestCluster$2
[13:40:48,676][INFO][main][GridCacheProcessor] Stopped cache [cacheName=ignite-sys-cache]
[13:40:48,678][INFO][main][GridDeploymentLocalStore] Removed undeployed class: GridDeployment [ts=1526060443326, depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@330bedb4, clsLdrId=85655405361-eec8ea18-ded1-42cd-aec7-2af754644008, userVer=0, loc=true, sampleClsName=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap, pendingUndeploy=false, undeployed=true, usage=0]
[13:40:48,684][INFO][main][IgniteKernal%Test Cluster] 

>>> +---------------------------------------------------------------------------------+
>>> Ignite ver. 2.4.0#20180305-sha1:aa342270b13cc1f4713382a8eb23b2eb7edaa3a5 stopped OK
>>> +---------------------------------------------------------------------------------+
>>> Ignite instance name: Test Cluster
>>> Grid uptime: 00:00:05.289

The cluster configuration is:

<?xml version="1.0" encoding="UTF-8"?>

<!-- This file was generated by Ignite Web Console (05/11/2018, 23:29) -->

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/util
                           http://www.springframework.org/schema/util/spring-util.xsd">
    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="igniteInstanceName" value="Test Cluster"/>

        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                        <property name="addresses">
                            <list>
                                <value>192.168.0.4:47500..47510</value>
                                <value>192.168.0.161:47500..47510</value>
                                <value>192.168.0.162:47500..47510</value>
                            </list>
                        </property>
                    </bean>
                </property>

                <property name="ackTimeout" value="50000"/>
            </bean>
        </property>

        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="connectTimeout" value="600000"/>
            </bean>
        </property>

        <property name="networkTimeout" value="60000"/>
        <property name="networkSendRetryCount" value="10"/>

        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="initialSize" value="4294967296"/>
                        <property name="maxSize" value="4294967296"/>
                    </bean>
                </property>
            </bean>
        </property>

        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="eventStorageSpi">
            <bean class="org.apache.ignite.spi.eventstorage.memory.MemoryEventStorageSpi">
            </bean>
        </property>
        <property name="failureDetectionTimeout" value="100000"/>
        <property name="clientFailureDetectionTimeout" value="100000"/>
    </bean>
</beans>

Why does it take so long for the client node to connect? And why only sometimes?

Thanks for your help.

EDIT: warnings during startup:

07:19:46.910 [main][1] WARN  org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi-[warning] Failure detection timeout will be ignored (one of SPI parameters has been set explicitly)
07:20:06.953 [main][1] WARN  org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi-[warning] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
07:20:06.977 [main][1] WARN  org.apache.ignite.spi.checkpoint.noop.NoopCheckpointSpi-[warning] Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
07:20:07.012 [main][1] WARN  org.apache.ignite.internal.managers.collision.GridCollisionManager-[warning] Collision resolution is disabled (all jobs will be activated upon arrival).
07:20:22.373 [main][1] WARN  org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi-[warning] Failure detection timeout will be ignored (one of SPI parameters has been set explicitly)
07:20:47.527 [main][1] WARN  org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi-[warning] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

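2 answers:

Answer 0 (score: 1)

When a new node joins the cluster, the cluster operations currently in progress have to complete before the new cluster topology can be registered. Note the warning below.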

[13:39:21,652][WARNING][main][GridCachePartitionExchangeManager] Failed to wait for initial partition map exchange. Possible reasons are: 
  ^-- Transactions in deadlock.
  ^-- Long running transactions (ignore if this is the case).
  ^-- Unreleased explicit locks.

Most likely you have a long-running transaction or an unreleased lock.
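
If a long-running or stuck transaction does turn out to be the cause, one way to bound its duration is a default transaction timeout. The fragment below is only a sketch: it would go inside the existing IgniteConfiguration bean from the question, and the 30000 ms value is an illustrative assumption, not a recommendation. Whether it helps depends on whether transactions are actually involved.

```xml
<!-- Sketch only: place inside the existing IgniteConfiguration bean. -->
<property name="transactionConfiguration">
    <bean class="org.apache.ignite.configuration.TransactionConfiguration">
        <!-- Timeout in milliseconds; 30000 is an assumed example value.
             A value of 0 means transactions never time out. -->
        <property name="defaultTxTimeout" value="30000"/>
    </bean>
</property>
```

Explicit locks are not covered by this setting; they still have to be released by the code that acquired them.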

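Answer 1 (score: 0)

If you are not actually doing anything in the cluster, the problem is almost certainly related to network issues and network configuration. I would try reducing the timeouts and see whether that helps.

For example, you have ackTimeout=50000. This means that after the client sends a message to a server, it waits 50 seconds for a response. If the message is lost, it retries only after those 50 seconds, so a single network error costs you almost a minute. Reducing the timeouts to lower values should help on a network that is relatively fast but unstable.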
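
As a sketch of that suggestion, the discovery section of the configuration in the question could look like this with a lower acknowledgement timeout. The 5000 ms value is an assumption chosen for illustration; the right value depends on the network, so treat it as a starting point for experiments.

```xml
<!-- Sketch only: replaces the discoverySpi property in the question's config. -->
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                <property name="addresses">
                    <list>
                        <value>192.168.0.4:47500..47510</value>
                        <value>192.168.0.161:47500..47510</value>
                        <value>192.168.0.162:47500..47510</value>
                    </list>
                </property>
            </bean>
        </property>

        <!-- Was 50000 in the question; 5000 ms is an assumed example value. -->
        <property name="ackTimeout" value="5000"/>
    </bean>
</property>
```

Note that the startup warnings in the question say the failure detection timeout is ignored once an SPI parameter is set explicitly. Another option to try is therefore to drop the explicit ackTimeout and connectTimeout settings altogether and tune only failureDetectionTimeout.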