Apache Ignite.NET: TryEnter on ICacheLock returns false on network communication errors instead of throwing an exception

Time: 2018-04-12 00:54:44

Tags: c# .net ignite

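Here is the scenario.

  1. A network problem occurs.
  2. One node of the Apache Ignite.NET cluster becomes segmented. I can see this in the logs: the affected node logs a NodeSegmented event.
  3. If you obtain an ICacheLock object from an ICache object and then try to enter the lock with TryEnter(), the call returns false on the segmented node. This is not because the cache key is already locked, but because of the odd state caused by the network segmentation.
  4. Restarting the segmented node makes it rejoin the cluster and work as expected.
  5. This is the stack trace I see in the logs when this happens: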

    Failed to send unlock request to node (will make best effort to complete): TcpDiscoveryNode [id=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104], sockAddrs=[10.20.18.104:49100], discPort=49100, order=3174, intOrder=1590, lastExchangeTime=1523347291158, loc=false, ver=2.1.0#20170720-sha1:bdaeecca, isClient=false]] Native:[class org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104], sockAddrs=[/10.20.18.104:49100], discPort=49100, order=3174, intOrder=1590, lastExchangeTime=1523347291158, loc=false, ver=2.1.0#20170720-sha1:bdaeecca, isClient=false], topic=TOPIC_CACHE, msg=GridDhtUnlockRequest [], policy=2]
        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1651)
        at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1715)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1141)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter.removeLocks(GridDhtTransactionalCacheAdapter.java:1652)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.undoLocks(GridDhtLockFuture.java:425)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:719)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onDone(GridDhtLockFuture.java:703)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onDone(GridDhtLockFuture.java:82)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:461)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:129)
        at org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:45)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:382)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:346)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:334)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:494)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:473)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:461)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$MiniFuture.onResult(GridDhtLockFuture.java:1191)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.map(GridDhtLockFuture.java:959)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onOwnerChanged(GridDhtLockFuture.java:655)
        at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.notifyOwnerChanged(GridCacheMvccManager.java:226)
        at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.access$200(GridCacheMvccManager.java:80)
        at org.apache.ignite.internal.processors.cache.GridCacheMvccManager$3.onOwnerChanged(GridCacheMvccManager.java:163)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkOwnerChanged(GridCacheMapEntry.java:3669)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheEntry.readyLock(GridDistributedCacheEntry.java:469)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.readyLocks(GridDhtLockFuture.java:567)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.map(GridDhtLockFuture.java:764)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync0(GridDhtColocatedCache.java:1066)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:937)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.lockLocally(GridDhtColocatedLockFuture.java:1171)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapAsPrimary(GridDhtColocatedLockFuture.java:1282)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map0(GridDhtColocatedLockFuture.java:852)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:813)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:772)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:720)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:664)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.lockAllAsync(GridDistributedCacheAdapter.java:117)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.lockAll(GridCacheAdapter.java:3258)
        at org.apache.ignite.internal.processors.cache.CacheLockImpl.tryLock(CacheLockImpl.java:109)
        at org.apache.ignite.internal.processors.cache.CacheLockImpl.tryLock(CacheLockImpl.java:130)
        at org.apache.ignite.internal.processors.platform.cache.PlatformCache.processInStreamOutLong(PlatformCache.java:524)
        at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inStreamOutLong(PlatformTargetProxyImpl.java:65)
    Caused by: class org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104], sockAddrs=[10.20.18.104:49100], discPort=49100, order=3174, intOrder=1590, lastExchangeTime=1523347291158, loc=false, ver=2.1.0#20170720-sha1:bdaeecca, isClient=false]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2544)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2480)
        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1643)
        ... 41 more
    Caused by: class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104:47100]]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3179)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2763)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2655)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2516)
        ... 43 more
        Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=10.20.18.104:47100, err=Failed to read remote node recovery handshake (connection closed).]
            at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3184)
            ... 46 more
        Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read remote node recovery handshake (connection closed).
            at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.safeHandshake(TcpCommunicationSpi.java:3438)
            at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3044)
            ... 46 more
    ]
    

    A slightly different one:

    Level: [Error], Message:[<ResoDupCheck> Failed to send unlock request [keys=[UserKeyCacheObjectImpl [part=482, val=201804141800-2-190327-110016411351-pat-clarkson-greene, hasValBytes=true]], n=TcpDiscoveryNode [id=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104], sockAddrs=[10.20.18.104:49100], discPort=49100, order=3174, intOrder=1590, lastExchangeTime=1523347291158, loc=false, ver=2.1.0#20170720-sha1:bdaeecca, isClient=false]]] Native:[class org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104], sockAddrs=[10.20.18.104:49100], discPort=49100, order=3174, intOrder=1590, lastExchangeTime=1523347291158, loc=false, ver=2.1.0#20170720-sha1:bdaeecca, isClient=false], topic=TOPIC_CACHE, msg=GridNearUnlockRequest [super=GridDistributedUnlockRequest [keys=[UserKeyCacheObjectImpl [part=482, val=201804141800-2-190327-110016411351-pat-clarkson-greene, hasValBytes=true]], super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=121528584, order=1523348164577, nodeOrder=3178], committedVers=[], rolledbackVers=[], cnt=1, super=GridCacheIdMessage [cacheId=-1009505448]]]], policy=2]
        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1651)
        at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1715)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1141)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.removeLocks(GridDhtColocatedCache.java:877)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.undoLocks(GridDhtColocatedLockFuture.java:383)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.onComplete(GridDhtColocatedLockFuture.java:575)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.onDone(GridDhtColocatedLockFuture.java:559)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:819)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:772)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:720)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:664)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.lockAllAsync(GridDistributedCacheAdapter.java:117)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.lockAll(GridCacheAdapter.java:3258)
        at org.apache.ignite.internal.processors.cache.CacheLockImpl.tryLock(CacheLockImpl.java:109)
        at org.apache.ignite.internal.processors.cache.CacheLockImpl.tryLock(CacheLockImpl.java:130)
        at org.apache.ignite.internal.processors.platform.cache.PlatformCache.processInStreamOutLong(PlatformCache.java:524)
        at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inStreamOutLong(PlatformTargetProxyImpl.java:65)
    Caused by: class org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104], sockAddrs=[10.20.18.104:49100], discPort=49100, order=3174, intOrder=1590, lastExchangeTime=1523347291158, loc=false, ver=2.1.0#20170720-sha1:bdaeecca, isClient=false]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2544)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2480)
        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1643)
        ... 16 more
    Caused by: class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=d8b54715-4597-410c-a027-3c76d28ec7f1, addrs=[10.20.18.104:47100]]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3179)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2763)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2655)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2516)
        ... 18 more
        Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=10.20.18.104:47100, err=Failed to read remote node recovery handshake (connection closed).]
            at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3184)
            ... 21 more
        Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read remote node recovery handshake (connection closed).
            at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.safeHandshake(TcpCommunicationSpi.java:3438)
            at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3044)
            ... 21 more
    ]
    

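    My main question is: why doesn't ICacheLock throw an exception? By returning false, it wrongly tells me that the cache key is already locked, and I have no way of knowing whether the failure was caused by a network problem or by the key actually being locked.

    My current solution is to add a listener for the local NodeSegmented event and shut down and restart the Ignite node. As a defensive backup plan, I use a circuit breaker from Polly to check whether more than 50% of requests fail to acquire the lock within 30 seconds. That should be an unlikely situation; when it happens, the lock call is skipped and processing continues without the lock (in a degraded state).

    Am I missing something in the Ignite.NET configuration?

    Do I have a gap in my understanding of how Ignite works?

    Is there some programmatic way to find out why the TryEnter call returned false and decide how to proceed?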
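The defensive check described in the question (more than 50% of lock attempts failing within 30 seconds) can be illustrated with a small dependency-free sliding window. This is only a sketch of the idea and a stand-in for Polly's circuit breaker, not Polly itself; the class name `LockFailureWindow`, its parameters, and the `minSamples` guard are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Ignite.NET or Polly): records the outcome
// of lock attempts and reports when the failure rate in a time window is too high.
public sealed class LockFailureWindow
{
    private readonly TimeSpan _window;      // e.g. 30 seconds
    private readonly double _threshold;     // e.g. 0.5 for "more than 50%"
    private readonly int _minSamples;       // avoid tripping on very few calls
    private readonly Queue<(DateTime At, bool Ok)> _samples =
        new Queue<(DateTime At, bool Ok)>();
    private readonly object _sync = new object();

    public LockFailureWindow(TimeSpan window, double threshold, int minSamples)
    {
        _window = window;
        _threshold = threshold;
        _minSamples = minSamples;
    }

    // Record the result of one TryEnter() call observed at time 'now'.
    public void Record(bool acquired, DateTime now)
    {
        lock (_sync)
        {
            _samples.Enqueue((now, acquired));
            Trim(now);
        }
    }

    // True when at least minSamples attempts are inside the window and the
    // fraction of failed attempts is strictly greater than the threshold.
    public bool IsTripped(DateTime now)
    {
        lock (_sync)
        {
            Trim(now);
            if (_samples.Count < _minSamples) return false;

            int failed = 0;
            foreach (var sample in _samples)
                if (!sample.Ok) failed++;

            return (double)failed / _samples.Count > _threshold;
        }
    }

    // Drop samples that are older than the window.
    private void Trim(DateTime now)
    {
        while (_samples.Count > 0 && now - _samples.Peek().At > _window)
            _samples.Dequeue();
    }
}
```

A caller would invoke `Record(cacheLock.TryEnter(), DateTime.UtcNow)` after each attempt and skip the lock (degraded mode) while `IsTripped(DateTime.UtcNow)` is true. Passing the clock in explicitly keeps the helper easy to test.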

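1 Answer:

Answer 0: (score: 1)

It looks like Ignite does not propagate the exception from the Java part to the .NET part. If we try to do the same thing with the Java API, tryEnter() throws javax.cache.CacheException.

I have created a Jira ticket for this issue: https://issues.apache.org/jira/browse/IGNITE-8247

Also, make sure that the key you are trying to lock exists in the cache.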

As a workaround, you can add your own listeners for the ClientDisconnected and ClientReconnected events.
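One way such listeners might be used is to keep a connectivity flag and consult it whenever TryEnter() returns false, so that "key is busy" can be told apart from "node is cut off". The following is a minimal sketch under that assumption; `LockOutcome` and `ConnectivityAwareLock` are hypothetical names, not Ignite.NET types, and the lock attempt is passed in as a delegate so the class has no Ignite dependency.

```csharp
using System;
using System.Threading;

// Hypothetical three-state result replacing the ambiguous bool.
public enum LockOutcome { Acquired, Busy, Disconnected }

// Hypothetical wrapper (not an Ignite.NET type): remembers whether the node
// is currently disconnected and uses that to interpret a failed TryEnter().
public sealed class ConnectivityAwareLock
{
    private int _disconnected; // 0 = connected, 1 = disconnected

    // Handlers use the standard (sender, args) shape so they can be
    // subscribed to .NET events.
    public void OnDisconnected(object sender, EventArgs e) =>
        Interlocked.Exchange(ref _disconnected, 1);

    public void OnReconnected(object sender, EventArgs e) =>
        Interlocked.Exchange(ref _disconnected, 0);

    // 'tryEnter' is expected to wrap a call such as cacheLock.TryEnter().
    public LockOutcome TryEnter(Func<bool> tryEnter)
    {
        // Do not even attempt the lock while known to be disconnected.
        if (Volatile.Read(ref _disconnected) == 1)
            return LockOutcome.Disconnected;

        if (tryEnter())
            return LockOutcome.Acquired;

        // false is ambiguous: re-check the flag in case the connection
        // dropped while the attempt was in flight.
        return Volatile.Read(ref _disconnected) == 1
            ? LockOutcome.Disconnected
            : LockOutcome.Busy;
    }
}
```

Assumed wiring: `ignite.ClientDisconnected += gate.OnDisconnected;`, `ignite.ClientReconnected += gate.OnReconnected;`, then `gate.TryEnter(() => cacheLock.TryEnter())`. The flag is only a heuristic, since a failure can occur before any event is raised. For a segmented server node, as in the question, the same flag could instead be set from a NodeSegmented listener.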