ActiveMQ 5.9 - BLOCKED threads on the broker because a lock is never released

Time: 2018-05-16 18:09:58

Tags: java apache-camel activemq

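I am using ActiveMQ 5.9 and Camel 2.10.3. Under load (during performance tests) I run into a problem: the broker appears to be trying to close connections, and I cannot work out why.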

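The JMS setup is as follows: there are two brokers (configured in failover mode) and many client nodes, each acting as both consumer and producer for certain queues (take the 'customer_update' queue as an example).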

I am using PooledConnectionFactory with its default configuration, the 'CACHE_CONSUMER' cache level, and a maximum of 10 concurrent consumers per client node.

The broker is configured as follows: tcp://0.0.0.0:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600

This is the thread on the broker that holds the lock and never releases it:

"ActiveMQ Transport: tcp:///10.128.43.206:38694@61616" #5774 daemon prio=5 os_prio=0 tid=0x00007f2a4424e800 nid=0xaba4 waiting on condition [0x00007f29fe397000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000004fd008fb0> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
        at org.apache.activemq.broker.TransportConnection.stop(TransportConnection.java:983)
        at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:699)
        - locked <0x000000050401eed0> (a java.lang.Object)
        at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:79)
        at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
        at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
        at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
        at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
        at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
        at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
        at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
        at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:214)
        at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
        at java.lang.Thread.run(Thread.java:748)

There are more than 500 other threads on the broker that look like this:

"ActiveMQ Transport: tcp:///10.128.43.206:52074@61616" #2962 daemon prio=5 os_prio=0 tid=0x00007f2a440c9000 nid=0xa01f waiting for monitor entry [0x00007f29fc768000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:696)
        - waiting to lock <0x000000050401eed0> (a java.lang.Object)
        at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:79)
        at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
        at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
        at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
        at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
        at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
        at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
        at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
        at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:214)
        at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
        at java.lang.Thread.run(Thread.java:748)
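
To relate the two dumps, it can help to count how many threads are waiting on each monitor address and to find which thread owns it. The following is only a rough sketch using plain string matching on jstack-style text; the class name DumpLockSummary and its methods are made up for this illustration and are not part of ActiveMQ or the JDK, and the sample input is a heavily shortened stand-in for the dumps above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DumpLockSummary {

    // Patterns for the three kinds of lines this sketch cares about.
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"([^\"]+)\"");

    /** Maps each monitor address to the number of threads waiting to lock it. */
    public static Map<String, Integer> countWaiters(String dump) {
        Map<String, Integer> waiters = new LinkedHashMap<>();
        Matcher m = WAITING.matcher(dump);
        while (m.find()) {
            waiters.merge(m.group(1), 1, Integer::sum);
        }
        return waiters;
    }

    /** Maps each monitor address to the name of the thread that reports holding it. */
    public static Map<String, String> findOwners(String dump) {
        Map<String, String> owners = new LinkedHashMap<>();
        String currentThread = null;
        for (String line : dump.split("\\R")) {
            Matcher header = THREAD_HEADER.matcher(line);
            if (header.find()) {
                currentThread = header.group(1); // a new thread section starts here
            }
            Matcher locked = LOCKED.matcher(line);
            if (locked.find() && currentThread != null) {
                owners.put(locked.group(1), currentThread);
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        // Shortened, illustrative input; in practice read the full dump from a file.
        String dump = String.join("\n",
            "\"ActiveMQ Transport: tcp:///10.128.43.206:38694@61616\" #5774 daemon",
            "   java.lang.Thread.State: TIMED_WAITING (parking)",
            "        - locked <0x000000050401eed0> (a java.lang.Object)",
            "",
            "\"ActiveMQ Transport: tcp:///10.128.43.206:52074@61616\" #2962 daemon",
            "   java.lang.Thread.State: BLOCKED (on object monitor)",
            "        - waiting to lock <0x000000050401eed0> (a java.lang.Object)");

        Map<String, String> owners = findOwners(dump);
        countWaiters(dump).forEach((monitor, count) ->
            System.out.println(monitor + ": " + count + " waiting, held by " + owners.get(monitor)));
    }
}
```

Run against a full dump, a single monitor address with hundreds of waiters and one owner in TIMED_WAITING would match the picture described here.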

The first error I see on the broker is this one:

2018-05-16 16:36:59,336 [org.apache.activemq.broker.TransportConnection.Transport:856] WARN  - Transport Connection to: tcp://10.128.43.206:48747 failed: java.io.EOFException

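On the client node that the broker refers to (10.128.43.206) I see the following logs. The node appears to reconnect, is disconnected again right afterwards, and this repeats over and over.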

2018-05-16 16:36:59,322 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN  - Transport (tcp://10.128.43.169:61616) failed, reason:  java.io.IOException, attempting to automatically reconnect
2018-05-16 16:36:59,322 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN  - Transport (tcp://10.128.43.169:61616) failed, reason:  java.io.IOException, attempting to automatically reconnect
2018-05-16 16:36:59,375 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:36:59,375 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:36:59,375 [org.apache.activemq.TransactionContext:856] INFO  - commit failed for transaction TX:ID:52374-1526300283331-1:1:898
javax.jms.TransactionRolledBackException: Transaction completion in doubt due to failover. Forcing rollback of TX:ID:52374-1526300283331-1:1:898
        at org.apache.activemq.state.ConnectionStateTracker.restoreTransactions(ConnectionStateTracker.java:231)
        at org.apache.activemq.state.ConnectionStateTracker.restore(ConnectionStateTracker.java:169)
        at org.apache.activemq.transport.failover.FailoverTransport.restoreTransport(FailoverTransport.java:827)
        at org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1005)
        at org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136)
        at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
        at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2018-05-16 16:37:00,091 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:37:00,091 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:37:00,112 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN  - Transport (tcp://10.128.43.169:61616) failed, reason:  java.io.IOException, attempting to automatically reconnect

Eventually the broker reaches its maximumConnections limit (1000) and has to be restarted.

Could this be some kind of deadlock caused by a client node acting as both consumer and producer while sharing the same connection pool?
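
For what it is worth, the dumps above do not show two threads each waiting for the other's lock. They show one thread parked in a timed CountDownLatch.await while it still owns a monitor that every other thread needs. The sketch below reproduces only that general pattern with the standard library, under the assumption that this is what the dumps depict. It is not ActiveMQ code, and every name in it (MonitorHeldDuringTimedWait, SHARED_LOCK, stopCompleted) is invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class MonitorHeldDuringTimedWait {

    // Hypothetical stand-in for the monitor <0x000000050401eed0> seen in the dumps.
    private static final Object SHARED_LOCK = new Object();

    /**
     * Starts one holder thread that parks on a latch while owning SHARED_LOCK,
     * then starts the given number of threads that try to enter the same monitor.
     * Returns how many of those threads were observed in state BLOCKED.
     */
    public static int countBlockedWaiters(int waiters, long holdMillis) {
        CountDownLatch stopCompleted = new CountDownLatch(1);  // hypothetical "stop finished" signal
        CountDownLatch holderOwnsLock = new CountDownLatch(1); // lets us wait until the lock is taken

        Thread holder = new Thread(() -> {
            synchronized (SHARED_LOCK) {
                holderOwnsLock.countDown();
                try {
                    // Timed wait while still owning the monitor (TIMED_WAITING, parking).
                    stopCompleted.await(holdMillis, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        List<Thread> threads = new ArrayList<>();
        int blocked = 0;
        try {
            holder.start();
            holderOwnsLock.await();
            for (int i = 0; i < waiters; i++) {
                Thread t = new Thread(() -> {
                    synchronized (SHARED_LOCK) {
                        // Nothing to do; getting into the monitor is the point.
                    }
                }, "waiter-" + i);
                t.start();
                threads.add(t);
            }
            // Poll until every waiter is BLOCKED or the holder's timeout has passed.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(holdMillis);
            while (blocked < waiters && System.nanoTime() < deadline) {
                Thread.sleep(5);
                blocked = 0;
                for (Thread t : threads) {
                    if (t.getState() == Thread.State.BLOCKED) {
                        blocked++;
                    }
                }
            }
            stopCompleted.countDown(); // release the holder early so the demo ends quickly
            holder.join();
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing threads", e);
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("waiters observed BLOCKED: " + countBlockedWaiters(5, 2000));
    }
}
```

In this toy version the waiters are released as soon as the latch is counted down or the timeout expires. If each blocked thread then repeats the same timed wait under the same monitor, the queue drains very slowly, which would look much like a lock that is never released.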

Do you have any suggestions?

Thanks,

Giulio

1 answer:

Answer 0: (score: 0)

I was most likely affected by this issue:

https://issues.apache.org/jira/browse/AMQ-5090

Updating to ActiveMQ 5.10.0 and Camel 2.13.1 resolved the problem (the system is now much more stable, even during performance tests).

Thanks, Giulio