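I am using the Java RabbitMQ client to make RPC-style calls to a remote system from multi-threaded code, and despite my best efforts I cannot get the automatic recovery feature to work. In production it falls over about once a day on average. Naturally, I cannot find a way to reproduce the failure on my test setup.
So my question is: assuming that the automatic recovery feature of the Java RabbitMQ client does work as advertised, I must be misunderstanding something fundamental. Can anyone tell me what I am doing wrong?
There are a few similar questions here, but they generally have answers tacked on later along the lines of "from version 3.3.0 you can use automatic recovery, which is a new feature of the Java client." I have only ever used versions from 3.5.0 onwards, so I hope my question is acceptable as I intend it: very much not a duplicate.
Yes, I am aware of Lyra. But if the only answer to my question is "you should be using Lyra", that suggests to me that my statement "assuming that the automatic recovery feature of the Java RabbitMQ client does work as advertised" might need re-evaluating...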
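The documentation does not give much to go on, which may be part of the problem. So I have reduced my production code to a minimal case that should show clearly what I am doing (yes, it is still long; I don't see how I could cut it down further and still expect a useful answer, but I apologise for the length). The only things missing are the method that creates the TLS-enabled ConnectionFactory and the C# service that sits at the other end of the RPC. Hopefully this is enough for somebody to give me some pointers.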
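In production this code is full of logging, and it also registers connection and channel recovery listeners purely for the purpose of logging. In two months of logs, with failures in the RabbitMQ-related code forcing a system restart roughly once a day, not once do the logs show the recovery listeners or handleRecoverOk being called. What I do see is a call to handleCancel, after which every subsequent operation on that thread fails with either a timeout or a ConsumerCancelledException.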
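In case it makes a difference, production runs on Solaris 11. I have tried to provoke the required failure in my test case on my Windows PC by closing the socket with TCPView, but that results in a call to handleShutdownSignal with an AlreadyClosedException, which is a rather different failure mode.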
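In my code, the Consumer class is modelled on the deprecated QueueingConsumer, but with an attempt to support recovery. I can see that putting the poison object in the queue probably ruins any chance of the call that is in progress when a failure is detected succeeding (because nextDelivery will probably be called and throw an exception before recovery has any chance to happen), but I would expect handleRecoverOk to remove the poison again at some point, leaving a fully functioning thread. This never seems to happen.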
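To make the marker handling described above easier to reason about, here is a minimal stand-alone model of it. It is only a sketch: it uses no RabbitMQ classes, the class name PoisonQueueModel and its method names are made up for illustration, it returns status strings instead of throwing, and a non-blocking poll stands in for the timed one.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical model of the poison markers; not part of the RabbitMQ client. */
class PoisonQueueModel {
    static final Object CANCELLED = new Object();
    static final Object SHUTDOWN = new Object();

    final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();

    // Mirrors handleCancel(): enqueue a single cancellation marker.
    void onCancel() {
        queue.add(CANCELLED);
    }

    // Mirrors handleShutdownSignal(): enqueue a shutdown marker.
    void onShutdown() {
        queue.add(SHUTDOWN);
    }

    // Mirrors handleRecoverOk() in the question: only CANCELLED markers are removed.
    void onRecoverOk() {
        while (queue.remove(CANCELLED)) {
            // keep removing until no CANCELLED marker is left
        }
    }

    // Mirrors nextDelivery() + processDelivery(), reporting the outcome as a string.
    // An empty queue is reported as TIMEOUT, which is what a timed poll would end in.
    String next() {
        Object d = queue.poll();
        if (d == null) {
            return "TIMEOUT";
        }
        if (d == SHUTDOWN) {
            queue.add(SHUTDOWN); // the marker is put back, so it never drains
            return "SHUTDOWN";
        }
        if (d == CANCELLED) {
            return "CANCELLED"; // the marker is consumed by this call
        }
        return "DELIVERY";
    }

    public static void main(String[] args) {
        PoisonQueueModel cancelled = new PoisonQueueModel();
        cancelled.onCancel();
        System.out.println(cancelled.next()); // marker consumed here
        System.out.println(cancelled.next()); // queue is now empty

        PoisonQueueModel shutdown = new PoisonQueueModel();
        shutdown.onShutdown();
        System.out.println(shutdown.next());
        shutdown.onRecoverOk();               // does not touch SHUTDOWN
        System.out.println(shutdown.next());
    }
}
```

Run as-is, this prints CANCELLED, TIMEOUT, SHUTDOWN, SHUTDOWN. In this model the CANCELLED marker is gone after the first read, so later reads simply find an empty queue, while the SHUTDOWN marker survives the recovery callback because only CANCELLED is ever removed.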
import com.rabbitmq.client.*;
import com.rabbitmq.client.AMQP.BasicProperties;
import com.rabbitmq.utility.Utility;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.time.Duration;
import java.util.UUID;
import java.util.concurrent.*;
import javax.net.ssl.SSLException;

public class RabbitMQCall {
    private final String requestQueueName;
    private final int maxRetries = 5;
    private final Connection connection;
    private volatile boolean closed = false;
    private final ThreadLocal<PerThreadDetails> perThreadDetails = new ThreadLocal<>();
    private final static Charset CHARSET = Charset.forName("UTF-8");

    // Method to run minimal, reduced-to-one-class example code.
    public static void main(String[] arg) throws Exception {
        ConnectionFactory factory = setupTLSEnabledConnectionFactory();
        factory.setAutomaticRecoveryEnabled(true);
        factory.setTopologyRecoveryEnabled(true);
        RabbitMQCall rmqcall = new RabbitMQCall(factory, "call");
        String data = "{\"---METHOD---\":\"ECHO\",\"USERNAME\":\"Test\"}";
        while (true) {
            rmqcall.call(data, Duration.ofSeconds(20));
            System.out.println("OK");
            TimeUnit.SECONDS.sleep(1);
        }
    }

    /**
     * Constructs a new instance of {@code RabbitMQCall} configured with the supplied information.
     * Automatic connection and topology recovery must be enabled on the {@code ConnectionFactory}
     * as this class makes no attempt to reconnect in case of errors.
     * @param connectionFactory the {@link com.rabbitmq.client.ConnectionFactory} specifying all the
     * details for the connection to the RabbitMQ server
     * @param requestQueueName the request queue name
     * @throws IllegalArgumentException if the {@code ConnectionFactory} has invalid settings
     * @throws IOException in case of error setting up the RabbitMQ connection
     * @throws TimeoutException in case of timeout setting up the RabbitMQ connection
     */
    public RabbitMQCall(ConnectionFactory connectionFactory, String requestQueueName)
            throws IOException, TimeoutException {
        this.requestQueueName = requestQueueName;
        connection = connectionFactory.newConnection();
    }

    /**
     * Calls a remote method (passing the supplied data) using RabbitMQ with the specified timeout.
     * Up to the configured number of retries of the complete operation will be attempted before an
     * {@link Exception} is thrown.
     * @param data a textual representation of the data for the call
     * @param timeout the timeout to use when awaiting a response from the remote process, or
     * {@code null} to wait forever
     * @return a textual representation of the result or error
     * @throws NullPointerException if either argument is null
     * @throws IllegalArgumentException if {@code timeout} is negative
     * @throws Exception if there was a failure invoking the remote call (rather than in the
     * operation of the remote call itself)
     * @throws InterruptedException if any thread has interrupted the current thread. The
     * <i>interrupted status</i> of the current thread is cleared when this exception is thrown.
     */
    public String call(String data, Duration timeout) throws Exception, InterruptedException {
        if (closed) {
            throw (new Exception("RabbitMQCall.call() cannot be called on a closed instance"));
        }
        if (timeout != null) {
            if (timeout.isNegative() || timeout.isZero()) {
                throw (new IllegalArgumentException("The timeout must be positive or null"));
            }
        }
        // See if this thread already has a channel set up, and create one if not.
        PerThreadDetails details = perThreadDetails.get();
        if (details == null) {
            details = new PerThreadDetails();
            try {
                details.channel = connection.createChannel();
                if (!(details.channel instanceof Recoverable)) {
                    throw (new AssertionError("Channel doesn't implement Recoverable"));
                }
                details.replyQueueName = details.channel.queueDeclare().getQueue();
                details.consumer = new Consumer(details.channel);
                details.channel.basicConsume(details.replyQueueName, true, details.consumer);
            } catch (IOException e) {
                throw (new Exception("Failed to set up RabbitMQ Channel: " + e, e));
            }
            perThreadDetails.set(details);
        }
        // Wrap the whole thing in a retry loop to handle timeouts.
        RETRY: for (int attempt = 0; attempt <= maxRetries; ++attempt) {
            String corrId = UUID.randomUUID().toString();
            BasicProperties props = new BasicProperties.Builder()
                    .correlationId(corrId)
                    .replyTo(details.replyQueueName)
                    .build();
            // Encode the data with the appropriate character encoding into a byte array.
            ByteBuffer bb = CHARSET.encode(data);
            byte[] bytes = new byte[bb.remaining()];
            bb.get(bytes);
            try {
                details.channel.basicPublish("", requestQueueName, props, bytes);
            } catch (IOException e) {
                throw (new Exception("Error publishing to RabbitMQ Channel: " + e, e));
            }
            // Loop receiving messages until we get the one we're waiting for.
            while (true) {
                Consumer.Delivery delivery = null;
                if (timeout != null) {
                    delivery = details.consumer.nextDelivery(timeout);
                } else {
                    delivery = details.consumer.nextDelivery();
                }
                if (delivery == null) {
                    break; // Break out of inner loop for the next iteration of the retry loop.
                }
                String response = new String(delivery.getBody(), CHARSET);
                // If response matches our request then we have what we're waiting for, so return.
                if (delivery.getProperties().getCorrelationId().equals(corrId)) {
                    return response;
                }
            }
        }
        throw (new Exception("Timeout waiting for response from remote system via RabbitMQ"));
    }

    public void close() {
        if (!closed) {
            closed = true;
            if (connection != null) {
                connection.abort();
            }
        }
    }

    private static class Consumer extends DefaultConsumer {
        private static final Delivery CANCELLED = new Delivery(null, null, null);
        private static final Delivery SHUTDOWN = new Delivery(null, null, null);
        private final BlockingQueue<Delivery> queue = new LinkedBlockingQueue<>();
        private volatile ShutdownSignalException shutdownException;

        Consumer(Channel channel) {
            super(channel);
        }

        Delivery nextDelivery() throws InterruptedException, ShutdownSignalException,
                ConsumerCancelledException {
            Delivery d = queue.take();
            return processDelivery(d);
        }

        Delivery nextDelivery(Duration timeout) throws InterruptedException,
                ShutdownSignalException, ConsumerCancelledException {
            Delivery d = queue.poll(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return processDelivery(d);
        }

        private Delivery processDelivery(Delivery d) {
            if (d == SHUTDOWN) {
                queue.add(SHUTDOWN);
                throw (Utility.fixStackTrace(shutdownException));
            }
            if (d == CANCELLED) {
                throw (new ConsumerCancelledException());
            }
            return d;
        }

        @Override public void handleConsumeOk(String consumerTag) {
            super.handleConsumeOk(consumerTag);
        }

        @Override public void handleCancel(String consumerTag) {
            queue.add(CANCELLED);
        }

        @Override public void handleCancelOk(String consumerTag) {
            queue.add(CANCELLED);
        }

        @Override public void handleDelivery(String consumerTag, Envelope envelope,
                AMQP.BasicProperties properties, byte[] body) {
            if (shutdownException != null) {
                throw (Utility.fixStackTrace(shutdownException));
            }
            queue.add(new Delivery(envelope, properties, body));
        }

        @Override public void handleRecoverOk(String consumerTag) {
            super.handleConsumeOk(consumerTag); // Set the new tag in the only way we can.
            while (queue.contains(CANCELLED)) { // Remove our poison message(s).
                queue.remove(CANCELLED);
            }
        }

        @Override public void handleShutdownSignal(String consumerTag, ShutdownSignalException sig) {
            shutdownException = sig;
            queue.add(SHUTDOWN);
        }

        private static class Delivery {
            private final Envelope envelope;
            private final AMQP.BasicProperties properties;
            private final byte[] body;

            Delivery(Envelope envelope, AMQP.BasicProperties properties, byte[] body) {
                this.envelope = envelope;
                this.properties = properties;
                this.body = body;
            }

            public byte[] getBody() {
                return body;
            }

            public Envelope getEnvelope() {
                return envelope;
            }

            public AMQP.BasicProperties getProperties() {
                return properties;
            }
        }
    }

    private static class PerThreadDetails {
        public Channel channel;
        public String replyQueueName;
        public Consumer consumer;
    }
}
I have been using the amqp-client 3.5.6 artefact for most of my testing, although the latest version, 3.5.7, does not seem to change anything.
Thanks for any help.
Answer 0 (score: 0)
Try using
factory.setConnectionTimeout(30000);
factory.setAutomaticRecoveryEnabled(true);
factory.setTopologyRecoveryEnabled(true);
factory.setNetworkRecoveryInterval(10000);
factory.setExceptionHandler(new DefaultExceptionHandler());
factory.setRequestedHeartbeat(360);
Add this before opening the connection. One more suggestion: don't use deprecated classes; use DefaultConsumer instead of QueueingConsumer.
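One possible way to follow the advice about moving away from the QueueingConsumer pattern is to have the consumer's delivery callback hand each reply to a small dispatcher keyed by correlation ID, rather than pushing it onto a blocking queue together with poison markers. The sketch below is only an illustration under that assumption: ReplyDispatcher and all of its methods are hypothetical names, it has no RabbitMQ dependency, and the wiring into a DefaultConsumer subclass is deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that routes RPC replies to waiting callers by correlation ID. */
class ReplyDispatcher {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Call before publishing the request that carries this correlation ID.
    CompletableFuture<String> register(String correlationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        return future;
    }

    // Call from the consumer's delivery callback.
    // Returns false when nobody is waiting for this ID (for example a stale reply).
    boolean onReply(String correlationId, String body) {
        CompletableFuture<String> future = pending.remove(correlationId);
        return future != null && future.complete(body);
    }

    // Call when the caller gives up (for example on timeout) so the entry does not leak.
    void forget(String correlationId) {
        pending.remove(correlationId);
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) throws Exception {
        ReplyDispatcher dispatcher = new ReplyDispatcher();
        CompletableFuture<String> reply = dispatcher.register("id-1");

        dispatcher.onReply("stale-id", "ignored"); // no caller is waiting for this one
        dispatcher.onReply("id-1", "pong");        // completes the waiting caller

        System.out.println(reply.get(1, TimeUnit.SECONDS)); // prints "pong"
    }
}
```

With this shape, a caller waits on its own future with a timeout, so a cancelled or recovering consumer does not leave a marker behind in shared state; whether that fits depends on how the rest of the code handles retries.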