MassTransit / RabbitMQ - Requests inconsistently going to the error queue

Asked: 2012-08-25 11:06:01

Tags: c# .net rabbitmq amqp masstransit

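I have a Web API built with ServiceStack that communicates over MassTransit, on top of RabbitMQ, with a Topshelf service back end. Most of the time this works fine, but every so often a large number of requests start landing in the error queue.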

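The API is relatively simple, with only two calls. As far as I can tell from the logs, the main call works fine. The other call, which determines the status of a node, is the one causing problems. Both calls are implemented the same way, using the Request/Respond approach.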

Here is the log output for one of the errors:

2012-08-25 00:01:28,544 [6] DEBUG MassTransit.Messages - SEND:rabbitmq://testapi:password@localhost:5672/test_API.Models:StatusMessage::test_API.Models.StatusMessage, Models
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] ERROR MassTransit.Transports.Endpoint - Message retry limit exceeded rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SEND:rabbitmq://testapi:password@localhost:5672/testapi_web_error:08cf3caf-6d81-0d5b-0050-56a800810000:
2012-08-25 00:01:28,575 [38] INFO  MassTransit.Messages - MOVE:rabbitmq://testapi:password@localhost:5672/testapi_web:rabbitmq://testapi:password@localhost:5672/testapi_web_error:08cf3caf-6d81-0d5b-0050-56a800810000:
2012-08-25 00:01:33,888 [29] ERROR ServiceStack.ServiceInterface.ServiceBase`1 - ServiceBase<TRequest>::Service Exception
MassTransit.Exceptions.RequestTimeoutException: The request timed out: 08cf4f95-1a2b-d545-0050-56a800810000
   at MassTransit.RequestResponse.RequestImpl`1.Wait() in d:\BuildAgent-03\work\8d1373c869590c5b\src\MassTransit\RequestResponse\RequestImpl.cs:line 107
   at test.API.StatusService.OnGet(Status request)
   at ServiceStack.ServiceInterface.RestServiceBase`1.Get(TRequest request)

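I notice that it complains about a timeout. We have tried different timeout lengths, and that does not seem to help. Looking at the database connection, there does not appear to be anything in the consumer that would actually block, so I am not sure whether the timeout is caused by the repeated failures during sending or whether it is simply a connection problem. Normally the response arrives in under a second.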

The request is:

this.DataBus.PublishRequest(new StatusMessage(request, RequestType.GET), x =>
{
    x.Handle<StatusResponseMessage>(message =>
    {
        if (message.Exception != null)
        {
            Response.Message = message.Exception.Message;
        }
        Response.Node = message.Response.Node;
        Response.Status = message.Response.Status;
    });
    x.SetTimeout(5.Seconds());
});
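
One thing that may be worth ruling out in the handler above: `message.Response` is read even when `message.Exception` is set. Assuming the exception-only `StatusResponseMessage` constructor leaves `Response` null (the message classes are not shown, so this is a guess), the handler would throw on an error reply. I have not confirmed that this is related to the error-queue problem. Below is a minimal, self-contained sketch of a null-safe mapping; `StatusResult`, `StatusResponseMapper`, and the property types are hypothetical stand-ins, not the real classes.

```csharp
using System;

// Hypothetical stand-ins for the real message types, which are not shown here.
public class StatusResponsePacket
{
    public string Node { get; set; }
    public string Status { get; set; }
}

public class StatusResponseMessage
{
    public Exception Exception { get; set; }
    public StatusResponsePacket Response { get; set; }
}

// Hypothetical stand-in for the API's response DTO.
public class StatusResult
{
    public string Message { get; set; }
    public string Node { get; set; }
    public string Status { get; set; }
}

public static class StatusResponseMapper
{
    // Copies a reply into the API result without dereferencing a null payload.
    public static StatusResult Map(StatusResponseMessage message)
    {
        var result = new StatusResult();

        if (message.Exception != null)
        {
            result.Message = message.Exception.Message;
        }

        // Only read the payload when the consumer actually supplied one.
        if (message.Response != null)
        {
            result.Node = message.Response.Node;
            result.Status = message.Response.Status;
        }

        return result;
    }
}
```

Inside the `x.Handle<StatusResponseMessage>` callback, the same guard would amount to wrapping the two `message.Response` reads in a null check.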

The consumer is:

public void Consume(IConsumeContext<StatusMessage> context)
{
    try
    {
        // Get status from database (not included)
        context.Respond(new StatusResponseMessage(context.Message.CorrelationId, new StatusResponsePacket(context.Message.Request.Node, status, lastUpdated)));
    }
    catch (Exception e)
    {
        // Send the exception back to the requester
        context.Respond(new StatusResponseMessage(context.Message.CorrelationId, e));
    }
}

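Can anyone see anything obviously wrong? I have checked that the API and the service read from different queues (the errors end up in the test_web_error queue), and that the correlation ID and CorrelatedBy are set on both the status request and response packets. As mentioned above, this only happens with this call, not with our other call. I also have several servers running this, and it only seems to happen on one server at a time, though not always the same one.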

0 answers