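I have a Web API built with ServiceStack that communicates, via MassTransit over RabbitMQ, with a Topshelf service back end. Most of the time it works fine, but occasionally we start seeing a large number of requests ending up in the error queue.
The API is fairly simple, with only two calls. As far as I can tell from the logs, the main call works fine. The other call, which is used to determine the status of a node, is the one causing problems. Both calls are implemented the same way, using the Request/Respond pattern.
Here is the log output for one of the errors: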
2012-08-25 00:01:28,544 [6] DEBUG MassTransit.Messages - SEND:rabbitmq://testapi:password@localhost:5672/test_API.Models:StatusMessage::test_API.Models.StatusMessage, Models
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SKIP:rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] ERROR MassTransit.Transports.Endpoint - Message retry limit exceeded rabbitmq://testapi:password@localhost:5672/testapi_web:08cf3caf-6d81-0d5b-0050-56a800810000
2012-08-25 00:01:28,575 [38] DEBUG MassTransit.Messages - SEND:rabbitmq://testapi:password@localhost:5672/testapi_web_error:08cf3caf-6d81-0d5b-0050-56a800810000:
2012-08-25 00:01:28,575 [38] INFO MassTransit.Messages - MOVE:rabbitmq://testapi:password@localhost:5672/testapi_web:rabbitmq://testapi:password@localhost:5672/testapi_web_error:08cf3caf-6d81-0d5b-0050-56a800810000:
2012-08-25 00:01:33,888 [29] ERROR ServiceStack.ServiceInterface.ServiceBase`1 - ServiceBase<TRequest>::Service Exception
MassTransit.Exceptions.RequestTimeoutException: The request timed out: 08cf4f95-1a2b-d545-0050-56a800810000
at MassTransit.RequestResponse.RequestImpl`1.Wait() in d:\BuildAgent-03\work\8d1373c869590c5b\src\MassTransit\RequestResponse\RequestImpl.cs:line 107
at test.API.StatusService.OnGet(Status request)
at ServiceStack.ServiceInterface.RestServiceBase`1.Get(TRequest request)
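I noticed that it complains about a timeout. We have tried different timeout lengths, and that does not seem to help much. Looking at the connections to the database, nothing in the consumer appears to actually hold it up, so I am not sure whether the timeout is caused by the repeated failures on send or whether it is simply a connection problem. Normally the response arrives in under a second.
The request is: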
    this.DataBus.PublishRequest(new StatusMessage(request, RequestType.GET), x =>
    {
        x.Handle<StatusResponseMessage>(message =>
        {
            if (message.Exception != null)
            {
                Response.Message = message.Exception.Message;
            }
            Response.Node = message.Response.Node;
            Response.Status = message.Response.Status;
        });
        x.SetTimeout(5.Seconds());
    });
The consumer is:
    public void Consume(IConsumeContext<StatusMessage> context)
    {
        try
        {
            // Get status from database (not included)
            context.Respond(new StatusResponseMessage(context.Message.CorrelationId, new StatusResponsePacket(context.Message.Request.Node, status, lastUpdated)));
        }
        catch (Exception e)
        {
            context.Respond(new StatusResponseMessage(context.Message.CorrelationId, e));
        }
    }
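Can anyone see anything obviously wrong? I have checked that the API and the service are reading from different queues (the errors end up in the test_web_error queue), and that the correlation ID and CorrelatedBy are set on the status request and response packets. As mentioned above, this only happens with this call and not with our other call. I also have multiple servers running this, and it seems to happen on only one server at a time, though not always the same one.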
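For context, the correlation setup mentioned above has roughly the following shape. This is a simplified sketch, not the actual classes: the `Status` request type, the `RequestType` enum, the fields of `StatusResponsePacket`, and all property layouts are assumptions made for illustration. Only the `CorrelatedBy<Guid>` interface comes from MassTransit.

```csharp
using System;
using MassTransit;

// Hypothetical stand-ins for types that are not shown in the post.
public enum RequestType { GET }

public class Status
{
    public string Node { get; set; }
}

public class StatusResponsePacket
{
    public string Node { get; set; }
    public string Status { get; set; }
}

// Request message. CorrelatedBy<Guid> is MassTransit's interface for
// messages that carry a correlation ID.
public class StatusMessage : CorrelatedBy<Guid>
{
    public StatusMessage() { }

    public StatusMessage(Status request, RequestType type)
    {
        CorrelationId = Guid.NewGuid();
        Request = request;
        Type = type;
    }

    public Guid CorrelationId { get; set; }
    public Status Request { get; set; }
    public RequestType Type { get; set; }
}

// Response message. It copies the CorrelationId of the request it answers.
public class StatusResponseMessage : CorrelatedBy<Guid>
{
    public StatusResponseMessage() { }

    public StatusResponseMessage(Guid correlationId, StatusResponsePacket response)
    {
        CorrelationId = correlationId;
        Response = response;
    }

    public Guid CorrelationId { get; set; }
    public StatusResponsePacket Response { get; set; }
}
```

With this shape, the consumer passes `context.Message.CorrelationId` into the response constructor, so the request and its response share the same ID.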