ServiceFabric: Service does not exist during deployment

Time: 2018-10-23 23:41:54

Tags: error-handling publish azure-service-fabric

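I have an existing system that uses Service Fabric. Everything works fine, except that while a service is being published it is unavailable, and any attempt to resolve it returns an error.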

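This is expected, but it would be nice if calls made during that window simply waited or timed out. During a publish my error log sometimes fills up with 200K lines of the same error.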

I would like something like the code below, but where would it go?

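public async Task Execute(Func<Task> action)
{
    try
    {
        // Await the call so a failure is observed inside this try block.
        await action()
            .ConfigureAwait(false);
    }
    catch (FabricServiceNotFoundException ex)
    {
        // ?? = the delay I do not know how to choose
        await Task.Delay(TimeSpan.FromSeconds(??))
            .ConfigureAwait(false);

        // Single retry after the delay.
        await action()
            .ConfigureAwait(false);
    }
}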

Error:

System.Fabric.FabricServiceNotFoundException: Service does not exist. ---> System.Runtime.InteropServices.COMException: Exception from HRESULT: 0x80071BCD
   at System.Fabric.Interop.NativeClient.IFabricServiceManagementClient6.EndResolveServicePartition(IFabricAsyncOperationContext context)
   at System.Fabric.FabricClient.ServiceManagementClient.ResolveServicePartitionEndWrapper(IFabricAsyncOperationContext context)
   at System.Fabric.Interop.AsyncCallOutAdapter2`1.Finish(IFabricAsyncOperationContext context, Boolean expectedCompletedSynchronously)
   --- End of inner exception stack trace ---
   at Microsoft.ServiceFabric.Services.Client.ServicePartitionResolver.ResolveHelperAsync(Func`5 resolveFunc, ResolvedServicePartition previousRsp, TimeSpan resolveTimeout, TimeSpan maxRetryInterval, CancellationToken cancellationToken, Uri serviceUri)
   at Microsoft.ServiceFabric.Services.Communication.Client.CommunicationClientFactoryBase`1.CreateClientWithRetriesAsync(ResolvedServicePartition previousRsp, TargetReplicaSelector targetReplicaSelector, String listenerName, OperationRetrySettings retrySettings, Boolean doInitialResolve, CancellationToken cancellationToken)
   at Microsoft.ServiceFabric.Services.Communication.Client.CommunicationClientFactoryBase`1.GetClientAsync(ResolvedServicePartition previousRsp, TargetReplicaSelector targetReplica, String listenerName, OperationRetrySettings retrySettings, CancellationToken cancellationToken)
   at Microsoft.ServiceFabric.Services.Remoting.V2.FabricTransport.Client.FabricTransportServiceRemotingClientFactory.GetClientAsync(ResolvedServicePartition previousRsp, TargetReplicaSelector targetReplicaSelector, String listenerName, OperationRetrySettings retrySettings, CancellationToken cancellationToken)
   at Microsoft.ServiceFabric.Services.Communication.Client.ServicePartitionClient`1.GetCommunicationClientAsync(CancellationToken cancellationToken)
   at Microsoft.ServiceFabric.Services.Communication.Client.ServicePartitionClient`1.InvokeWithRetryAsync[TResult](Func`2 func, CancellationToken cancellationToken, Type[] doNotRetryExceptionTypes)
   at Microsoft.ServiceFabric.Services.Remoting.V2.Client.ServiceRemotingPartitionClient.InvokeAsync(IServiceRemotingRequestMessage remotingRequestMessage, String methodName, CancellationToken cancellationToken)
   at Microsoft.ServiceFabric.Services.Remoting.Builder.ProxyBase.InvokeAsyncV2(Int32 interfaceId, Int32 methodId, String methodName, IServiceRemotingRequestMessageBody requestMsgBodyValue, CancellationToken cancellationToken)
   at Microsoft.ServiceFabric.Services.Remoting.Builder.ProxyBase.ContinueWithResultV2[TRetval](Int32 interfaceId, Int32 methodId, Task`1 task)

1 Answer:

Answer 0: (score: 1)

As expected, Service Fabric has to shut the service down in order to start the new version, and this causes transient errors like the one you are seeing.

By default, the Remoting API already has retry logic built in. From the docs:

  

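The service proxy handles all failover exceptions for the service partition it is created for. It re-resolves the endpoints if there are failover exceptions (non-transient exceptions) and retries the call with the correct endpoint. The number of retries for failover exceptions is indefinite. If transient exceptions occur, the proxy retries the call.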

That said, you should not need to add extra retry logic. Perhaps you should try tuning OperationRetrySettings so that these retries are handled better.
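As a rough sketch of what tuning OperationRetrySettings might look like: the settings object is passed to a ServiceProxyFactory, and proxies created from that factory use it. The interface IMyService, the class ProxyConfig, the URI fabric:/MyApp/MyService and the numeric values are all hypothetical. Whether these settings change the behaviour for FabricServiceNotFoundException specifically is an assumption to verify against your SDK version.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Communication.Client; // OperationRetrySettings
using Microsoft.ServiceFabric.Services.Remoting;             // IService
using Microsoft.ServiceFabric.Services.Remoting.Client;      // ServiceProxyFactory

// Hypothetical remoting contract, used only for illustration.
public interface IMyService : IService
{
    Task<string> PingAsync();
}

public static class ProxyConfig
{
    public static IMyService CreateProxy()
    {
        // Illustrative values, not recommendations.
        var retrySettings = new OperationRetrySettings(
            TimeSpan.FromSeconds(5),   // max backoff between retries on transient errors
            TimeSpan.FromSeconds(10),  // max backoff between retries on non-transient errors
            20);                       // default max retry count

        // Proxies created by this factory use the retry settings above.
        var factory = new ServiceProxyFactory(retrySettings);

        // Hypothetical service URI.
        return factory.CreateServiceProxy<IMyService>(
            new Uri("fabric:/MyApp/MyService"));
    }
}
```

Creating the factory once and reusing it keeps the retry configuration in a single place.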

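If that does not solve the problem and you still want to add logic in your own code, the simplest way to handle it is with a transient-fault-handling library such as Polly, like this: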

   var policy = Policy
                 .Handle<FabricServiceNotFoundException>()
                 .WaitAndRetry(new[]
                 {
                   TimeSpan.FromSeconds(1),
                   TimeSpan.FromSeconds(2),
                   TimeSpan.FromSeconds(3)
                 });

   policy.Execute(() => DoSomething());

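In this example the wait between retries grows with each attempt (1, 2, then 3 seconds), which is a simple backoff. If the number of calls is too high, I would recommend a circuit breaker approach instead.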
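The circuit breaker suggestion could be sketched as follows, assuming Polly 7.x and an async call like the Execute method in the question. It combines an exponential backoff retry with a breaker. The class name ResilientCaller, the retry count and the durations are hypothetical.

```csharp
using System;
using System.Fabric;             // FabricServiceNotFoundException
using System.Threading.Tasks;
using Polly;                     // NuGet package "Polly" (7.x assumed)

// Hypothetical helper; all counts and durations are illustrative.
public static class ResilientCaller
{
    // Retry up to 5 times, waiting 1s, 2s, 4s, 8s, 16s (exponential backoff).
    private static readonly IAsyncPolicy Retry = Policy
        .Handle<FabricServiceNotFoundException>()
        .WaitAndRetryAsync(
            5,
            attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));

    // After 3 consecutive failures, open the circuit for 30 seconds.
    // While it is open, calls fail fast with BrokenCircuitException
    // instead of trying to resolve the service again.
    private static readonly IAsyncPolicy Breaker = Policy
        .Handle<FabricServiceNotFoundException>()
        .CircuitBreakerAsync(3, TimeSpan.FromSeconds(30));

    // Retry is the outer policy, the breaker is the inner one.
    private static readonly IAsyncPolicy Combined =
        Policy.WrapAsync(Retry, Breaker);

    public static Task ExecuteAsync(Func<Task> action) =>
        Combined.ExecuteAsync(action);
}
```

Usage would look like `await ResilientCaller.ExecuteAsync(() => proxy.DoSomethingAsync());`, where `proxy.DoSomethingAsync` stands in for your own remoting call. The retry policy does not handle BrokenCircuitException, so once the breaker opens the failure reaches the caller immediately. Catching that exception in one place and logging it once is one way to avoid the 200K repeated log lines.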