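Transient fault exception during bulk insert in Google Cloud Spanner

Time: 2017-10-17 10:31:48

Tags: c# google-cloud-platform google-cloud-spanner

I'm currently testing Google Cloud Spanner as a replacement for MySQL in a project, because I expect the row count to grow into the hundreds of millions over time. The database needs to respond very quickly and return query results within a few seconds, so I thought I'd give Spanner a try.

I'm trying to bulk load sample data into my Spanner DB, but I keep getting this error: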

  

Unhandled Exception: System.AggregateException: One or more errors occurred. ---> Google.Cloud.Spanner.Data.SpannerException: The operation was aborted. ---> Grpc.Core.RpcException: Status(StatusCode=Aborted, Detail="Aborted due to transient fault")

I'm using a modified version of the code from here: https://github.com/GoogleCloudPlatform/dotnet-docs-samples/blob/master/spanner/api/Program.cs

Here is my modification of the code in InsertSampleData:

    public static object InsertSampleData(string projectId,
        string instanceId, string databaseId)
    {
        // I get about 100k rows here
        List<Data> data = get_data();

        // how many runs I need to make if I split the data by 100 rows
        int rows = 100;
        double cnt = (double)data.Count / rows;
        cnt = Math.Ceiling(cnt);

        // process the data part by part
        for (int i = 0; i < cnt; i++)
        {
            // returns part of the data based on offset and amount
            List<Data> data_part = get_part(data, i, rows);

            var response = InsertTradesAsync(
                projectId, instanceId, databaseId, data_part);
            s_logger.Info("Waiting for operation to complete...");
            response.Wait();
            s_logger.Info($"Operation status: {response.Status}");
        }

        return ExitCode.Success;
    }

InsertTradesAsync is the same as in the repo (apart from the parameter list, of course).

When I run the code, I always get the following error:

Unhandled Exception: System.AggregateException: One or more errors occurred. ---> Google.Cloud.Spanner.Data.SpannerException: The operation was aborted. ---> Grpc.Core.RpcException: Status(StatusCode=Aborted, Detail="Aborted due to transient fault")
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Google.Api.Gax.Grpc.ApiCallRetryExtensions.<>c__DisplayClass0_0`2.<<WithRetry>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.V1.Internal.ExecuteHelper.<WithSessionChecking>d__0`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.V1.TransactionPool.<RunFinalMethodAsync>d__9`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.Data.SpannerTransaction.<<CommitAsync>b__29_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.Data.ExecuteHelper.<WithErrorTranslationAndProfiling>d__2`1.MoveNext()
   --- End of inner exception stack trace ---
   at Google.Cloud.Spanner.Data.ExecuteHelper.<WithErrorTranslationAndProfiling>d__2`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.Data.EphemeralTransaction.<>c__DisplayClass2_0.<<ExecuteMutationsAsync>b__1>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.Data.ExecuteHelper.<WithErrorTranslationAndProfiling>d__2`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Google.Cloud.Spanner.Data.SpannerCommand.<ExecuteMutationsAsync>d__49.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at GoogleCloudSamples.Spanner.Program.<InsertTradesAsync>d__25.MoveNext() in c:\Users\user\Documents\Dev\spanner\api\Program.cs:line 1298
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at GoogleCloudSamples.Spanner.Program.InsertSampleData(String projectId, String instanceId, String databaseId) in c:\Users\user\Documents\Dev\spanner\api\Program.cs:line 1585
   at GoogleCloudSamples.Spanner.Program.<>c__DisplayClass45_0.<Main>b__2(InsertSampleDataOptions opts) in c:\Users\user\Documents\Dev\spanner\api\Program.cs:line 1932
   at CommandLine.ParserResultExtensions.MapResult[T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,TResult](ParserResult`1 result, Func`2 parsedFunc1, Func`2 parsedFunc2, Func`2 parsedFunc3, Func`2 parsedFunc4, Func`2 parsedFunc5, Func`2 parsedFunc6, Func`2 parsedFunc7, Func`2 parsedFunc8, Func`2 parsedFunc9, Func`2 parsedFunc10, Func`2 parsedFunc11, Func`2 parsedFunc12, Func`2 parsedFunc13, Func`2 parsedFunc14, Func`2 parsedFunc15, Func`2 parsedFunc16, Func`2 notParsedFunc)
   at GoogleCloudSamples.Spanner.Program.Main(String[] args) in c:\Users\user\Documents\Dev\spanner\api\Program.cs:line 1910

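I thought this might be related to limits and quotas (https://cloud.google.com/spanner/quotas), but the exception is thrown after a different number of rows has been inserted into my table each time (it seems random: sometimes it happens after 68 runs of 100 rows each, then after 28x100, 52x100, and so on). The table has 30 columns, the PK consists of 2 columns (no indexes), and I process the data in parts, so I don't think I'm hitting the limits.

If I set cmd.CommandTimeout to a very high number, I get more rows inserted (about 400x100). I assume the client library reuses the connection? But I couldn't find any information about that for the C# library. Even with more rows inserted, the error still occurs.

Any help is much appreciated.

Thanks!

1 answer:

Answer 0: (score: 0)

Google developer here. I have a few suggestions that may help you.

First, you should use a single transaction and add as many writes to it as you can. It looks like you're doing 100 at a time? You can do many more. You can see the limit on how many writes can be batched here (I believe it's 20,000):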

https://github.com/GoogleCloudPlatform/dotnet-docs-samples/blob/master/spanner/api/Program.cs#L1242
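As a rough illustration of picking a larger batch size, here is a minimal sketch that splits the data into bigger chunks. BatchHelper, MaxRowsPerCommit and Chunk are hypothetical names, not part of the sample or the Spanner library. It also assumes the 20,000 figure counts one mutation per written column value, which should be checked against the quotas page; under that assumption a 30-column table fits about 666 rows per commit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Spanner client library.
public static class BatchHelper
{
    // Assumption: the commit limit counts one mutation per written column
    // value, so rows per commit = limit / columns (integer division).
    public static int MaxRowsPerCommit(int mutationLimit, int columnsPerRow)
    {
        if (columnsPerRow <= 0)
            throw new ArgumentOutOfRangeException(nameof(columnsPerRow));
        return mutationLimit / columnsPerRow;
    }

    // Lazily splits a list into consecutive batches of at most batchSize items.
    // Because this is an iterator, the argument checks run on first enumeration.
    public static IEnumerable<List<T>> Chunk<T>(IReadOnlyList<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int offset = 0; offset < source.Count; offset += batchSize)
        {
            int count = Math.Min(batchSize, source.Count - offset);
            var batch = new List<T>(count);
            for (int i = 0; i < count; i++)
            {
                batch.Add(source[offset + i]);
            }
            yield return batch;
        }
    }
}
```

Each batch returned by Chunk could then be passed to InsertTradesAsync in place of the 100-row parts from get_part.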

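The second suggestion directly addresses your problem. Because of how Spanner works, you need to wrap your transactions in a retry, as in this example:

https://github.com/GoogleCloudPlatform/dotnet-docs-samples/blob/master/spanner/api/Program.cs#L1259

(Download the transient fault handling NuGet package here): https://www.nuget.org/packages/EnterpriseLibrary.TransientFaultHandling/

You need to do this because Spanner occasionally runs into deadlocks that force you to rerun your transaction in its entirety. We provide an extension method, "IsTransientSpannerFault", on Exception that makes it easier to build a retry policy, like this: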

    internal class CustomTransientErrorDetectionStrategy
        : ITransientErrorDetectionStrategy
    {
        public bool IsTransient(Exception ex) =>
            ex.IsTransientSpannerFault();
    }
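If you'd rather not pull in the NuGet package, the same idea can be sketched by hand. This is a minimal sketch and not the linked sample's code: RetryHelper and RunWithRetryAsync are hypothetical names, and the attempt count and delays are arbitrary defaults. The transient check is passed in as a delegate, so with the Spanner library you could supply ex => ex.IsTransientSpannerFault().

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Spanner client library.
public static class RetryHelper
{
    // Runs the operation, retrying with exponential backoff while the
    // exception is classified as transient and attempts remain.
    public static async Task RunWithRetryAsync(
        Func<Task> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 5,
        int initialDelayMs = 100)
    {
        int delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return; // success, stop retrying
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure: wait, then rerun the whole operation.
                await Task.Delay(delayMs);
                delayMs *= 2;
            }
            // Non-transient errors, or a failure on the last attempt,
            // do not match the filter and propagate to the caller.
        }
    }
}
```

The operation passed in should contain the entire transaction (for example the whole InsertTradesAsync call for one batch), since an aborted transaction has to be rerun from the start.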

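Hope this helps!

Edit: I just noticed that you're not awaiting the result of the async call to InsertTradesAsync. You'll probably want to at least do a Task.WaitAll at the end. Note that each batch of 100 writes is running in the background, possibly in parallel with previous batches. This has a good chance of increasing the likelihood of failures that require a retry.

If you're doing this to improve performance, the Spanner ADO.NET library opens multiple connections in a pool behind the scenes, equal to SpannerConnection.SpannerOptions.MaximumGrpcChannels. So you'll see gains up to a point. You can increase this value to tune performance.

Regards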
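To make the difference concrete, here is a small sketch of the two ways to wait for the batches. InsertBatchAsync is a hypothetical stand-in that only simulates work; it is not the sample's InsertTradesAsync. The second method uses Task.WhenAll, the awaitable counterpart of Task.WaitAll.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Hypothetical stand-in for InsertTradesAsync; it only simulates work
    // and reports how many rows it was given.
    public static async Task<int> InsertBatchAsync(IReadOnlyCollection<int> batch)
    {
        await Task.Delay(10);
        return batch.Count;
    }

    // Awaits each batch before starting the next, so only one
    // transaction is in flight at a time.
    public static async Task<int> InsertSequentiallyAsync(IEnumerable<List<int>> batches)
    {
        int total = 0;
        foreach (var batch in batches)
        {
            total += await InsertBatchAsync(batch);
        }
        return total;
    }

    // Starts every batch up front, then waits for all of them.
    // Faster, but the transactions overlap.
    public static async Task<int> InsertConcurrentlyAsync(IEnumerable<List<int>> batches)
    {
        Task<int>[] tasks = batches.Select(b => InsertBatchAsync(b)).ToArray();
        int[] counts = await Task.WhenAll(tasks);
        return counts.Sum();
    }
}
```

The sequential version is the safer starting point while you are chasing the aborts. The concurrent version trades that for throughput, so it is best combined with a retry around each batch.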