Azure Table Storage ExecuteAsync() does not insert all the records

Date: 2016-12-05 17:33:06

Tags: c# azure async-await azure-table-storage

I am trying to insert 10,000 records into Azure Table Storage. I use ExecuteAsync() to do it, but somehow only about 7,500 records get inserted and the rest are lost. I deliberately do not use the await keyword because I don't want to wait for the results, I just want the records stored in the table. Here is my code snippet.

    private static async void ConfigureAzureStorageTable()
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        TableResult result = new TableResult();
        CloudTable table = tableClient.GetTableReference("test");
        table.CreateIfNotExists();

        for (int i = 0; i < 10000; i++)
        {
            var verifyVariableEntityObject = new VerifyVariableEntity()
            {
                ConsumerId = String.Format("{0}", i),
                Score = String.Format("{0}", i * 2 + 2),
                PartitionKey = String.Format("{0}", i),
                RowKey = String.Format("{0}", i * 2 + 2)
            };
            TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
            try
            {
                table.ExecuteAsync(insertOperation);
            }
            catch (Exception e)
            {

                Console.WriteLine(e.Message);
            }
        }
    }

Is there anything wrong with the way I am using it?

5 Answers:

Answer 0 (score: 4)

You still want to await table.ExecuteAsync(). That way, ExecuteAsync() returns control to the caller at that point, and the caller can continue executing.

The way the question has it, ConfigureAzureStorageTable() will keep looping, calling table.ExecuteAsync(), and then exit, so things like the CloudTable table go out of scope while the table.ExecuteAsync() tasks are still not complete.

There are plenty of caveats about using async void on SO and elsewhere that you also need to consider. You could easily make your method async Task instead, not await it in the caller, but keep the returned Task around so you can shut down cleanly, and so on.

Edit: one addition - you almost certainly want to use ConfigureAwait(false) on your await there, as you don't appear to need to preserve any context. This blog post has some guidance on that and on async in general.
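
For illustration only, here is a minimal sketch of that shape (my code, not the answerer's), reusing the VerifyVariableEntity type and connection setup from the question:

private static async Task ConfigureAzureStorageTableAsync()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();

    for (int i = 0; i < 10000; i++)
    {
        var entity = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        // Awaiting keeps this method alive until each insert has completed;
        // ConfigureAwait(false) skips capturing the synchronization context.
        await table.ExecuteAsync(TableOperation.Insert(entity)).ConfigureAwait(false);
    }
}

The caller can then hold on to the returned Task (for example Task work = ConfigureAzureStorageTableAsync();) and block on it with work.GetAwaiter().GetResult() before the process exits.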

Answer 1 (score: 1)

How about using a TableBatchOperation to run batches of N inserts at a time?

private const int BatchSize = 100;

private static async void ConfigureAzureStorageTable()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    TableResult result = new TableResult();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();

    var batchOperation = new TableBatchOperation();

    for (int i = 0; i < 10000; i++)
    {
        var verifyVariableEntityObject = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
        batchOperation.Add(insertOperation);

        if (batchOperation.Count >= BatchSize)
        {
            try
            {
                await table.ExecuteBatchAsync(batchOperation);
                batchOperation = new TableBatchOperation();
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }

    if(batchOperation.Count > 0)
    {
        try
        {
            await table.ExecuteBatchAsync(batchOperation);
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}

You can adjust BatchSize to your needs. Small disclaimer: I haven't tried to run this, though it should just work.

But I can't help asking: why is your function async void? That should be reserved for event handlers and similar cases where you don't get to decide the interface. In most cases you want to return a Task; as it stands, the caller has no way to catch exceptions that occur in this function.
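
To illustrate that last point (a sketch of mine, not part of the answer): if the method is changed to return Task, for example the variant sketched under answer 0, the caller can observe any insert failure:

static void Main(string[] args)
{
    try
    {
        // Block the console app until every insert has finished.
        Task work = ConfigureAzureStorageTableAsync();
        work.GetAwaiter().GetResult();
    }
    catch (Exception e)
    {
        // With async void, this exception would never reach the caller.
        Console.WriteLine(e.Message);
    }
}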

Answer 2 (score: 1)

Based on your requirement, I have tested your scenario and could successfully insert the records using both CloudTable.ExecuteAsync and CloudTable.ExecuteBatchAsync. Below is a code snippet for inserting records into Azure Table Storage with CloudTable.ExecuteBatchAsync that you can refer to.

Program.cs Main

class Program
{
    static void Main(string[] args)
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        TableResult result = new TableResult();
        CloudTable table = tableClient.GetTableReference("test");
        table.CreateIfNotExists();

        //Generate records to be inserted into Azure Table Storage
        var entities = Enumerable.Range(1, 10000).Select(i => new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        });

        //Group records by PartitionKey and prepare for executing batch operations
        var batches = TableBatchHelper<VerifyVariableEntity>.GetBatches(entities);

        //Execute batch operations in parallel
        Parallel.ForEach(batches, new ParallelOptions()
        {
            MaxDegreeOfParallelism = 5
        }, (batchOperation) =>
        {
            try
            {
                table.ExecuteBatch(batchOperation);
                Console.WriteLine("Writing {0} records", batchOperation.Count);
            }
            catch (Exception ex)
            {
                Console.WriteLine("ExecuteBatch throw a exception:" + ex.Message);
            }
        });
        Console.WriteLine("Done!");
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}

TableBatchHelper.cs

public class TableBatchHelper<T> where T : ITableEntity
{
    const int batchMaxSize = 100;

    public static IEnumerable<TableBatchOperation> GetBatches(IEnumerable<T> items)
    {
        var list = new List<TableBatchOperation>();
        var partitionGroups = items.GroupBy(arg => arg.PartitionKey).ToArray();
        foreach (var group in partitionGroups)
        {
            T[] groupList = group.ToArray();
            int offSet = batchMaxSize;
            T[] entities = groupList.Take(offSet).ToArray();
            while (entities.Any())
            {
                var tableBatchOperation = new TableBatchOperation();
                foreach (var entity in entities)
                {
                    tableBatchOperation.Add(TableOperation.InsertOrReplace(entity));
                }
                list.Add(tableBatchOperation);
                entities = groupList.Skip(offSet).Take(batchMaxSize).ToArray();
                offSet += batchMaxSize;
            }
        }
        return list;
    }
}

Note: As the official documentation mentions about inserting a batch of entities:

A single batch operation can include up to 100 entities.

All entities in a single batch operation must have the same partition key.

In summary, please try it and check whether it works for you. Additionally, you could catch the detailed exceptions in your console application and capture the HTTP traffic via Fiddler to see any failed HTTP requests while inserting records into Azure Table Storage.

Answer 3 (score: 0)

async void is not good practice unless it is an event handler.

https://msdn.microsoft.com/en-us/magazine/jj991977.aspx

If you plan to insert many records into Azure Table Storage, batch inserts are your best bet.

https://msdn.microsoft.com/en-us/library/azure/microsoft.windowsazure.storage.table.tablebatchoperation.aspx

Keep in mind that the limit is 100 table operations per batch.

Answer 4 (score: 0)

I ran into the same problem and solved it by forcing ExecuteAsync to wait for the result:

table.ExecuteAsync(insertOperation).GetAwaiter().GetResult()
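
Note that blocking inside the loop like this makes the inserts effectively synchronous. A variant (my sketch, not part of the answer) that keeps the calls concurrent is to collect the returned tasks and block only once on all of them, assuming the same table and entity setup as in the question:

var insertTasks = new List<Task>();
for (int i = 0; i < 10000; i++)
{
    var entity = new VerifyVariableEntity()
    {
        ConsumerId = String.Format("{0}", i),
        Score = String.Format("{0}", i * 2 + 2),
        PartitionKey = String.Format("{0}", i),
        RowKey = String.Format("{0}", i * 2 + 2)
    };
    // Start the insert without blocking and remember the Task.
    insertTasks.Add(table.ExecuteAsync(TableOperation.Insert(entity)));
}
// Block exactly once, after all inserts have been started.
Task.WhenAll(insertTasks).GetAwaiter().GetResult();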