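I am trying to insert 10,000 records into Azure Table storage. I am using ExecuteAsync() to do it, but somehow only about 7,500 records are inserted and the rest are lost. I am deliberately not using the await keyword because I don't want to wait for the result; I just want to store the records in the table. Here is my code snippet.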
private static async void ConfigureAzureStorageTable()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    TableResult result = new TableResult();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();
    for (int i = 0; i < 10000; i++)
    {
        var verifyVariableEntityObject = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
        try
        {
            table.ExecuteAsync(insertOperation);
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
Is there anything wrong with how I am using the method?
Answer 0 (score: 4)
You still want to await table.ExecuteAsync(). That means ConfigureAzureStorageTable() returns control to the caller at that point, and the caller can continue executing.
The way you have it in the question, ConfigureAzureStorageTable() runs straight past the call to table.ExecuteAsync() and exits, so things like table go out of scope while the table.ExecuteAsync() tasks are still incomplete.
There are also plenty of warnings on SO and elsewhere about using async void that you need to consider. You could just as easily make your method async Task and not await it in the caller yet, but keep the returned Task around so you can terminate cleanly, and so on.
Edit: one addition. You almost certainly want to use ConfigureAwait(false) on the await, since you don't appear to need to preserve any context. This blog post has some guidance on that and on async in general.
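To illustrate the point without depending on the storage SDK, here is a minimal sketch. FakeInsertAsync is a hypothetical stand-in for table.ExecuteAsync(insertOperation) (just a Task.Delay plus a counter), and the names AwaitAllSketch and InsertAllAsync are assumptions made for this example. The idea is to keep every returned Task and await Task.WhenAll before the method returns, instead of dropping the tasks on the floor.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitAllSketch
{
    private static int completed;

    // Hypothetical stand-in for table.ExecuteAsync(insertOperation).
    private static async Task FakeInsertAsync(int i)
    {
        await Task.Delay(10).ConfigureAwait(false);
        Interlocked.Increment(ref completed);
    }

    // Returns a Task (not void) so the caller can await completion
    // and observe any exception thrown by the inserts.
    public static async Task<int> InsertAllAsync(int count)
    {
        completed = 0;
        var pending = new List<Task>(count);
        for (int i = 0; i < count; i++)
        {
            // Start the operation and keep its Task instead of discarding it.
            pending.Add(FakeInsertAsync(i));
        }

        // Do not return until every started operation has finished.
        await Task.WhenAll(pending).ConfigureAwait(false);
        return completed;
    }

    public static async Task Main()
    {
        int done = await InsertAllAsync(100);
        Console.WriteLine("Completed inserts: " + done);
    }
}
```

With a real table you would probably also want to cap how many requests are in flight at once; that part is left out of this sketch.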
Answer 1 (score: 1)
How about using TableBatchOperation to run batches of N inserts at a time?
private const int BatchSize = 100;

private static async void ConfigureAzureStorageTable()
{
    CloudStorageAccount storageAccount =
        CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
    TableResult result = new TableResult();
    CloudTable table = tableClient.GetTableReference("test");
    table.CreateIfNotExists();
    var batchOperation = new TableBatchOperation();
    for (int i = 0; i < 10000; i++)
    {
        var verifyVariableEntityObject = new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            // Note: all entities in one batch must share the same PartitionKey,
            // so these per-entity keys would need to change before batching works.
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        };
        TableOperation insertOperation = TableOperation.Insert(verifyVariableEntityObject);
        batchOperation.Add(insertOperation);
        if (batchOperation.Count >= BatchSize)
        {
            try
            {
                await table.ExecuteBatchAsync(batchOperation);
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
            // Start a fresh batch even after a failure, so a failed batch
            // does not keep growing past BatchSize.
            batchOperation = new TableBatchOperation();
        }
    }
    // Flush whatever is left over after the loop.
    if (batchOperation.Count > 0)
    {
        try
        {
            await table.ExecuteBatchAsync(batchOperation);
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
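You can adjust BatchSize as needed. Small disclaimer: I haven't tried running this, though it should work.
But I can't help asking why your function is async void. That should be reserved for event handlers and similar cases where you don't get to decide the interface. In most cases you want to return a Task, because as it stands the caller cannot catch exceptions that occur in this function.
Answer 2 (score: 1)
Based on your requirements, I have successfully tested your scenario with both CloudTable.ExecuteAsync and CloudTable.ExecuteBatchAsync. Here is a code snippet that uses CloudTable.ExecuteBatchAsync to insert records into Azure Table storage; you can refer to it.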
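The remark about exceptions can be shown with a small self-contained sketch. FailingInsertAsync is a hypothetical stand-in for a storage call that fails, and the other names are likewise invented for this example. Because ConfigureAsync returns a Task, the exception reaches whoever awaits it; with async void there would be no Task for the caller to await or wrap in try/catch.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncTaskExceptionSketch
{
    // Hypothetical stand-in for an insert that fails.
    private static async Task FailingInsertAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("simulated storage failure");
    }

    // Returns Task, so the exception is stored on the Task and
    // rethrown when the caller awaits it.
    public static async Task ConfigureAsync()
    {
        await FailingInsertAsync().ConfigureAwait(false);
    }

    // The caller can observe the failure with an ordinary try/catch.
    public static async Task<string> RunAsync()
    {
        try
        {
            await ConfigureAsync().ConfigureAwait(false);
            return "no error";
        }
        catch (InvalidOperationException e)
        {
            return "caught: " + e.Message;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync());
    }
}
```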
Program.cs Main
class Program
{
    static void Main(string[] args)
    {
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        TableResult result = new TableResult();
        CloudTable table = tableClient.GetTableReference("test");
        table.CreateIfNotExists();

        // Generate records to be inserted into Azure Table Storage
        var entities = Enumerable.Range(1, 10000).Select(i => new VerifyVariableEntity()
        {
            ConsumerId = String.Format("{0}", i),
            Score = String.Format("{0}", i * 2 + 2),
            PartitionKey = String.Format("{0}", i),
            RowKey = String.Format("{0}", i * 2 + 2)
        });

        // Group records by PartitionKey and prepare for executing batch operations
        var batches = TableBatchHelper<VerifyVariableEntity>.GetBatches(entities);

        // Execute batch operations in parallel
        Parallel.ForEach(batches, new ParallelOptions()
        {
            MaxDegreeOfParallelism = 5
        }, (batchOperation) =>
        {
            try
            {
                table.ExecuteBatch(batchOperation);
                Console.WriteLine("Writing {0} records", batchOperation.Count);
            }
            catch (Exception ex)
            {
                Console.WriteLine("ExecuteBatch throw a exception:" + ex.Message);
            }
        });
        Console.WriteLine("Done!");
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}
TableBatchHelper.cs
public class TableBatchHelper<T> where T : ITableEntity
{
    const int batchMaxSize = 100;

    public static IEnumerable<TableBatchOperation> GetBatches(IEnumerable<T> items)
    {
        var list = new List<TableBatchOperation>();
        var partitionGroups = items.GroupBy(arg => arg.PartitionKey).ToArray();
        foreach (var group in partitionGroups)
        {
            T[] groupList = group.ToArray();
            int offSet = batchMaxSize;
            T[] entities = groupList.Take(offSet).ToArray();
            while (entities.Any())
            {
                var tableBatchOperation = new TableBatchOperation();
                foreach (var entity in entities)
                {
                    tableBatchOperation.Add(TableOperation.InsertOrReplace(entity));
                }
                list.Add(tableBatchOperation);
                entities = groupList.Skip(offSet).Take(batchMaxSize).ToArray();
                offSet += batchMaxSize;
            }
        }
        return list;
    }
}
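Note: as the official documentation says about inserting a batch of entities:
A single batch operation can include up to 100 entities.
All entities in a single batch operation must have the same partition key.
In summary, please check whether this works for you. You can also catch the detailed exception in your console application and capture the HTTP requests with Fiddler to see any HTTP error responses when inserting records into Azure Table storage.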
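To see what the two batch rules mean for the keys used in this thread, here is a minimal sketch that only counts batches and does not touch the storage SDK. Entities are modeled as plain (PartitionKey, RowKey) tuples, and the names BatchCountSketch, ToBatches and the shared key "consumers" are assumptions made for illustration. It groups by partition key and splits each group into chunks of at most 100.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchCountSketch
{
    private const int MaxBatchSize = 100;

    // Groups entities by partition key, then splits each group
    // into chunks of at most MaxBatchSize row keys.
    public static List<List<string>> ToBatches(
        IEnumerable<(string PartitionKey, string RowKey)> entities)
    {
        var batches = new List<List<string>>();
        foreach (var group in entities.GroupBy(e => e.PartitionKey))
        {
            var rowKeys = group.Select(e => e.RowKey).ToList();
            for (int offset = 0; offset < rowKeys.Count; offset += MaxBatchSize)
            {
                int size = Math.Min(MaxBatchSize, rowKeys.Count - offset);
                batches.Add(rowKeys.GetRange(offset, size));
            }
        }
        return batches;
    }

    public static void Main()
    {
        // Keys as in the question: every entity has its own PartitionKey.
        var distinct = Enumerable.Range(0, 10000)
            .Select(i => (i.ToString(), (i * 2 + 2).ToString()));

        // Hypothetical alternative: one shared PartitionKey, unique RowKeys.
        var shared = Enumerable.Range(0, 10000)
            .Select(i => ("consumers", i.ToString()));

        Console.WriteLine("Distinct partition keys: " + ToBatches(distinct).Count + " batches");
        Console.WriteLine("Shared partition key: " + ToBatches(shared).Count + " batches");
    }
}
```

With one partition key per entity, every batch holds a single entity, so batching only reduces the number of requests if the key design lets many entities share a partition.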
Answer 3 (score: 0)
async void is not good practice unless the method is an event handler.
https://msdn.microsoft.com/en-us/magazine/jj991977.aspx
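If you plan to insert many records into Azure Table storage, batch inserts are your best option.
Keep in mind that each batch is limited to 100 entities.
Answer 4 (score: 0)
I had the same problem and solved it by forcing ExecuteAsync to wait for the result: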
table.ExecuteAsync(insertOperation).GetAwaiter().GetResult()