我有一个Azure存储表,它有3k +记录。
删除表中所有行的最有效方法是什么?
答案 0 :(得分:19)
对于3000条记录,最简单的方法是delete the table。但请注意,当您删除表时,它当时不会被删除,而是被放入某种队列中删除,并且实际上会在一段时间后删除。此时间取决于系统上的负载+表中实体的数量。在此期间,您将无法重新创建此表或使用此表。
如果您继续使用该表很重要,则唯一的其他选择是删除实体。要获得更快的删除,您可以使用Entity Batch Transactions
查看删除实体。但是要删除实体,您需要先获取实体。您可以通过仅获取实体的PartitionKey
和RowKey
属性而不是获取所有属性来加速提取过程,因为删除实体只需要这两个属性。
答案 1 :(得分:1)
我用这样的东西。我们按日期划分密钥,您的情况可能有所不同:
async Task Main()
{
var startDate = new DateTime(2011, 1, 1);
var endDate = new DateTime(2012, 1, 1);
var account = CloudStorageAccount.Parse("connString");
var client = account.CreateCloudTableClient();
var table = client.GetTableReference("TableName");
var dates = Enumerable.Range(0, Math.Abs((startDate.Month - endDate.Month) + 12 * (startDate.Year - endDate.Year)))
.Select(offset => startDate.AddMonths(offset))
.ToList();
foreach (var date in dates)
{
var key = $"{date.ToShortDateString()}";
var query = $"(PartitionKey eq '{key}')";
var rangeQuery = new TableQuery<TableEntity>().Where(query);
var result = table.ExecuteQuery<TableEntity>(rangeQuery);
$"Deleting data from {date.ToShortDateString()}, key {key}, has {result.Count()} records.".Dump();
var allTasks = result.Select(async r =>
{
try
{
await table.ExecuteAsync(TableOperation.Delete(r));
}
catch (Exception e) { $"{r.RowKey} - {e.ToString()}".Dump(); }
});
await Task.WhenAll(allTasks);
}
}
答案 2 :(得分:0)
这取决于数据的结构,但是如果您可以对所有记录进行查询,则可以将每个记录添加到TableBatchOperation
中并立即执行它们。
下面是一个示例,该示例仅将所有结果存储在同一分区键中,并改编自How to get started with Azure Table storage and Visual Studio connected services。
// query all rows
CloudTable peopleTable = tableClient.GetTableReference("myTableName");
var query = new TableQuery<MyTableEntity>();
var result = await remindersTable.ExecuteQuerySegmentedAsync(query, null);
// Create the batch operation.
TableBatchOperation batchDeleteOperation = new TableBatchOperation();
foreach (var row in result)
{
batchDeleteOperation.Delete(row);
}
// Execute the batch operation.
await remindersTable.ExecuteBatchAsync(batchDeleteOperation);
答案 3 :(得分:0)
我使用以下功能首先将所有分区键放入队列,然后循环浏览该键以批量删除100条所有行。
Queue queue = new Queue();
queue.Enqueue("PartitionKeyTodelete1");
queue.Enqueue("PartitionKeyTodelete2");
queue.Enqueue("PartitionKeyTodelete3");
while (queue.Count > 0)
{
string partitionToDelete = (string)queue.Dequeue();
TableQuery<TableEntity> deleteQuery = new TableQuery<TableEntity>()
.Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, partitionToDelete))
.Select(new string[] { "PartitionKey", "RowKey" });
TableContinuationToken continuationToken = null;
do
{
var tableQueryResult = await myTable.ExecuteQuerySegmentedAsync(deleteQuery, continuationToken);
continuationToken = tableQueryResult.ContinuationToken;
// Split into chunks of 100 for batching
List<List<TableEntity>> rowsChunked = tableQueryResult.Select((x, index) => new { Index = index, Value = x })
.Where(x => x.Value != null)
.GroupBy(x => x.Index / 100)
.Select(x => x.Select(v => v.Value).ToList())
.ToList();
// Delete each chunk of 100 in a batch
foreach (List<TableEntity> rows in rowsChunked)
{
TableBatchOperation tableBatchOperation = new TableBatchOperation();
rows.ForEach(x => tableBatchOperation.Add(TableOperation.Delete(x)));
await myTable.ExecuteBatchAsync(tableBatchOperation);
}
}
while (continuationToken != null);
}
答案 4 :(得分:0)
对于后来发现这个问题的人来说,接受的答案“刚刚删除了表”的问题在于,虽然它在存储模拟器中运行良好,但在生产中会随机失败。如果您的应用/服务需要定期重新生成表,那么您会发现由于冲突或删除仍在进行中而导致失败。
相反,我发现了删除分段查询中所有行的最快且最容易出错的 EF 友好方法。下面是我正在使用的一个简单的插入示例。传入您的客户端、表名和实现 ITableEntity 的类型。
private async Task DeleteAllRows<T>(string table, CloudTableClient client) where T: ITableEntity, new()
{
// query all rows
CloudTable tableref = client.GetTableReference(table);
var query = new TableQuery<T>();
TableContinuationToken token = null;
var result = await tableref.ExecuteQuerySegmentedAsync(query, token);
do
{
foreach (var row in result)
{
TableOperation.Delete(row);
}
} while (token != null);
}
示例用法:
table = client.GetTableReference("TodayPerformanceSnapshot");
created = await table.CreateIfNotExistsAsync();
if(!created)
{
// not created, table already existed, delete all content
await DeleteAllRows<TodayPerformanceContainer>("TodayPerformanceSnapshot", client);
log.Information("Azure Table:{Table} Purged", table);
}