Question

In CreateDocumentQuery I use MaxItemCount and then HasMoreResults and ExecuteNextAsync, as described in other posts.

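My problem is that sometimes, particularly after a large number of updates to DocumentDB, looping through every document gives random results, skipping up to half of the documents.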

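This only happens when I include a SQL query in the query setup, which I do because I only need a few fields/columns. If I let all fields come back, it works 100% of the time. But that is inefficient, because I am only exporting a few columns and there are nearly a million records.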

I need to use C# because this is a scheduled job that is linked to other C# modules.

Has anyone been able to loop through a large collection consistently using paging?

Code extract below, including the SQL. If I remove the SQL from the query there is no problem.

sql = "select d.field1, d.field2 from doc d";
var query = client.CreateDocumentQuery(
                "dbs/" + database.Id + "/colls/" + documentCollection.Id,
                sql,
                new FeedOptions { MaxItemCount = 1000 }
            ).AsDocumentQuery();

while (query.HasMoreResults)
{
    FeedResponse<Document> res;
    while (true)
    {
        try
        {
            res = await query.ExecuteNextAsync<Document>();
            break; // success!
        }
        catch (Exception ex)
        {
            if (ex.Message.IndexOf("request rate too large") > -1)
            {
                // DocumentDB is under pressure - wait a while and retry - this will resolve eventually
                System.Threading.Thread.Sleep(5000);
            }
            else
            {
                errorcount++;
                throw; // rethrow without resetting the stack trace
            }
        }
    }
    if (res.Any())
    {
        foreach (var liCurrent in res)
        {
            try
            {
                // Convert the Document to a CSV line item
                // DO THE FILE LINE CREATION HERE
                fileLineItem = "test";

                // Write the line to the file
                writer.WriteLine(fileLineItem);
            }
            catch (Exception ex)
            {
                errorcount++;
                throw; // rethrow without resetting the stack trace
            }
            totalrecords++;
        }
    }
}
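
A side note on the retry block above: matching on the exception message is fragile, since the wording and casing of the message can vary. Below is a minimal sketch of the same retry keyed on the status code instead. It assumes the .NET SDK's DocumentClientException, which exposes StatusCode and RetryAfter. The helper name ExecuteNextWithRetryAsync is made up for illustration and is not part of the SDK or the original code. It also needs C# 6 or later for the exception filter and the await inside catch.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

static class ThrottleRetrySketch
{
    // Hypothetical helper: fetch the next page of results, retrying only
    // when the service reports throttling (HTTP 429). Any other exception
    // propagates to the caller unchanged.
    public static async Task<FeedResponse<Document>> ExecuteNextWithRetryAsync(
        IDocumentQuery<dynamic> query)
    {
        while (true)
        {
            try
            {
                return await query.ExecuteNextAsync<Document>();
            }
            catch (DocumentClientException ex) when ((int?)ex.StatusCode == 429)
            {
                // Wait for the interval suggested by the service, then retry.
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}
```

With something like this, the inner while (true) loop would reduce to `res = await ThrottleRetrySketch.ExecuteNextWithRetryAsync(query);`. This only tidies up the throttling handling; it does not by itself address the skipped records.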

Answer 1

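The solution, as Amab pointed out, is to set the collection's indexing mode to Consistent. I had done this before, but since then I had dropped and recreated the collection. When you create a collection the default setting is Lazy, so you need to specify Consistent at creation time or change it afterwards.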
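
For anyone landing here, this is a rough sketch of both options with the .NET SDK. It assumes the setting in question is the collection's indexing mode (IndexingPolicy.IndexingMode), since Consistent and Lazy are values of that enum. The collection id "mycollection" and the two helper method names are placeholders made up for illustration; `client` and `database` stand for the same objects as in the question.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

static class IndexingModeSketch
{
    // Option 1: specify the indexing mode when the collection is created.
    // "mycollection" is a placeholder id, not a name from the original post.
    public static async Task<DocumentCollection> CreateConsistentAsync(
        DocumentClient client, Database database)
    {
        var collection = new DocumentCollection { Id = "mycollection" };
        collection.IndexingPolicy.IndexingMode = IndexingMode.Consistent;

        ResourceResponse<DocumentCollection> response =
            await client.CreateDocumentCollectionAsync("dbs/" + database.Id, collection);
        return response.Resource;
    }

    // Option 2: change the indexing mode of a collection that already exists.
    public static async Task<DocumentCollection> MakeConsistentAsync(
        DocumentClient client, DocumentCollection existing)
    {
        existing.IndexingPolicy.IndexingMode = IndexingMode.Consistent;

        ResourceResponse<DocumentCollection> response =
            await client.ReplaceDocumentCollectionAsync(existing);
        return response.Resource;
    }
}
```

If the collection is dropped and recreated by a script or job, the first option is probably the safer one, because the setting then cannot be lost silently the way it was here.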

Thanks, Amab.

Azure DocumentDB C# SDK: using MaxItemCount and skipping records
