Question

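I'm new to MongoDB, and so far I haven't been able to find an explanation for what I'm seeing.

I have a small dataset of about 200 documents. When I run the following query: db.tweets.find({user:22438186}) I get 9 for each of n / nscannedObjects / nscanned / nscannedObjectsAllPlans / nscannedAllPlans. The cursor is BtreeCursor user_1. All good.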

Introducing sort()

If I add a sort to the query: db.tweets.find({user:22438186}).sort({created_at:1}) then nscannedObjectsAllPlans / nscannedAllPlans increase to 30. Under the allPlans field I can see:

[
{
    "cursor" : "BtreeCursor user_1",
    "isMultiKey" : false,
    "n" : 9,
    "nscannedObjects" : 9,
    "nscanned" : 9,
    "scanAndOrder" : true,
    "indexOnly" : false,
    "nChunkSkips" : 0,
    "indexBounds" : {
        "user" : [ 
            [ 
                22438186, 
                22438186
            ]
        ]
    }
},
{
    "cursor" : "BtreeCursor created_at_1",
    "isMultiKey" : false,
    "n" : 2,
    "nscannedObjects" : 21,
    "nscanned" : 21,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nChunkSkips" : 0,
    "indexBounds" : {
        "created_at" : [ 
            [ 
                {
                    "$minElement" : 1
                }, 
                {
                    "$maxElement" : 1
                }
            ]
        ]
    }
}
]

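BtreeCursor created_at_1 scanned 21 documents and matched 2? I'm not sure what is going on here, because I thought sort() was applied to the documents returned by find(), which appear to be the 9 from the user_1 index. While writing this up, I gathered from the allPlans field that, for some reason, it is also using my created_at_1 index.

limit(>n) with sort() == duplicate cursor & combined document scans?

When I append limit(10) or higher, n remains 9, nscannedObjects / nscanned are both 18, and nscannedObjectsAllPlans / nscannedAllPlans now return 60. Why has everything except n doubled? The cursor is now QueryOptimizerCursor, and there is a clauses field in my explain(true) result whose two child objects are exactly the same. Is the same cursor being used twice, causing the duplication? Is this behaviour normal?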

{ "cursor" : "BtreeCursor user_1", "isMultiKey" : false, "n" : 9, "nscannedObjects" : 9, "nscanned" : 9, "scanAndOrder" : true, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "user" : [ [ 22438186, 22438186 ] ] } }

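I tried a few different limit values and noticed that with a limit of 9, nscannedObjects / nscanned both return to 9 and nscannedObjectsAllPlans / nscannedAllPlans drop to 29, decreasing by 1 each time the limit is reduced further.

Under clauses, however, the second child object differs from the one in the queries with a limit of 10 and higher. For some reason the cursor field now shows just BtreeCursor, with user_1 omitted, and all the n fields have a value of 0 instead of 9; apart from that, the rest of the object is the same. For all of these limit queries, the allPlans field lists the clauses field and another entry for BtreeCursor created_at_1 (which is used as the cursor for a query with a limit of 1).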

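Actual question

So what exactly is causing my documents to be scanned twice when limit() and sort() are both used with find()? The issue only seems to occur when the limit exceeds nscannedObjects or nscanned. When querying with only limit() or only sort(), the documents are not scanned twice.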

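Update

Sorry for the confusion: the first code block shows the cursor data under the allPlans field. The actual cursor used was BtreeCursor user_1.

The second code block is from a query with limit() and sort(). I am showing the cursor data listed under clauses; the clauses field lists the same cursor information twice (duplicated). The actual cursor field for that query was QueryOptimizerCursor. The duplicated cursors under clauses are BtreeCursor user_1.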

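I have since added a compound index {user: 1, created_at: 1}; the n fields come out as 9 and nAllPlans as 18, regardless of the {{1}} value or the use of limit(). For some reason, under allPlans, my original user_id_1 index is still being run alongside the new compound index. If a limit is applied to the query, then instead of the user_id_1 index / BtreeCursor user_1, a QueryOptimizerCursor with two cursors in clauses is in use.

I have been looking into this further, and it seems the query planner runs the other indexes in parallel and picks the best result? I'm not sure whether this "race" happens again every time I execute the query or whether the outcome is cached.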

Running the following sort() query without the compound index produces the results below:

db.tweets.find({user:22438186}).sort({created_at:1}).limit(10)

With the compound index:

{ "clauses" : [ { "cursor" : "BtreeCursor user_1", "isMultiKey" : false, "n" : 9, "nscannedObjects" : 9, "nscanned" : 9, "scanAndOrder" : true, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "user" : [ [ 22438186, 22438186 ] ] } }, { "cursor" : "BtreeCursor user_1", "isMultiKey" : false, "n" : 9, "nscannedObjects" : 9, "nscanned" : 9, "scanAndOrder" : true, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "user" : [ [ 22438186, 22438186 ] ] } } ], "cursor" : "QueryOptimizerCursor", "n" : 9, "nscannedObjects" : 18, "nscanned" : 18, "nscannedObjectsAllPlans" : 60, "nscannedAllPlans" : 60, "scanAndOrder" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 0, "allPlans" : [ { "clauses" : [ { "cursor" : "BtreeCursor user_1", "isMultiKey" : false, "n" : 9, "nscannedObjects" : 9, "nscanned" : 9, "scanAndOrder" : true, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "user" : [ [ 22438186, 22438186 ] ] } }, { "cursor" : "BtreeCursor user_1", "isMultiKey" : false, "n" : 9, "nscannedObjects" : 9, "nscanned" : 9, "scanAndOrder" : true, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "user" : [ [ 22438186, 22438186 ] ] } } ], "cursor" : "QueryOptimizerCursor", "n" : 9, "nscannedObjects" : 18, "nscanned" : 18, "scanAndOrder" : false, "nChunkSkips" : 0 }, { "cursor" : "BtreeCursor created_at_1", "isMultiKey" : false, "n" : 3, "nscannedObjects" : 42, "nscanned" : 42, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "created_at" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] } } ], "server" : "HOME-PC:27017", "filterSet" : false, "stats" : { "type" : "KEEP_MUTATIONS", "works" : 43, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 32, "needFetch" : 0, "isEOF" : 1, "children" : [ { "type" : "OR", "works" : 42, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 32, "needFetch" : 0, "isEOF" : 1, "dupsTested" : 18, "dupsDropped" : 9, "locsForgotten" 
: 0, "matchTested_0" : 0, "matchTested_1" : 0, "children" : [ { "type" : "SORT", "works" : 21, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 10, "needFetch" : 0, "isEOF" : 1, "forcedFetches" : 0, "memUsage" : 6273, "memLimit" : 33554432, "children" : [ { "type" : "FETCH", "works" : 10, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "alreadyHasObj" : 0, "forcedFetches" : 0, "matchTested" : 0, "children" : [ { "type" : "IXSCAN", "works" : 10, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "keyPattern" : "{ user: 1 }", "isMultiKey" : 0, "boundsVerbose" : "field #0['user']: [22438186.0, 22438186.0]", "yieldMovedCursor" : 0, "dupsTested" : 0, "dupsDropped" : 0, "seenInvalidated" : 0, "matchTested" : 0, "keysExamined" : 9, "children" : [] } ] } ] }, { "type" : "SORT", "works" : 21, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 10, "needFetch" : 0, "isEOF" : 1, "forcedFetches" : 0, "memUsage" : 6273, "memLimit" : 33554432, "children" : [ { "type" : "FETCH", "works" : 10, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "alreadyHasObj" : 0, "forcedFetches" : 0, "matchTested" : 0, "children" : [ { "type" : "IXSCAN", "works" : 10, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 9, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "keyPattern" : "{ user: 1 }", "isMultiKey" : 0, "boundsVerbose" : "field #0['user']: [22438186.0, 22438186.0]", "yieldMovedCursor" : 0, "dupsTested" : 0, "dupsDropped" : 0, "seenInvalidated" : 0, "matchTested" : 0, "keysExamined" : 9, "children" : [] } ] } ] } ] } ] } }

Hopefully that clears up the confusion.

Answer 1

If you look at the explain() plan, you can see that:

db.tweets.find({user:22438186})

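uses the user_1 index.

db.tweets.find({user:22438186}).sort({created_at:1}) uses the created_at_1 index.

This indicates that MongoDB picked created_at_1 over user_1 because a sort performs better when it can use an index, and the sort here is on the created_at field. As a result, MongoDB ignores the user_1 index and performs a full collection scan.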
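
The trade-off between the two candidate plans can be illustrated with a small simulation in plain JavaScript. This is only a sketch and says nothing about MongoDB's internals: the `docs` array, its contents, and the function names `planUserIndex` and `planCreatedAtIndex` are all made up for illustration.

```javascript
// Hypothetical data: 20 tweets, every 4th one belongs to user 22438186.
const docs = [];
for (let i = 0; i < 20; i++) {
  docs.push({ _id: i, user: i % 4 === 0 ? 22438186 : 1000 + i, created_at: 100 - i });
}

// Plan A: find matches through an index on `user`, then sort them in memory
// (the situation reported as scanAndOrder: true in the explain output).
function planUserIndex(docs, user) {
  const matches = docs.filter(d => d.user === user); // stands in for an index lookup
  const scanned = matches.length;                    // only matching entries are touched
  matches.sort((a, b) => a.created_at - b.created_at);
  return { results: matches, scanned, scanAndOrder: true };
}

// Plan B: walk an index on `created_at` in order and filter every document.
// No separate sort is needed, but every entry has to be examined.
function planCreatedAtIndex(docs, user) {
  const ordered = [...docs].sort((a, b) => a.created_at - b.created_at); // the "index"
  const results = [];
  let scanned = 0;
  for (const d of ordered) {
    scanned++;
    if (d.user === user) results.push(d);
  }
  return { results, scanned, scanAndOrder: false };
}

const planA = planUserIndex(docs, 22438186);
const planB = planCreatedAtIndex(docs, 22438186);
console.log(planA.scanned, planB.scanned);
```

Both plans return the same documents in the same order; they differ only in how many entries they examine and in whether an in-memory sort is needed.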

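So we need to define indexes carefully in situations like this. If we have a compound index on both user and created_at, no full collection scan takes place: MongoDB will pick the index that supports both the find and the sort operation, which in this case is the compound index.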
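
Why a compound index helps can be sketched the same way. The snippet below models a {user: 1, created_at: 1} index as an array sorted by user and then by created_at; the data, the `compoundIndex` array and the `scanCompound` function are hypothetical and only meant to show the idea, not how MongoDB stores or traverses its B-trees.

```javascript
// Hypothetical data: 20 tweets, every 4th one belongs to user 22438186.
const tweets = [];
for (let i = 0; i < 20; i++) {
  tweets.push({ _id: i, user: i % 4 === 0 ? 22438186 : 1000 + i, created_at: 100 - i });
}

// Model the compound index as entries ordered by user, then by created_at.
const compoundIndex = tweets
  .map(d => ({ user: d.user, created_at: d.created_at, doc: d }))
  .sort((x, y) => x.user - y.user || x.created_at - y.created_at);

// Range scan: only entries for the requested user are counted, and they
// already come out in created_at order, so no in-memory sort is required.
function scanCompound(index, user, limit) {
  const results = [];
  let scanned = 0;
  for (const entry of index) {
    if (entry.user < user) continue; // a real B-tree would seek here instead of skipping
    if (entry.user > user) break;    // past the end of the range
    scanned++;
    results.push(entry.doc);
    if (results.length === limit) break; // a limit can stop the scan early
  }
  return { results, scanned };
}

const withLimit10 = scanCompound(compoundIndex, 22438186, 10);
const withLimit3 = scanCompound(compoundIndex, 22438186, 3);
console.log(withLimit10.scanned, withLimit3.scanned);
```

In this model the number of entries examined never exceeds the number of matching documents, and a smaller limit reduces it further.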

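JIRA has a very good explanation of why MongoDB uses the QueryOptimizerCursor cursor.

"nscannedObjectsAllPlans / nscannedAllPlans drop to 29"

You should not worry about these two parameters; they represent the combined scans made by all the plans MongoDB executed in order to choose the appropriate index.

nscannedObjectsAllPlans is a number that reflects the total number of documents scanned for all query plans during the database operation.

nscannedAllPlans is a number that reflects the total number of documents or index entries scanned during the database operation, across all query plans.

These lines are from the docs.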
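
The figures in the question are consistent with this definition. As a quick check, the snippet below adds up the nscanned values of the entries listed under allPlans; the numbers are copied from the explain output in the question, while the helper name `totalScanned` and the variable names are just illustrative.

```javascript
// nscanned values copied from the allPlans entries shown in the question.
const sortOnly = [
  { cursor: "BtreeCursor user_1", nscanned: 9 },
  { cursor: "BtreeCursor created_at_1", nscanned: 21 },
];
const sortWithLimit10 = [
  { cursor: "QueryOptimizerCursor", nscanned: 18 },
  { cursor: "BtreeCursor created_at_1", nscanned: 42 },
];

// Sum the work done by every candidate plan, not just the one that was used.
function totalScanned(allPlans) {
  return allPlans.reduce((sum, plan) => sum + plan.nscanned, 0);
}

console.log(totalScanned(sortOnly));        // 30, the reported nscannedAllPlans
console.log(totalScanned(sortWithLimit10)); // 60, the reported nscannedAllPlans
```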

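"So what exactly is causing my documents to be scanned twice when limit() and sort() are both used with find()?"

As stated above, the documents are not scanned twice; they are scanned in parallel by two different plans that MongoDB executes in order to choose the appropriate index. If you have two different indexes, two plans may be run in parallel, and so on.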
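
A very simplified model of such a "race" is sketched below. It assumes that candidate plans are advanced one step at a time until the first one finishes; the real plan selection logic in MongoDB is more involved and depends on the version, so `scanPlan`, `racePlans` and the entry counts are assumptions made purely for illustration.

```javascript
// Each plan is modelled as a generator that yields once per index entry it examines.
function* scanPlan(entriesToExamine) {
  for (let i = 0; i < entriesToExamine; i++) yield;
}

// Advance all candidate plans in turn until the first one runs to completion.
function racePlans(plans) {
  const state = plans.map(p => ({ name: p.name, it: scanPlan(p.entries), scanned: 0 }));
  for (;;) {
    for (const s of state) {
      if (s.it.next().done) {
        // The first plan to finish wins; the work of the losers is still counted.
        return { winner: s.name, scanned: state.map(x => [x.name, x.scanned]) };
      }
      s.scanned++;
    }
  }
}

// Hypothetical sizes: 9 matching entries via user_1, about 200 entries in created_at_1.
const outcome = racePlans([
  { name: "user_1", entries: 9 },
  { name: "created_at_1", entries: 200 },
]);
console.log(outcome.winner, outcome.scanned);
```

In this model each document is returned once by the winning plan, while the combined counters include the entries the losing plan examined before it was abandoned.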

MongoDB find() query scans documents twice (duplicate cursors used) when using limit() + sort()?
