Question

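I'm a little confused about how indexes work. I populated a database with documents that have the keys a, b and c, each holding a random value (except c, which holds an incrementing value).

Here is the Python code I used: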

from pymongo import MongoClient
from random import Random

r = Random()
client = MongoClient("server")
test_db = client.test
fubar_col = test_db.fubar

# a and b are random; c increases by one per document
for i in range(100000):
    doc = {'a': r.randint(10000, 99999), 'b': r.randint(100000, 999999), 'c': i}
    # insert() was removed in PyMongo 4; use insert_one() there
    fubar_col.insert(doc)

Then I created an index {c: 1}.

Now, if I execute

>db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}, {a: 1, c: 1}).sort({c: -1}).explain()

I get

{
        "cursor" : "BtreeCursor c_1 reverse",
        "isMultiKey" : false,
        "n" : 24668,
        "nscannedObjects" : 100000,
        "nscanned" : 100000,
        "nscannedObjectsAllPlans" : 100869,
        "nscannedAllPlans" : 100869,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 1,
        "nChunkSkips" : 0,
        "millis" : 478,
        "indexBounds" : {
                "c" : [[{"$maxElement" : 1}, {"$minElement" : 1}]]
        },
        "server" : "nuclight.org:27017"
}

As you can see, MongoDB uses the c_1 index and execution takes about 478 ms. If I specify which index I want to use (via hint({c: 1})):

> db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}, {a: 1, c: 1}).sort({c: -1}).hint({c:1}).explain()

it takes only about 167 ms. Why does this happen?

Here is a link to a JSON dump of the fubar collection: fubar.tgz

P.S. I ran these queries several times, and the results were the same.

Answer 1

explain() forces MongoDB to re-evaluate all query plans. A "normal" query uses the cached fastest query plan instead. From the documentation (emphasis mine):

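The explain() operation evaluates the set of query plans and reports on the winning plan for the query. In normal operations the query optimizer caches winning query plans and uses them for similar related queries in the future. As a result MongoDB may sometimes select query plans from the cache that are different from the plan displayed using explain().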

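Unless your typical query really needs to iterate over the entire result set, you probably want to include a limit() in the query. In your specific example, with limit(100), explain() reports a BasicCursor rather than the index when no hint is given: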

> db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}).sort({c: -1}).hint({c:1}).limit(100).explain();
{
        "cursor" : "BtreeCursor c_1 reverse",
        "n" : 100,
        "nscanned" : 432,
        "nscannedAllPlans" : 432,
        "scanAndOrder" : false,
        "millis" : 3,
        "indexBounds" : {
                "c" : [[{"$maxElement" : 1}, {"$minElement" : 1}]]
        },
}
>
> db.fubar.find({'a': {$lt: 50000}, 'b': {$gt: 500000}}).sort({c: -1}).limit(100).explain();
{
        "cursor" : "BasicCursor",
        "n" : 100,
        "nscanned" : 431,
        "nscannedAllPlans" : 863,
        "scanAndOrder" : true,
        "millis" : 12,
        "indexBounds" : { },
}

Note that this is a somewhat pathological case, because using the index doesn't help much (compare nscanned).
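To make the nscanned comparison concrete, here is a minimal pure-Python sketch of what walking the c index in reverse amounts to. It is a toy model, not MongoDB's executor: the helper name scan_until_limit and the fixed seed are assumptions made for illustration, and the data is regenerated locally with the same ranges as in the question.

```python
from random import Random


def scan_until_limit(docs_in_index_order, limit=None):
    """Walk documents in index order, apply the filter, stop after `limit` matches.

    Returns (matching documents, number of documents examined).
    """
    scanned = 0
    matched = []
    for doc in docs_in_index_order:
        scanned += 1
        # Same predicate as the query: a < 50000 and b > 500000
        if doc["a"] < 50000 and doc["b"] > 500000:
            matched.append(doc)
            if limit is not None and len(matched) == limit:
                break
    return matched, scanned


r = Random(42)  # fixed seed is an assumption, used only for repeatability
docs = [
    {"a": r.randint(10000, 99999), "b": r.randint(100000, 999999), "c": i}
    for i in range(100000)
]

# The list is already ordered by c, so reversed() stands in for a reverse
# walk over the c index: results come out sorted by c descending for free.
matched, scanned = scan_until_limit(reversed(docs), limit=100)
print("with limit(100):", len(matched), "returned,", scanned, "examined")

# Without a limit every document has to be examined, because the index
# on c says nothing about a or b.
all_matched, full_scanned = scan_until_limit(reversed(docs))
print("without limit:", len(all_matched), "returned,", full_scanned, "examined")
```

Roughly a quarter of the documents satisfy the filter, so the limited walk stops after a few hundred documents, while the unlimited walk has to touch all 100000. The index only supplies the sort order here; it does nothing to narrow the filter on a and b.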

Why does an explicit hint give better performance?
