多键索引:查询中数组元素的顺序

时间:2015-05-06 09:24:05

标签: mongodb mongodb-query

我遇到了一些我无法理解的奇怪行为。 给定以下'schema'的文档集合:

{
 tag : ["t:someTag", "A", "B", "C"]
 msg : "some message"
 timestamp : ISODate(...)
 someIntField: 1
}

标记在以元素“t:something”开头的数组中,后跟任意数量的字符串标记。 收集统计数据:

db.perf_multikey.stats()
{
        "ns" : "test.perf_multikey",
        "count" : 36239306,
        "size" : 22124848112,
        "avgObjSize" : 610,
        "storageSize" : 24330923904,
        "numExtents" : 32,
        "nindexes" : 4,
        "lastExtentSize" : 2146426864,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 1,
        "totalIndexSize" : 17494579648,
        "indexSizes" : {
                "_id_" : 1177303120,
                "tag_1" : 12851094032,
                "timestamp_1" : 1800706768,
                "level_1" : 1665475728
        },
        "ok" : 1
}

我正在执行以下查询:

db.perf_multikey.find({tag: {$all:["t:a", "J"]}})

正如预期的那样,它会命中索引并返回几行:

db.perf_multikey.find({tag: {$all:["t:a", "J"]}}).explain()
{
        "cursor" : "BtreeCursor tag_1",
        "isMultiKey" : true,
        "n" : 6,
        "nscannedObjects" : 10,
        "nscanned" : 10,
        "nscannedObjectsAllPlans" : 10,
        "nscannedAllPlans" : 22,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 1,
        "nChunkSkips" : 0,
        "millis" : 7,
        "indexBounds" : {
                "tag" : [
                        [
                                "t:a",
                                "t:a"
                        ]
                ]
        },
        "server" : "somefancyserver:27017",
        "filterSet" : false
}

但是查询只是按标记数组

中的元素顺序不同
db.perf_multikey.find({tag: {$all:["J","t:a"]}})

似乎没有使用索引

db.perf_multikey.find({tag: {$all:["J","t:a"]}}).explain()
{
        "cursor" : "Complex Plan",
        "n" : 6,
        "nscannedObjects" : 0,
        "nscanned" : 7866684,
        "nscannedObjectsAllPlans" : 7827833,
        "nscannedAllPlans" : 15694517,
        "nYields" : 139716,
        "nChunkSkips" : 0,
        "millis" : 118102,
        "server" : "samefancyserver:27017",
        "filterSet" : false
}

我正在使用MongoDB 2.6.9 看到上述结果我很困惑MongoDB多键索引的工作原理。为什么使用数组的查询依赖于顺序?

修改

升级到MongoDB 3.0.2后,我重新生成了数据集(大小足以使索引不适合RAM)并重新运行测试。 不幸的是我仍然会遇到相同的结果(请注意 tag 字段是遵循某种'模式' - 数组的第一个元素是任意字符串,后跟标记的一些排列 - 来自有限的值的世界,比如说“A” - “J”)。

这些是我的结果:

快速闪电:

> db.perf_multikey.find({tag : {$all : ["a", "J"]}}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.perf_multikey",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "tag" : {
                                                "$eq" : "a"
                                        }
                                },
                                {
                                        "tag" : {
                                                "$eq" : "J"
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "KEEP_MUTATIONS",
                        "inputStage" : {
                                "stage" : "FETCH",
                                "filter" : {
                                        "tag" : {
                                                "$eq" : "J"
                                        }
                                },
                                "inputStage" : {
                                        "stage" : "IXSCAN",
                                        "keyPattern" : {
                                                "tag" : 1
                                        },
                                        "indexName" : "tag_1",
                                        "isMultiKey" : true,
                                        "direction" : "forward",
                                        "indexBounds" : {
                                                "tag" : [
                                                        "[\"a\", \"a\"]"
                                                ]
                                        }
                                }
                        }
                },
                "rejectedPlans" : [
                        {
                                "stage" : "FETCH",
                                "inputStage" : {
                                        "stage" : "KEEP_MUTATIONS",
                                        "inputStage" : {
                                                "stage" : "AND_SORTED",
                                                "inputStages" : [
                                                        {
                                                                "stage" : "IXSCAN",
                                                                "keyPattern" : {
                                                                        "tag" : 1
                                                                },
                                                                "indexName" : "tag_1",
                                                                "isMultiKey" : true,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "tag" : [
                                                                                "[\"a\", \"a\"]"
                                                                        ]
                                                                }
                                                        },
                                                        {
                                                                "stage" : "IXSCAN",
                                                                "keyPattern" : {
                                                                        "tag" : 1
                                                                },
                                                                "indexName" : "tag_1",
                                                                "isMultiKey" : true,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "tag" : [
                                                                                "[\"J\", \"J\"]"
                                                                        ]
                                                                }
                                                        }
                                                ]
                                        }
                                }
                        }
                ]
        },
        "serverInfo" : {
                "host" : "fancyhost",
                "port" : 27017,
                "version" : "3.0.2",
                "gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
        },
        "ok" : 1
}

慢一点:

> db.perf_multikey.find({tag : {$all : ["J", "a"]}}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.perf_multikey",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "tag" : {
                                                "$eq" : "J"
                                        }
                                },
                                {
                                        "tag" : {
                                                "$eq" : "a"
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                                "stage" : "KEEP_MUTATIONS",
                                "inputStage" : {
                                        "stage" : "AND_SORTED",
                                        "inputStages" : [
                                                {
                                                        "stage" : "IXSCAN",
                                                        "keyPattern" : {
                                                                "tag" : 1
                                                        },
                                                        "indexName" : "tag_1",
                                                        "isMultiKey" : true,
                                                        "direction" : "forward",
                                                        "indexBounds" : {
                                                                "tag" : [
                                                                        "[\"J\", \"J\"]"
                                                                ]
                                                        }
                                                },
                                                {
                                                        "stage" : "IXSCAN",
                                                        "keyPattern" : {
                                                                "tag" : 1
                                                        },
                                                        "indexName" : "tag_1",
                                                        "isMultiKey" : true,
                                                        "direction" : "forward",
                                                        "indexBounds" : {
                                                                "tag" : [
                                                                        "[\"a\", \"a\"]"
                                                                ]
                                                        }
                                                }
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [
                        {
                                "stage" : "KEEP_MUTATIONS",
                                "inputStage" : {
                                        "stage" : "FETCH",
                                        "filter" : {
                                                "tag" : {
                                                        "$eq" : "a"
                                                }
                                        },
                                        "inputStage" : {
                                                "stage" : "IXSCAN",
                                                "keyPattern" : {
                                                        "tag" : 1
                                                },
                                                "indexName" : "tag_1",
                                                "isMultiKey" : true,
                                                "direction" : "forward",
                                                "indexBounds" : {
                                                        "tag" : [
                                                                "[\"J\", \"J\"]"
                                                        ]
                                                }
                                        }
                                }
                        }
                ]
        },
        "serverInfo" : {
                "host" : "fancyhost",
                "port" : 27017,
                "version" : "3.0.2",
                "gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
        },
        "ok" : 1
}

我虽然http://docs.mongodb.org/manual/reference/operator/query/all/#performance可能是答案。

毕竟,通过[“随机字符串”,“A”]查询使用“随机字符串”将潜在结果集缩小到非常小的尺寸,从而易于扫描(?或进一步遍历)。 另一方面,通过[“A”,“随机字符串”]的查询应该很慢,因为“A”将返回大量进一步扫描...但查询[“A”,“随机不存在的字符串”]是闪电般快速......这让我感到困惑。

1 个答案:

答案 0 :(得分:0)

我强烈建议升级。我在2.6.1和3.0.0上测试了这个,我没有得到这种行为。

例如,这是2.6.1:

> db.t.find({tags:{$all:['t', 'tags']}}).explain()
{
        "cursor" : "BtreeCursor tags_1",
        "isMultiKey" : true,
        "n" : 1,
        "nscannedObjects" : 1,
        "nscanned" : 1,
        "nscannedObjectsAllPlans" : 1,
        "nscannedAllPlans" : 3,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 7,
        "indexBounds" : {
                "tags" : [
                        [
                                "t",
                                "t"
                        ]
                ]
        },
        "server" : "ubuntu:27017",
        "filterSet" : false
}
> db.t.find({tags:{$all:['t', 'tags']}})
{ "_id" : ObjectId("5549f186450548aed9ad4273"), "tags" : [ "tags", "t" ] }

即使首先存在不存在的价值:

> db.t.find({tags:{$all:['f', 'tags']}}).explain()
{
        "cursor" : "BtreeCursor tags_1",
        "isMultiKey" : true,
        "n" : 0,
        "nscannedObjects" : 0,
        "nscanned" : 0,
        "nscannedObjectsAllPlans" : 0,
        "nscannedAllPlans" : 0,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {
                "tags" : [
                        [
                                "f",
                                "f"
                        ]
                ]
        },
        "server" : "ubuntu:27017",
        "filterSet" : false
}

在3.0.0上:

> db.t.find({tags:{$all:['g','t']}}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.t",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "tags" : {
                                                "$eq" : "g"
                                        }
                                },
                                {
                                        "tags" : {
                                                "$eq" : "t"
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "KEEP_MUTATIONS",
                        "inputStage" : {
                                "stage" : "FETCH",
                                "filter" : {
                                        "tags" : {
                                                "$eq" : "t"
                                        }
                                },
                                "inputStage" : {
                                        "stage" : "IXSCAN",
                                        "keyPattern" : {
                                                "tags" : 1
                                        },
                                        "indexName" : "tags_1",
                                        "isMultiKey" : true,
                                        "direction" : "forward",
                                        "indexBounds" : {
                                                "tags" : [
                                                        "[\"g\", \"g\"]"
                                                ]
                                        }
                                }
                        }
                },
                "rejectedPlans" : [
                        {
                                "stage" : "FETCH",
                                "inputStage" : {
                                        "stage" : "KEEP_MUTATIONS",
                                        "inputStage" : {
                                                "stage" : "AND_SORTED",
                                                "inputStages" : [
                                                        {
                                                                "stage" : "IXSCAN",
                                                                "keyPattern" : {
                                                                        "tags" : 1
                                                                },
                                                                "indexName" : "tags_1",
                                                                "isMultiKey" : true,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "tags" : [
                                                                                "[\"g\", \"g\"]"
                                                                        ]
                                                                }
                                                        },
                                                        {
                                                                "stage" : "IXSCAN",
                                                                "keyPattern" : {
                                                                        "tags" : 1
                                                                },
                                                                "indexName" : "tags_1",
                                                                "isMultiKey" : true,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "tags" : [
                                                                                "[\"t\", \"t\"]"
                                                                        ]
                                                                }
                                                        }
                                                ]
                                        }
                                }
                        }
                ]
        },
        "serverInfo" : {
                "host" : "ip-172-30-0-35",
                "port" : 27017,
                "version" : "3.0.0",
                "gitVersion" : "a841fd6394365954886924a35076691b4d149168"
        },
        "ok" : 1
}

当然,我没有在2.6.9上测试,但我已经测试了越来越高的版本,我无法复制这种行为。