Question

I am working with a database that uses the following document design:

{
    'email':    'a@b.com',
    'credentials': [{
        'type':     'password',
        'content':  'hashedpassword'
    }, {
        'type':     'oauth2',
        'content':  'token'
    }]
}

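I have created an index on {credentials.type: 1, credentials.content: 1}. The query planner picks it up correctly, but the query performs poorly on a collection of 50k documents.

Here is the log entry showing the query plan: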

[conn73] command database.users command: find {
    find: "users",
    filter: {
        credentials.type: "type",
        credentials.content: "content"
    },
    limit: 1,
    batchSize: 1,
    singleBatch: true
}
planSummary: IXSCAN {
    credentials.type: 1,
    credentials.content: 1
}
keysExamined:20860
docsExamined:18109
cursorExhausted:1
keyUpdates:0
writeConflicts:0
numYields:163
nreturned:1
reslen:455
locks:{
    Global: {
        acquireCount: {
            r: 328
        }
    },
    Database: {
        acquireCount: {
            r: 164
        }
    },
    Collection: {
        acquireCount: {
            r: 164
        }
    }
}
protocol:op_query
331ms

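I notice that keysExamined and docsExamined are both very high. I understand that MongoDB can index every value in the array to build this index, so why does it scan so many keys?

I do have highly concurrent access, but it is read-only.

Here is the explain output for the query: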

> db.users.find({'credentials.type': 'abc', 'credentials.content': 'def'}).explain()
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "net.users",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "credentials.type" : {
                        "$eq" : "abc"
                    }
                },
                {
                    "credentials.content" : {
                        "$eq" : "def"
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "credentials.content" : {
                    "$eq" : "def"
                }
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "credentials.type" : 1,
                    "credentials.content" : 1
                },
                "indexName" : "credentials.type_1_credentials.content_1",
                "isMultiKey" : true,
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "forward",
                "indexBounds" : {
                    "credentials.type" : [
                        "[\"abc\", \"abc\"]"
                    ],
                    "credentials.content" : [
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "localhost",
        "port" : 27017,
        "version" : "3.2.11",
        "gitVersion" : "009580ad490190ba33d1c6253ebd8d91808923e4"
    },
    "ok" : 1
}

I am running MongoDB v3.2.11. How do I optimize this query properly? Should I change the document design?

Answer 1

You could try splitting the credentials into separate documents.

For example:

{
    'email':    'a@b.com',
    'credentialType':     'password',
    'credentialContent':  'hashedpassword'
}

{
     'email':    'a@b.com',
     'credentialType':     'oauth2',
     'credentialContent':  'token'
}

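Then create an index on credentialType and credentialContent.

You will end up with more documents but a cleaner index. Your query should run faster because it no longer has to deal with an array of objects.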
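
The split described above can be sketched in plain JavaScript. This is only an illustration of the reshaping step, not a migration script: the function name splitCredentials is hypothetical, and the field names credentialType and credentialContent are taken from the example documents in this answer.

```javascript
// Minimal sketch: turn one user document with an embedded credentials array
// into one flat document per credential. splitCredentials is a hypothetical
// helper name, not part of any MongoDB API.
function splitCredentials(user) {
  return user.credentials.map((cred) => ({
    email: user.email,
    credentialType: cred.type,
    credentialContent: cred.content,
  }));
}

// The document from the question.
const user = {
  email: 'a@b.com',
  credentials: [
    { type: 'password', content: 'hashedpassword' },
    { type: 'oauth2', content: 'token' },
  ],
};

const flatDocs = splitCredentials(user);
console.log(JSON.stringify(flatDocs, null, 2));
```

Each resulting document has scalar fields only, so a compound index on credentialType and credentialContent would no longer be a multikey index.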

Answer 2

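Thanks to Sergiu Zaharie's hint, I was able to revisit the index problem.

It turns out that because the 'credentials.type' values are all similar while the 'credentials.content' values are all different, I should have put 'credentials.content' first in the compound index.

In other words, {credentials.content: 1, credentials.type: 1} is the answer.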
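
To illustrate why the field order matters, here is a rough simulation in plain JavaScript. It is a simplified model and not MongoDB's actual implementation: it assumes, as the explain output in the question shows, that only the leading index field gets tight bounds while the second field is scanned from MinKey to MaxKey. The data set (1000 users, each with one password and one oauth2 credential) and the helper names buildIndex and keysExamined are made up for the example.

```javascript
// Simplified model of a multikey compound index: one key per array element,
// stored as [leadingValue, secondValue, docId].
function buildIndex(docs, firstField, secondField) {
  const keys = [];
  docs.forEach((doc, docId) => {
    for (const cred of doc.credentials) {
      keys.push([cred[firstField], cred[secondField], docId]);
    }
  });
  return keys;
}

// Assumption: only the leading field is bounded, so every key whose leading
// value matches has to be examined.
function keysExamined(index, leadingValue) {
  return index.filter(([first]) => first === leadingValue).length;
}

// Hypothetical data: 1000 users, two credentials each.
const docs = [];
for (let i = 0; i < 1000; i++) {
  docs.push({
    email: `user${i}@example.com`,
    credentials: [
      { type: 'password', content: `hash${i}` },
      { type: 'oauth2', content: `token${i}` },
    ],
  });
}

const typeFirst = buildIndex(docs, 'type', 'content');
const contentFirst = buildIndex(docs, 'content', 'type');

const typeFirstCount = keysExamined(typeFirst, 'password');     // 1000
const contentFirstCount = keysExamined(contentFirst, 'hash42'); // 1
console.log(typeFirstCount, contentFirstCount);
```

With the low-cardinality field first, every key sharing that type is examined; with the near-unique content field first, the scan narrows to a single key in this model.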

Properly indexing array fields?