MongoDB count is very slow

Date: 2017-08-02 11:42:41

Tags: mongodb

I have a collection products containing ~7,000,000 books, about 40 GB in total, in a MongoDB 3.4 database. Here is an example of a book document:

{ 
    "_id" : ObjectId("597f17d22be7925d9a056e82"), 
    "ean13" : "9783891491904", 
    "price" : NumberInt(2100), 
    "name" : "My cool title", 
    "author_name" : "Doe, John", 
    "warengruppe" : "HC", 
    "book_category_key" : "728",
    "keywords": ["fairy tale", "magic", "fantasy"]
    ...
}

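When I query the database with a limit, the timing is fine. But when I count the results of the query (for pagination), it takes a very long time: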

2017-08-02T13:03:16.088+0200 I COMMAND [conn74] command mydb.products command: count { count: "products", query: { book_category_key: { $in: [ "120", "130", "180", "111", "112", "140", "150", "160", "170", "190", "1AA" ] } }, readConcern: {} } planSummary: IXSCAN { book_category_key: 1 } keysExamined:1129826 docsExamined:1129825 numYields:8851 reslen:44 locks:{ Global: { acquireCount: { r: 17704 } }, Database: { acquireCount: { r: 8852 } }, Collection: { acquireCount: { r: 8852 } } } protocol:op_query 7008ms

Here is the same query, formatted for readability:

{
    count: "products",
    query: {
        book_category_key: {
            $in: ["120",
            "130",
            "180",
            "111",
            "112",
            "140",
            "150",
            "160",
            "170",
            "190",
            "1AA"]
        }
    }
}

This takes 7 seconds, sometimes even longer (up to 20 seconds). I have an index on book_category_key:

{ 
    "v" : 2, 
    "name" : "book_category_key_1", 
    "ns" : "mydb.products", 
    "background" : true
}

1 answer:

Answer 0 (score: 3)

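The problem is the planSummary: IXSCAN. When count uses an IXSCAN, it also performs a FETCH, so every matching document is loaded just to be counted. Like this: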

"planSummary" : "IXSCAN { book_category_key: 1 }",
"execStats" : {
    "stage" : "COUNT",
    ..... 
    "inputStage" : {
        "stage" : "FETCH",
        ....
        "inputStage" : {
            "stage" : "IXSCAN",
            .....

In your case that loads about 1/7 of the entire collection.
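To check whether a given count has this problem, you can look for a FETCH stage in its plan. The following is only a minimal sketch: the helper names collectStages and fetchesDocuments are hypothetical, and it assumes a plan document shaped like the excerpt above, where each stage has a "stage" name and at most one nested "inputStage".

```javascript
// Hypothetical helper: walk a plan document and list its stage names,
// outermost first. Assumes each stage has a "stage" string and at most
// one nested "inputStage", as in the excerpt above.
function collectStages(plan) {
    const stages = [];
    for (let node = plan; node && typeof node === "object"; node = node.inputStage) {
        if (typeof node.stage === "string") {
            stages.push(node.stage);
        }
    }
    return stages;
}

// True when the plan has to load full documents (the slow case).
function fetchesDocuments(plan) {
    return collectStages(plan).includes("FETCH");
}

// Sample plans shaped like the excerpts in this answer (not real server output).
const slowPlan = {
    stage: "COUNT",
    inputStage: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
};
const fastPlan = {
    stage: "COUNT",
    inputStage: { stage: "COUNT_SCAN" },
};

console.log(collectStages(slowPlan).join(" <- "), fetchesDocuments(slowPlan));
console.log(collectStages(fastPlan).join(" <- "), fetchesDocuments(fastPlan));
```

The sample runs under Node.js; in the mongo shell, use print instead of console.log. You would pass in the stage tree from your own explain or profiler output instead of the sample objects.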

You can vote for https://jira.mongodb.org/browse/SERVER-17266 and related issues, and in the meantime use the suggested workaround to force a COUNT_SCAN:

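// Count each category with its own equality query so that every count
// can be answered from the index alone, then add the results up.
let cnt = 0;
for (const category of ["120", "130", "180", "111", "112", "140",
                        "150", "160", "170", "190", "1AA"]) {
    cnt += db.products.count({ book_category_key: category });
}
print(cnt);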

This produces the following plan:

"planSummary" : "COUNT_SCAN { book_category_key: 1 }",
"execStats" : {
    "stage" : "COUNT",
    ...
    "inputStage" : {
        "stage" : "COUNT_SCAN"
        ....

This plan is used for each category and should be about 10 times faster if the index fits in memory.
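If you need this in more than one place, the loop can be wrapped in a small helper. This is a sketch, not a drop-in replacement for $in: countByValues and fakeCollection are hypothetical names, and the sum only matches the $in count when the field holds a single value per document (with an array field, one document could be counted several times). The in-memory stand-in exists only so the helper can be tried without a server.

```javascript
// Hypothetical helper: sum equality counts over a list of values.
// `collection` is anything with a count(query) method, for example
// db.products in the mongo shell.
function countByValues(collection, field, values) {
    let total = 0;
    // Deduplicate first: $in ignores repeated values, a plain sum would not.
    for (const value of new Set(values)) {
        total += collection.count({ [field]: value });
    }
    return total;
}

// In-memory stand-in for a collection (equality matching only),
// used here just to exercise the helper.
function fakeCollection(docs) {
    return {
        count(query) {
            return docs.filter(doc =>
                Object.keys(query).every(key => doc[key] === query[key])
            ).length;
        },
    };
}

const sample = fakeCollection([
    { book_category_key: "120" },
    { book_category_key: "120" },
    { book_category_key: "1AA" },
    { book_category_key: "728" },
]);

const total = countByValues(sample, "book_category_key", ["120", "1AA", "120"]);
console.log(total);
```

In the shell this would be called as countByValues(db.products, "book_category_key", [...]). Each call still issues one count per value, so it pays off only while the list of categories stays short.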