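MongoDB query takes a long time to execute

Time: 2015-11-23 06:21:06

Tags: mongodb mongodb-query aggregation-framework spring-mongo

Below is my MongoDB 3.0 query. It takes a long time to execute (more than 4 seconds), even though the data set has only 4.3 million documents: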

db.getCollection('TestingCollection').aggregate([ 
    { $match: { 
        myDate: { $gte: new Date(949384052490) }, 
        $and: [ 
            { 
                myDate: { $lte: new Date(1448257684431) }, 
                $and: [ { myId: 10 } ] 
            }
        ], 
        type: { $ne: "Contractor" } 
    }}, 
    { $project: { 
        retailerName: 1,
        unitSold: 1, 
        year: { $year: [ "$myDate" ] },
        currency: 1, 
        totalSales: { $multiply: [ "$unitSold", "$itemPrice" ] } 
    }}, 
    { $group: { 
        _id: { 
            retailerName: "$retailerName", 
            year: "$year",      
            currency: "$currency" 
        }, 
        netSales: { $sum: "$revenue" }, 
        netUnitSold: { $sum: "$unitSold" }, 
        totalSales: { $sum:"$totalSales" } 
    }}
] )

Compound index fields:

(myDate : 1, retailerName:1, type:1, myId:1).

The same query with
type: { $eq: "Contractor" }

executes in a few milliseconds.

Please tell me what I am doing wrong.

1 answer:

Answer 0 (score: 2)

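Your "range selection" is specified incorrectly, and your use of $and is incorrect. In fact only the "last" argument is considered, so the query is just looking for everything "greater than" the date where myId equals 10, which of course is not correct.

Here is the correct query syntax for the $match: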

{ "$match": { 
    "myDate": { 
        "$gte": new Date(949384052490),
        "$lte": new Date(1448257684431)
    },
    "myId": 10,
    "type": { "$ne": "Contractor" }
}}

No $and is needed at all, because all MongoDB query arguments are already an AND condition.
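As a side note on why both bounds belong in a single myDate subdocument, here is a minimal sketch in plain JavaScript (runnable in Node.js without a database; the variable names `duplicated` and `combined` are only for illustration). If the two bounds were written as two separate myDate keys at the same level, the object literal itself would keep only the last key, so one bound would be lost before the query is even sent.

```javascript
// Two separate myDate keys at the same level: the later key silently
// overwrites the earlier one, so the $gte bound disappears.
const duplicated = {
  myDate: { $gte: new Date(949384052490) },
  myDate: { $lte: new Date(1448257684431) }
};

// Both bounds in one subdocument, with the other conditions alongside.
// Conditions at the same level are combined with an implicit AND.
const combined = {
  myDate: { $gte: new Date(949384052490), $lte: new Date(1448257684431) },
  myId: 10,
  type: { $ne: "Contractor" }
};

console.log(Object.keys(duplicated.myDate)); // [ '$lte' ]
console.log(Object.keys(combined.myDate));   // [ '$gte', '$lte' ]
```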

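You should also consider combining the $project and $group stages. When one directly follows the other, they can usually be merged into a single stage, and at the very least that is more efficient.

But of course most of the time is lost in the initial $match, which would select incorrect results anyway.

The optimal stage combining $project and $group: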

{ "$group": { 
    "_id": { 
        "retailerName": "$retailerName", 
        "year": { "$year": "$myDate" },      
        "currency": "$currency"
    }, 
    "netSales": { "$sum": "$revenue" }, 
    "netUnitSold": { "$sum": "$unitSold" }, 
    "totalSales": { "$sum": 
        { "$multiply": [ "$unitSold", "$itemPrice" ] }
    }
}}
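To make the arithmetic concrete, here is a small plain-JavaScript sketch that mimics what this $group stage computes. It is not how MongoDB executes the stage. The sample documents and the helper name `groupSales` are made up for illustration; only the field names come from the question.

```javascript
// Hypothetical sample documents: field names follow the question, values are invented.
const docs = [
  { retailerName: "A", currency: "USD", myDate: new Date("2014-03-01T00:00:00Z"), unitSold: 2, itemPrice: 10, revenue: 18 },
  { retailerName: "A", currency: "USD", myDate: new Date("2014-07-15T00:00:00Z"), unitSold: 3, itemPrice: 5, revenue: 14 },
  { retailerName: "A", currency: "USD", myDate: new Date("2015-01-10T00:00:00Z"), unitSold: 1, itemPrice: 7, revenue: 7 }
];

// Group by (retailerName, year, currency) and accumulate the three sums,
// computing unitSold * itemPrice inside the accumulator instead of in a
// separate projection step.
function groupSales(documents) {
  const groups = new Map();
  for (const d of documents) {
    // $year extracts the year in UTC, so getUTCFullYear is used here.
    const id = { retailerName: d.retailerName, year: d.myDate.getUTCFullYear(), currency: d.currency };
    const key = JSON.stringify(id);
    const g = groups.get(key) || { _id: id, netSales: 0, netUnitSold: 0, totalSales: 0 };
    g.netSales += d.revenue;
    g.netUnitSold += d.unitSold;
    g.totalSales += d.unitSold * d.itemPrice;
    groups.set(key, g);
  }
  return [...groups.values()];
}

const result = groupSales(docs);
console.log(result);
```

With these sample values the 2014 group gets totalSales = 2 * 10 + 3 * 5 = 35, and the 2015 group gets 7.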

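So the whole pipeline is now just $match followed by $group.

Using spring-mongo

If you are using spring-mongo, there are currently limitations in the operators it supports for $group when compound keys are combined with calculated values in accumulators, but you can work around them. As for the $and statement, that is really a syntax problem, not a fault of spring-mongo.

First, set up a custom class for the "group" stage in the aggregation pipeline: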
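Put together, the two stages above could be assembled as follows. This is only a sketch: the variable name `pipeline` is an assumption, and the snippet builds the pipeline as plain data so that it runs in Node.js without a database.

```javascript
// The corrected $match followed by the combined $group, as one pipeline array.
const pipeline = [
  { $match: {
      myDate: { $gte: new Date(949384052490), $lte: new Date(1448257684431) },
      myId: 10,
      type: { $ne: "Contractor" }
  }},
  { $group: {
      _id: {
        retailerName: "$retailerName",
        year: { $year: "$myDate" },
        currency: "$currency"
      },
      netSales: { $sum: "$revenue" },
      netUnitSold: { $sum: "$unitSold" },
      totalSales: { $sum: { $multiply: ["$unitSold", "$itemPrice"] } }
  }}
];

// In the mongo shell this array would be passed as:
//   db.getCollection('TestingCollection').aggregate(pipeline)
const stageNames = pipeline.map(stage => Object.keys(stage)[0]);
console.log(stageNames); // [ '$match', '$group' ]
```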

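import com.mongodb.DBObject;
import org.springframework.data.mongodb.core.aggregation.AggregationOperation;
import org.springframework.data.mongodb.core.aggregation.AggregationOperationContext;

// Wraps a raw DBObject so it can be used as a stage in a spring-mongo aggregation pipeline.
public class CustomGroupOperation implements AggregationOperation {
    private final DBObject operation;

    public CustomGroupOperation(DBObject operation) {
        this.operation = operation;
    }

    @Override
    public DBObject toDBObject(AggregationOperationContext context) {
        // Let the context map field names before the stage is serialized.
        return context.getMappedObject(operation);
    }
}

Then use that class to build the pipeline: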

    Aggregation aggregation = newAggregation(
        match(
                Criteria.where("myDate")
                        .gte(new Date(949384052490L))
                        .lte(new Date(1448257684431L))
                        .and("myId").is(10)
                        .and("type").ne("Contractor")
        ),
        new CustomGroupOperation(
            new BasicDBObject(
                "$group", new BasicDBObject(
                    "_id", new BasicDBObject(
                        "retailerName", "$retailerName"
                    ).append(
                        "year", new BasicDBObject("$year", "$myDate")
                    ).append(
                        "currency", "$currency"
                    )
                ).append(
                    "netSales", new BasicDBObject("$sum","$revenue")
                ).append(
                    "netUnitSold", new BasicDBObject("$sum","$unitSold")
                ).append(
                    "totalSales", new BasicDBObject(
                        "$sum", new BasicDBObject(
                            "$multiply", Arrays.asList("$unitSold", "$itemPrice")
                        )
                    )
                )
            )
        )
    );

This produces a serialized pipeline like this:

[ 
    { "$match" : { 
        "myDate" : { 
            "$gte" : { "$date" : "2000-02-01T05:47:32.490Z"}, 
            "$lte" : { "$date" : "2015-11-23T05:48:04.431Z"}
        }, 
        "myId" : 10, 
        "type" : { "$ne" : "Contractor"}
    }}, 
    { "$group": { 
        "_id" : { 
            "retailerName" : "$retailerName", 
            "year" : { "$year" : "$myDate"}, 
            "currency" : "$currency"
        }, 
        "netSales" : { "$sum" : "$revenue"}, 
        "netUnitSold" : { "$sum" : "$unitSold"}, 
        "totalSales" : { "$sum" : { "$multiply" : [ "$unitSold" , "$itemPrice"]}}
    }}
]

This is exactly the same as the example given above.
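The $date values in the serialized output are the same instants as the epoch-millisecond values used in the shell query. A quick way to check this in plain JavaScript (the variable names `lower` and `upper` are only for illustration):

```javascript
// Epoch milliseconds from the shell query, converted to ISO-8601 UTC strings.
const lower = new Date(949384052490);
const upper = new Date(1448257684431);

console.log(lower.toISOString()); // 2000-02-01T05:47:32.490Z
console.log(upper.toISOString()); // 2015-11-23T05:48:04.431Z
```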