How do I group in MongoDB and select the document with the max value within each group?

Asked: 2015-12-21 21:43:52

Tags: mongodb sorting max aggregation-framework

Here is my mongo collection 'sales':

{"title":"Foo", "hash": 17, "num_sold": 49, 
"place": "ABC"}

{"title":"Bar", "hash": 18, "num_sold": 55, 
"place": "CDF"}

{"title":"Baz", "hash": 17, "num_sold": 55,
"place": "JKN"}

{"title":"Spam", "hash": 17, "num_sold": 20,
"place": "ZSD"}

{"title":"Eggs", "hash": 18, "num_sold": 20, 
"place": "ZDF"}

I want to group by hash and return the document with the largest "num_sold" in each group. So as output I would like to see:

{"title":"Baz", "hash": 17, "num_sold": 55,
    "place": "JKN"}

 {"title":"Bar", "hash": 18, "num_sold": 55, 
    "place": "CDF"}

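I know the basics of the aggregation operators. Here is how I would group and get the maximum of num_sold, but I need the whole document that corresponds to the maximum, not just the value.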

db.getCollection('sales').aggregate([
{$group: {_id: "$hash", max_sold : {$max: '$num_sold'}}}
])

In SQL I would do this with a join, but how is it done in mongo? I have also read that in mongo, group and sort do not work well together.
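
For reference, the result being asked for can be stated in plain JavaScript, independent of MongoDB. This is only an in-memory sketch of the intended semantics, not a MongoDB query; the helper name `maxPerGroup` is hypothetical, and the tie-breaking rule (first document seen wins) is an assumption the question does not specify.

```javascript
// Sample documents copied from the question.
const sales = [
  { title: "Foo",  hash: 17, num_sold: 49, place: "ABC" },
  { title: "Bar",  hash: 18, num_sold: 55, place: "CDF" },
  { title: "Baz",  hash: 17, num_sold: 55, place: "JKN" },
  { title: "Spam", hash: 17, num_sold: 20, place: "ZSD" },
  { title: "Eggs", hash: 18, num_sold: 20, place: "ZDF" }
];

// Hypothetical helper: for each hash, keep the document with the largest num_sold.
// Assumption: on a tie, the first document encountered is kept.
function maxPerGroup(docs) {
  const best = new Map();
  for (const doc of docs) {
    const current = best.get(doc.hash);
    if (current === undefined || doc.num_sold > current.num_sold) {
      best.set(doc.hash, doc);
    }
  }
  return Array.from(best.values());
}

const result = maxPerGroup(sales);
console.log(result.map(d => d.title)); // [ 'Baz', 'Bar' ]
```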

2 answers:

Answer 0: (score: 3)

You can use the $redact stage to accomplish this. It avoids using $sort and then doing a second $group or an $unwind.

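  • $group by _id and get the maximum num_sold for each group as max_num_sold, accumulating all the documents of the group with the $push operator.
  • $redact into the sub-documents of each group, keeping only those whose num_sold equals the group's max_num_sold.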

Sample code:

db.getCollection('sales').aggregate([
{$group:{"_id":"$hash",
         "max_num_sold":{$max:"$num_sold"},
         "records":{$push:"$$ROOT"}}},
{$redact:{$cond:[{$eq:[{$ifNull:["$num_sold","$$ROOT.max_num_sold"]},
                       "$$ROOT.max_num_sold"]},
                "$$DESCEND","$$PRUNE"]}},
])

Test data:

db.getCollection('sales').insert([
{"title":"Foo","hash":17,"num_sold":49,"place":"ABC"},
{"title":"Bar","hash":18,"num_sold":55,"place":"CDF"},
{"title":"Baz","hash":17,"num_sold":55,"place":"JKN"},
{"title":"Spam","hash":17,"num_sold":20,"place":"ZSD"},
{"title":"Eggs","hash":18,"num_sold":20,"place":"ZDF"}
])

Test results:

{
        "_id" : 18,
        "max_num_sold" : 55,
        "records" : [
                {
                        "_id" : ObjectId("567874f2b506fc2193a22696"),
                        "title" : "Bar",
                        "hash" : 18,
                        "num_sold" : 55,
                        "place" : "CDF"
                }
        ]
}
{
        "_id" : 17,
        "max_num_sold" : 55,
        "records" : [
                {
                        "_id" : ObjectId("567874f2b506fc2193a22697"),
                        "title" : "Baz",
                        "hash" : 17,
                        "num_sold" : 55,
                        "place" : "JKN"
                }
        ]
}
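
The result above still wraps each matching document in a records array. If the flat shape from the question is wanted, two more stages could be appended. This is a sketch that assumes MongoDB 3.4 or later, where $replaceRoot is available; it does reintroduce an $unwind, but only over the already-pruned records. The variable name `flattenPipeline` is made up for illustration.

```javascript
// Sketch only: the $group and $redact stages from the answer,
// followed by two extra stages that flatten the output.
const flattenPipeline = [
  { $group: { _id: "$hash",
              max_num_sold: { $max: "$num_sold" },
              records: { $push: "$$ROOT" } } },
  { $redact: { $cond: [ { $eq: [ { $ifNull: ["$num_sold", "$$ROOT.max_num_sold"] },
                                 "$$ROOT.max_num_sold" ] },
                        "$$DESCEND", "$$PRUNE" ] } },
  // One output document per surviving record.
  { $unwind: "$records" },
  // Promote the embedded record to the top level (assumes MongoDB 3.4+).
  { $replaceRoot: { newRoot: "$records" } }
];

// In the mongo shell this would be run as:
//   db.getCollection('sales').aggregate(flattenPipeline)
```

If several documents in a group share the maximum, all of them survive the $redact stage and each comes out as its own document.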

Answer 1: (score: 0)

It looks like grouping in mongodb does not disturb the sort order, so something like this is possible:

mongodb, how to aggregate with group by and sort correctly

In particular, for the example above, we can get the result as follows:

db.getCollection('sales').aggregate([
{$sort: {"num_sold":-1}},
{$group:{"_id": "$hash",
         "max_num_sold" : {$first:"$num_sold"},
         "title":{$first: "$title"},
         "place":{$first:"$place"}
         }}
])

Here is the output:

{
    "result" : [ 
        {
            "_id" : 17.0000000000000000,
            "max_num_sold" : 55.0000000000000000,
            "title" : "Baz",
            "place" : "JKN"
        }, 
        {
            "_id" : 18.0000000000000000,
            "max_num_sold" : 55.0000000000000000,
            "title" : "Bar",
            "place" : "CDF"
        }
    ],
    "ok" : 1.0000000000000000
}
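
The pipeline above names each field by hand. A variant that returns the whole document could take $first of $$ROOT instead of listing fields one by one. This is a sketch rather than part of the original answer: the field name `doc` and the variable name `wholeDocPipeline` are made up, and the final $replaceRoot stage assumes MongoDB 3.4 or later (without it, the document simply stays under `doc`).

```javascript
// Sketch only: sort descending, then take the first whole document per hash.
const wholeDocPipeline = [
  { $sort: { num_sold: -1 } },
  // "$$ROOT" is the full input document, so no field has to be listed by hand.
  { $group: { _id: "$hash", doc: { $first: "$$ROOT" } } },
  // Optional, assumes MongoDB 3.4+: promote the stored document to the top level.
  { $replaceRoot: { newRoot: "$doc" } }
];

// In the mongo shell this would be run as:
//   db.getCollection('sales').aggregate(wholeDocPipeline)
```

Note that $first returns a single document per group, so if several documents share the maximum, only one of them is returned.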