MongoDB - Fast grouping/aggregation of a huge data set

Asked: 2014-06-03 15:15:57

Tags: mongodb mongodb-query aggregation-framework

The documents in my MongoDB collection look like this:

{
    "Property A" : X,
    "Property B" : ..,
    "Property C" : [
        {
            "Price" : "1",
            "SubPropertyB" : "x1",
        },
        {
            "Price" : "2",
            "SubPropertyB" : "x2"
        },
        {
            "Price" : "3",
            "SubPropertyB" : "x3",
        }  
    ]
},
{
    "Property A" : X,
    "Property B" : ..,
    "Property C" :  [
        {
            "Price" : "4",
            "SubPropertyB" : "x4",
        },
        {
            "Price" : "5",
            "SubPropertyB" : "x5",
        },
        {
            "Price" : "6",
            "SubPropertyB" : "x6",
        }   
    ]
},
{
    "Property A" : Y,
    "Property B" : ..,
    "Property C" : [
        {
            "Price" : "1",
            "SubPropertyB" : "y1",
        },
        {
            "Price" : "2",
            "SubPropertyB" : "y2",
        },
        {
            "Price" : "3",
            "SubPropertyB" : "y3",
        }   
    ]
},
{
    "Property A" : Y,
    "Property B" : ..,
    "Property C" : [
        {
            "Price" : "4",
            "SubPropertyB" : "y4",
        },
        {
            "Price" : "5",
            "SubPropertyB" : "y5",
        },
        {
            "Price" : "6",
            "SubPropertyB" : "y6",
        }   
    ]
}

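Now I want to group these documents by Property A. In my example that gives two groups of two documents each. After that, for each group I need the entry with the lowest price. For my example, I expect a result of two documents like this: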

{
    "Property A" : X,
    "Property B" : ..,
    "Property C" :    
    "Price" : "1",
    "SubPropertyB" : "x1",
},
{
    "Property A" : Y,
    "Property B" : ..,
    "Property C" :    
    "Price" : "1",
    "SubPropertyB" : "y1",
}

How can I achieve this with the aggregation framework in MongoDB (what would such a query look like) so that the search runs very fast?

1 answer:

Answer 0: (score: 0)

You do indeed want to use the aggregation framework to group and find the "best price":

db.collection.aggregate([
  // Match your documents first
  { "$match": { "PropertyB": "xyz" } },

  // Unwind the array to de-normalize as individual documents
  { "$unwind": "$PropertyC" },

  // Group by "PropertyA" with the max price
  { "$group": {
     "_id": "$PropertyA",
     "price": { "$max": "$PropertyC.Price" }
  }}

])
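The pipeline above returns only a price per group. If the whole array entry with the lowest price is wanted, as the question's expected output suggests, one possible variant is to sort the unwound entries by price and take the first one per group. This is a minimal sketch, not a tuned solution: it keeps the answer's placeholder names (`collection`, `PropertyA`, `PropertyB`, `PropertyC`, the `"xyz"` filter value) and assumes `Price` is stored as a number. The name `lowestPricePipeline` is made up for illustration.

```javascript
// Sketch: per "PropertyA" group, keep the array entry with the lowest price.
// Assumes "PropertyC.Price" is numeric; field names are the answer's placeholders.
const lowestPricePipeline = [
  // Filter first so that later stages see as few documents as possible
  { $match: { PropertyB: "xyz" } },

  // One output document per array entry
  { $unwind: "$PropertyC" },

  // Cheapest entries first
  { $sort: { "PropertyC.Price": 1 } },

  // $first picks the cheapest entry in each group because of the preceding sort
  { $group: {
      _id: "$PropertyA",
      PropertyB: { $first: "$PropertyB" },
      PropertyC: { $first: "$PropertyC" }
  }}
];

// Only runs inside the mongo shell, where `db` is defined
if (typeof db !== "undefined") {
  printjson(db.collection.aggregate(lowestPricePipeline).toArray());
}
```

Sorting every unwound entry can be expensive on a large collection, so the selectivity of the `$match` stage matters a great deal here.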

Naturally, the field to index is "PropertyB", since that is what the initial $match filters on.
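Creating that index could look roughly like the following. This is a sketch that assumes the answer's placeholder names (`collection`, `PropertyB`). `createIndex` is the helper in current shells, while shells from around the time of this answer used `ensureIndex` for the same purpose. The name `indexSpec` is made up for illustration.

```javascript
// Ascending single-field index on the field used by the initial $match
const indexSpec = { PropertyB: 1 };

// Only runs inside the mongo shell, where `db` is defined
if (typeof db !== "undefined") {
  db.collection.createIndex(indexSpec);
}
```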

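I am not sure what your definition of "best price" is, but use $max for the highest or $min for the lowest. Note that the values should be numeric, since strings are compared lexically and will probably give unexpected results.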
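To see why string prices are a problem, here is a small illustration in plain JavaScript with made-up values. JavaScript's own string comparison stands in for the database's string ordering here, which is an assumption that holds for simple digit strings like these.

```javascript
// Hypothetical prices stored as strings, as in the question's documents
const prices = ["10", "9", "2"];

// Strings are compared character by character, so "10" sorts before "2"
const lexicalMin = prices.reduce((a, b) => (b < a ? b : a));

// Converting to numbers first gives the intended ordering
const numericMin = Math.min(...prices.map(Number));

console.log(lexicalMin); // prints 10 (the string "10")
console.log(numericMin); // prints 2
```

Storing `Price` as a number in the first place avoids the need for any conversion at query time.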

See also the Aggregation Operator Reference