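MongoDB document structure: map vs. array for element-wise aggregation

Time: 2020-06-09 17:31:25

Tags: mongodb aggregation-framework elementwise-operations

We want to store, in MongoDB, the ratings of metrics (e.g. sales, profit) for a category (e.g. city). Example rating scale: [RED, YELLOW, GREEN]; its length is fixed. We are considering the following two document structures: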

Structure 1: Ratings as an array
{
    "_id": 1,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : [1, 2, 3],  // frequency of RED, YELLOW, GREEN ratings, fixed length array
        "profit": [4, 5, 6],
    },
}
{
    "_id": 2,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : [1, 2, 3],  // frequency of RED, YELLOW, GREEN ratings, fixed length array
        "profit": [4, 5, 6],
    },
}

Structure 2: Ratings as a map
{
    "_id": 1,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : {             // map will always have "RED", "YELLOW", "GREEN" keys
            "RED": 1,
            "YELLOW": 2,
            "GREEN": 3
        },
        "profit" : {
            "RED":4,
            "YELLOW": 5,
            "GREEN": 6
        },
    },
}
{
    "_id": 2,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : {             // map will always have "RED", "YELLOW", "GREEN" keys
            "RED": 1,
            "YELLOW": 2,
            "GREEN": 3
        },
        "profit" : {
            "RED":4,
            "YELLOW": 5,
            "GREEN": 6
        },
    },
}

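Our use case:

  1. Aggregate ratings grouped by city and metric
  2. We do not plan to index the "ratings" field

So, for structure 1, to aggregate the ratings I need an element-wise sum, which seems likely to involve an unwind step or map-reduce. The final document would look like this: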

{
    "city": "X",
    "sales": [2, 4, 6],
    "profit": [8, 10, 12]
}
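
To make "element-wise" concrete, here is a minimal client-side sketch in plain JavaScript of the sum I am after. It is only a reference for the expected numbers, not a proposal to aggregate outside the database; the helper name `sumRatings` and the in-memory `sampleDocs` array are hypothetical.

```javascript
// Hypothetical helper: element-wise sum of the fixed-length rating arrays, per city.
function sumRatings(docs, metric) {
  const totals = {};
  for (const doc of docs) {
    const ratings = doc.ratings[metric];
    // Start from zeros the first time a city is seen.
    const current = totals[doc.city] || ratings.map(() => 0);
    totals[doc.city] = current.map((value, i) => value + ratings[i]);
  }
  return totals;
}

// In-memory copies of the two Structure 1 documents shown above.
const sampleDocs = [
  { _id: 1, city: "X", ratings: { sales: [1, 2, 3], profit: [4, 5, 6] } },
  { _id: 2, city: "X", ratings: { sales: [1, 2, 3], profit: [4, 5, 6] } },
];

console.log(sumRatings(sampleDocs, "sales"));  // sales totals for city X: 2, 4, 6
console.log(sumRatings(sampleDocs, "profit")); // profit totals for city X: 8, 10, 12
```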

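For structure 2, I think the aggregation would be relatively straightforward with the aggregation pipeline (e.g. aggregating only sales):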

db.getCollection('Collection').aggregate([
    {
        $group: {
            "_id": {"city": "$city" },
            "sales_RED": {$sum: "$ratings.sales.RED"},
            "sales_YELLOW": {$sum: "$ratings.sales.YELLOW"},
            "sales_GREEN": {$sum: "$ratings.sales.GREEN"}
       }
    },
    {
        $project: {"_id": 0, "city": "$_id.city", "sales": ["$sales_RED", "$sales_YELLOW", "$sales_GREEN"]}
    }
])

This gives the following result:

{
    "city": "X",
    "sales": [2, 4, 6]
}
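
Since the use case groups by city and metric, the per-field accumulators would have to be repeated for every metric. One possible way around typing them by hand is to generate the two stages; this is only a sketch, and the helper names `buildGroupStage` and `buildProjectStage` are hypothetical.

```javascript
// Hypothetical helper: one $sum accumulator per metric and rating key.
function buildGroupStage(metrics, ratingKeys) {
  const stage = { _id: { city: "$city" } };
  for (const metric of metrics) {
    for (const key of ratingKeys) {
      stage[`${metric}_${key}`] = { $sum: `$ratings.${metric}.${key}` };
    }
  }
  return { $group: stage };
}

// Hypothetical helper: collect the summed fields back into one array per metric.
function buildProjectStage(metrics, ratingKeys) {
  const stage = { _id: 0, city: "$_id.city" };
  for (const metric of metrics) {
    stage[metric] = ratingKeys.map((key) => `$${metric}_${key}`);
  }
  return { $project: stage };
}

const ratingKeys = ["RED", "YELLOW", "GREEN"];
const mapPipeline = [
  buildGroupStage(["sales", "profit"], ratingKeys),
  buildProjectStage(["sales", "profit"], ratingKeys),
];

// In the mongo shell: db.getCollection('Collection').aggregate(mapPipeline)
console.log(JSON.stringify(mapPipeline, null, 2));
```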

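Question: I am leaning towards the second structure, mainly because I am not clear on how to do element-wise array aggregation in MongoDB. From what I have seen, it would probably involve unwinding. The second document structure will have a larger document size because of the repeated field names for the ratings, but the aggregation itself is simple. Given our use case, could you point out how the two compare in terms of computational efficiency, and whether I am missing any points worth considering?

1 Answer:

Answer 0: (score: 1)

I was able to aggregate the array structure using $arrayElemAt. (However, this still involves specifying the aggregation for each individual array element, which is also the case for document structure 2.)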

db.getCollection('Collection').aggregate([
    {
        $group: {
            "_id": {"city": "$city" },
            "sales_RED": {$sum: { $arrayElemAt: [ "$ratings.sales", 0] }},
            "sales_YELLOW": {$sum: { $arrayElemAt: [ "$ratings.sales", 1] }},
            "sales_GREEN": {$sum: { $arrayElemAt: [ "$ratings.sales", 2] }},
       }
    },
    {
        $project: {"_id": 0, "city": "$_id.city", "sales": ["$sales_RED", "$sales_YELLOW", "$sales_GREEN"]}
    }
])
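
For comparison, the unwind route mentioned in the question might look roughly like the sketch below. I have not benchmarked it; the field names `position` and `total` are hypothetical, and it relies on the `$sort` stage to put the pushed values back in RED, YELLOW, GREEN order. Because `$unwind` emits one document per array element before grouping, I would expect it to do more work than the `$arrayElemAt` version, but that is an assumption worth measuring on real data.

```javascript
// Sketch only: sums "sales" per array position with $unwind instead of $arrayElemAt.
const unwindPipeline = [
  // One document per array element; "position" records the element's index.
  { $unwind: { path: "$ratings.sales", includeArrayIndex: "position" } },
  // Sum the values that share a city and an array position.
  {
    $group: {
      _id: { city: "$city", position: "$position" },
      total: { $sum: "$ratings.sales" },
    },
  },
  // Order by position so the array is rebuilt as RED, YELLOW, GREEN.
  { $sort: { "_id.position": 1 } },
  // Rebuild one array per city from the per-position totals.
  { $group: { _id: "$_id.city", sales: { $push: "$total" } } },
  { $project: { _id: 0, city: "$_id", sales: 1 } },
];

// In the mongo shell: db.getCollection('Collection').aggregate(unwindPipeline)
console.log(JSON.stringify(unwindPipeline, null, 2));
```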