Sum array number values across multiple documents

Date: 2018-04-20 00:54:23

Tags: mongodb aggregation-framework

I have Mongo documents that each contain an ordered array of numbers (one value per day). I want to sum the values at each array position across multiple documents, grouped by a field outside the array.

{"_id" : "1",
 "group" : "A",
 "value_list" : [1,2,3,4,5,6,7]
},
{"_id" : "2",
 "group" : "B",
 "value_list" : [10,20,30,40,50,60,70]
},
{"_id" : "3",
 "group" : "A",
 "value_list" : [1,2,3,4,5,6,7]
},
{"_id" : "4",
 "group" : "B",
 "value_list" : [10,20,30,40,50,60,70]
}

The results I'm after are listed below.

There are two group A documents above, and at position 1 of the value_list array both documents have the value 1, so 1+1=2. At position 2 the value is 2 in both documents, so 2+2=4, etc.

There are two group B documents above, and at position 1 of the value_list array both documents have the value 10, so 10+10=20. At position 2 the value is 20 in both documents, so 20+20=40, etc.

{"_id" : "30",
 "group" : "A",
 "value_list" : [2,4,6,8,10,12,14]
},
{"_id" : "30",
 "group" : "B",
 "value_list" : [20,40,60,80,100,120,140]
}

How would I do this using Mongo Script? Thanks, Matt

1 answer:

Answer 0 (score: 1)

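Certainly the most "scalable" way is to use the includeArrayIndex option of $unwind to track each element's position, then $sum the "unwound" combinations, before putting the values back into array format: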

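db.getCollection('test').aggregate([
  // One document per array element, with its position stored in "index"
  { "$unwind": { "path": "$value_list", "includeArrayIndex": "index" } },
  // Sum the values that share the same group and array position
  { "$group": {
    "_id": {
      "group": "$group",
      "index": "$index"
    },
    "value_list": { "$sum": "$value_list" }
  }},
  // Order by group, then index, so the sums are pushed in position order
  { "$sort": { "_id": 1 } },
  // Rebuild one array per group
  { "$group": {
      "_id": "$_id.group",
      "value_list": { "$push": "$value_list" }
  }},
  { "$sort": { "_id": 1 } }
])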

Note that the $sort is needed after the first $group in order to maintain the array positions.
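To make the role of each stage concrete, here is a minimal plain-JavaScript sketch of what the pipeline computes. It does not talk to MongoDB; the function name `sumByGroupAndIndex` and the in-memory `docs` array are illustrative assumptions, with the sample data copied from the question.

```javascript
// Sample documents from the question, held in memory for illustration.
const docs = [
  { _id: "1", group: "A", value_list: [1, 2, 3, 4, 5, 6, 7] },
  { _id: "2", group: "B", value_list: [10, 20, 30, 40, 50, 60, 70] },
  { _id: "3", group: "A", value_list: [1, 2, 3, 4, 5, 6, 7] },
  { _id: "4", group: "B", value_list: [10, 20, 30, 40, 50, 60, 70] },
];

// Hypothetical helper that mimics the four pipeline stages in order.
function sumByGroupAndIndex(documents) {
  // Like $unwind with includeArrayIndex: one row per (group, index, value).
  const rows = documents.flatMap((doc) =>
    doc.value_list.map((value, index) => ({ group: doc.group, index, value }))
  );

  // Like the first $group: sum values sharing the same (group, index) key.
  const sums = new Map();
  for (const { group, index, value } of rows) {
    const key = JSON.stringify([group, index]);
    const entry = sums.get(key) ?? { group, index, total: 0 };
    entry.total += value;
    sums.set(key, entry);
  }

  // Like $sort: order by group, then by index, so positions line up again.
  const sorted = [...sums.values()].sort((a, b) =>
    a.group < b.group ? -1 : a.group > b.group ? 1 : a.index - b.index
  );

  // Like the second $group: push the totals back into one array per group.
  const result = new Map();
  for (const { group, total } of sorted) {
    if (!result.has(group)) result.set(group, []);
    result.get(group).push(total);
  }
  return [...result].map(([group, value_list]) => ({ _id: group, value_list }));
}

const unwindResult = sumByGroupAndIndex(docs);
console.log(JSON.stringify(unwindResult));
```

Without the sorting step, the totals would be pushed in whatever order the grouped rows happen to arrive, which is why the sketch sorts before rebuilding the arrays.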

If you can get away with it, you could alternatively apply all of the arrays to $reduce:

db.getCollection('test').aggregate([
  // Collect every array of a group into an "array of arrays"
  { "$group": {
    "_id": "$group",
    "value_list": { "$push": "$value_list" }
  }},
  { "$addFields": {
    "value_list": {
      "$reduce": {
        "input": "$value_list",
        "initialValue": [],
        "in": {
          "$map": {
            "input": {
              // Pair the current array with the running totals by position
              "$zip": {
                "inputs": ["$$this", "$$value"],
                "useLongestLength": true
              }
            },
            // Add each pair; $sum ignores the null padding
            "in": { "$sum": "$$this" }
          }
        }
      }
    }
  }},
  { "$sort": { "_id": 1 } }
])

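Essentially you create an "array of arrays" using the initial $push, which you then process with $reduce. The $zip does a "pairwise" assignment per element, and the pairs are then added together at each position during the $map using $sum.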
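The fold described above can be sketched in plain JavaScript to show how the running totals build up. This is only an approximation of the operators' behavior, not MongoDB code; the helper names `zipLongest`, `sumPair`, and `sumPositions` are made up for this illustration.

```javascript
// Pair up elements by position, padding the shorter array with null,
// similar in spirit to $zip with useLongestLength: true.
function zipLongest(a, b) {
  const length = Math.max(a.length, b.length);
  return Array.from({ length }, (_, i) => [a[i] ?? null, b[i] ?? null]);
}

// Add the numeric entries of a pair and skip anything else (such as null),
// which is how $sum treats non-numeric values.
function sumPair(pair) {
  return pair.reduce((total, v) => (typeof v === "number" ? total + v : total), 0);
}

// Fold an "array of arrays" into a single array of per-position totals.
function sumPositions(arrays) {
  return arrays.reduce(
    (value, current) => zipLongest(current, value).map(sumPair),
    [] // plays the same role as initialValue: []
  );
}

// The two group A arrays from the question.
const totalsA = sumPositions([
  [1, 2, 3, 4, 5, 6, 7],
  [1, 2, 3, 4, 5, 6, 7],
]);

// Arrays of different lengths still work because of the null padding.
const ragged = sumPositions([[1, 2, 3], [10]]);

console.log(totalsA, ragged);
```

Starting from an empty array means the first pass simply copies the first array, since every element is paired with padding that the sum skips.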

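While this is a little more efficient, it's not really practical for large data, because pushing all of a group's arrays into a single array during grouping, before you "reduce" it, could exceed the 16 MB BSON document size limit.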

Either method produces the same result:

/* 1 */
{
    "_id" : "A",
    "value_list" : [ 
        2.0, 
        4.0, 
        6.0, 
        8.0, 
        10.0, 
        12.0, 
        14.0
    ]
}

/* 2 */
{
    "_id" : "B",
    "value_list" : [ 
        20.0, 
        40.0, 
        60.0, 
        80.0, 
        100.0, 
        120.0, 
        140.0
    ]
}