MongoDB $group: average of arrays

Time: 2017-12-18 13:35:00

Tags: python arrays mongodb aggregation-framework pymongo

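I am trying to write a PyMongo aggregation that uses $group to average arrays element by element, and I cannot find any example that matches my problem.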

Sample data

{
    Subject: "Dave",
    Strength: [1,2,3,4]
},
{
    Subject: "Dave",
    Strength: [1,2,3,5]
},
{
    Subject: "Dave",
    Strength: [1,2,3,6]
},
{
    Subject: "Stuart",
    Strength: [4,5,6,7]
},
{
    Subject: "Stuart",
    Strength: [6,5,6,7]
},
{
    Subject: "Kevin",
    Strength: [1,2,3,4]
},
{
    Subject: "Kevin",
    Strength: [9,4,3,4]
}

Desired result

{
    Subject: "Dave",
    mean_strength: [1,2,3,5]
},
{
    Subject: "Stuart",
    mean_strength: [5,5,6,7]
},
{
    Subject: "Kevin",
    mean_strength: [5,3,3,4]
}

I tried the following approach, but MongoDB seems to interpret the arrays as null:

pipe = [{'$group': {'_id': 'Subject', 'mean_strength': {'$avg': '$Strength'}}}]
results = db.Walk.aggregate(pipeline=pipe)

Out: [{'_id': 'SubjectID', 'total': None}]

I have looked through the MongoDB documentation but could not find or understand a way to do this. Is there one?

3 answers:

Answer 0 (score: 3)

You can use $unwind with includeArrayIndex. As the name suggests, includeArrayIndex adds the array index to the output. This allows grouping by Subject and by array position within Strength. After the averages are calculated, the results need to be sorted so that the second $group, with $push, adds the values back in the right order. Finally, a $project stage includes and renames the relevant fields.

db.test.aggregate([
    { "$unwind": { "path": "$Strength", "includeArrayIndex": "rownum" } },
    { "$group": { "_id": { "Subject": "$Subject", "rownum": "$rownum" }, "mean_strength": { "$avg": "$Strength" } } },
    { "$sort": { "_id.Subject": 1, "_id.rownum": 1 } },
    { "$group": { "_id": "$_id.Subject", "mean_strength": { "$push": "$mean_strength" } } },
    { "$project": { "_id": 0, "Subject": "$_id", "mean_strength": 1 } }
])

For the test input, this returns:

{ "mean_strength" : [ 5, 5, 6, 7 ], "Subject" : "Stuart" }
{ "mean_strength" : [ 5, 3, 3, 4 ], "Subject" : "Kevin" }
{ "mean_strength" : [ 1, 2, 3, 5 ], "Subject" : "Dave" }
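To make the role of each stage easier to follow, here is a minimal pure-Python sketch of the same idea: unwind with an index, group by (Subject, index), average, sort, and push back per Subject. It runs without a MongoDB server. The function name mean_strength_by_subject and the variable names are illustrative assumptions, not part of the answer.

```python
from collections import defaultdict

# Sample documents from the question.
docs = [
    {"Subject": "Dave", "Strength": [1, 2, 3, 4]},
    {"Subject": "Dave", "Strength": [1, 2, 3, 5]},
    {"Subject": "Dave", "Strength": [1, 2, 3, 6]},
    {"Subject": "Stuart", "Strength": [4, 5, 6, 7]},
    {"Subject": "Stuart", "Strength": [6, 5, 6, 7]},
    {"Subject": "Kevin", "Strength": [1, 2, 3, 4]},
    {"Subject": "Kevin", "Strength": [9, 4, 3, 4]},
]


def mean_strength_by_subject(documents):
    """Hypothetical helper mirroring $unwind -> $group -> $sort -> $group."""
    # $unwind with includeArrayIndex, then the first $group:
    # collect every value under the key (Subject, rownum).
    buckets = defaultdict(list)
    for doc in documents:
        for rownum, value in enumerate(doc["Strength"]):
            buckets[(doc["Subject"], rownum)].append(value)

    # $avg per bucket; iterating the sorted keys plays the role of $sort,
    # so the append below ($push) restores the original array order.
    result = defaultdict(list)
    for subject, rownum in sorted(buckets):
        values = buckets[(subject, rownum)]
        result[subject].append(sum(values) / len(values))
    return dict(result)


means = mean_strength_by_subject(docs)
print(means)
```

Sorting before the final append is what keeps each mean at its original array position, which is the same reason the pipeline needs its $sort stage.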

Answer 1 (score: 1)

You can try the aggregation below.

For example, after the $group stage Dave has [[1,2,3,4], [1,2,3,5], [1,2,3,6]].

Here is the matrix for the reduce function:

Pass     Current Value (c)  Accumulated Value (b)       Next Value
First:   [1,2,3,5]          [[1],[2],[3],[4]]           [[1,1],[2,2],[3,3],[5,4]]
Second:  [1,2,3,6]          [[1,1],[2,2],[3,3],[5,4]]   [[1,1,1],[2,2,2],[3,3,3],[6,5,4]]

Map function: calculates the average of each per-index array produced by the reduce stage, giving [1,2,3,5].

[{"$group":{"_id":"$Subject","Strength":{"$push":"$Strength"}}}, //Push all arrays
 {"$project":{"mean_strength":{
   "$map":{//Calculate avg for each reduced indexed pairs.
     "input":{
       "$reduce":{
         "input":{"$slice":["$Strength",1,{"$subtract":[{"$size":"$Strength"},1]}]}, //Start from second array.
         "initialValue":{ //Initialize to the first array with all elements transformed to array of single values.
           "$map":{
             "input":{"$range":[0,{"$size":{"$arrayElemAt":["$Strength",0]}}]},
             "as":"a",
             "in":[{"$arrayElemAt":[{"$arrayElemAt":["$Strength",0]},"$$a"]}]
           }
         },
         "in":{
           "$let":{"vars":{"c":"$$this","b":"$$value"}, //Create variables for current and accumulated values
             "in":{"$map":{ //Creates map of same indexed values from each iteration 
                 "input":{"$range":[0,{"$size":"$$b"}]},
                 "as":"d",
                 "in":{
                   "$concatArrays":[ //Concat values at same index 
                     [{"$arrayElemAt":["$$c","$$d"]}], //current (a single value, so wrap it in an array)
                     {"$arrayElemAt":["$$b","$$d"]} //accumulated (already an array)
                  ]
                 }
               }
             }
           }
         }
       }
     },
    "as":"e",
    "in":{"$avg":"$$e"}
   }
 }}}
]
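
The reduce and map steps above can be sketched in plain Python, which may help when checking the matrix by hand. This is only an illustration under the assumption that every array for a subject has the same length; column_means is a hypothetical name, and functools.reduce stands in for $reduce.

```python
from functools import reduce


def column_means(arrays):
    """Hypothetical helper mirroring the $reduce + $map stages."""
    # initialValue: the first array with each element wrapped in its own list.
    initial = [[x] for x in arrays[0]]

    # $reduce over the remaining arrays: at each index, prepend the current
    # array's element to the list accumulated so far for that index.
    columns = reduce(
        lambda acc, cur: [[cur[i]] + acc[i] for i in range(len(acc))],
        arrays[1:],
        initial,
    )

    # $map: average each per-index list.
    return [sum(col) / len(col) for col in columns]


dave = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6]]
print(column_means(dave))
```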

Answer 2 (score: 0)

Based on the description in the question, try running the following aggregation query as a solution:

db.collection.aggregate(

  // Pipeline
  [
    // Stage 1
    {
      $unwind: { path: "$Strength", includeArrayIndex: "arrayIndex" }   
    },

    // Stage 2
    {
      $group: {
        _id:{Subject:'$Subject',arrayIndex:'$arrayIndex'},
        mean_strength:{$avg:'$Strength'}
      }
    },

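    // Stage 3: sort so that $push below keeps the original array order
    {
      $sort: { '_id.Subject': 1, '_id.arrayIndex': 1 }
    },

    // Stage 4
    {
      $group: {
        _id:{'Subject':'$_id.Subject'},
        mean_strength:{$push:'$mean_strength'}
      }
    },

    // Stage 5
    {
      $project: {
        Subject:'$_id.Subject',
        mean_strength:'$mean_strength',
        _id:0
      }
    }

  ]

);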
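
Since the question uses PyMongo, the same kind of pipeline can be written as a list of Python dicts. The sketch below only builds the pipeline and does not connect to a server. It includes a $sort stage before the second $group so that the pushed means keep their array order. Note the `$` prefix in "$Subject": without it, as in the question's `'_id': 'Subject'`, the value is a constant string and all documents fall into one group. The function name build_mean_strength_pipeline is an assumption made for illustration.

```python
def build_mean_strength_pipeline():
    """Hypothetical helper returning the aggregation pipeline as Python dicts."""
    return [
        # One output document per array element, tagged with its position.
        {"$unwind": {"path": "$Strength", "includeArrayIndex": "arrayIndex"}},
        # Average the values sharing the same Subject and array position.
        {"$group": {
            "_id": {"Subject": "$Subject", "arrayIndex": "$arrayIndex"},
            "mean_strength": {"$avg": "$Strength"},
        }},
        # Sort so that $push below sees the positions in ascending order.
        {"$sort": {"_id.Subject": 1, "_id.arrayIndex": 1}},
        # Rebuild one array of means per Subject.
        {"$group": {
            "_id": "$_id.Subject",
            "mean_strength": {"$push": "$mean_strength"},
        }},
        # Rename _id to Subject and drop _id.
        {"$project": {"_id": 0, "Subject": "$_id", "mean_strength": 1}},
    ]


pipeline = build_mean_strength_pipeline()

# With a live connection (assumption: `db` is a pymongo Database and the
# collection is named Walk, as in the question):
# results = list(db.Walk.aggregate(pipeline))
```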