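I'm trying to do a PyMongo aggregation that uses $group to average arrays element by element, and I can't find any example that matches my problem.
Example data: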
{
Subject: "Dave",
Strength: [1,2,3,4]
},
{
Subject: "Dave",
Strength: [1,2,3,5]
},
{
Subject: "Dave",
Strength: [1,2,3,6]
},
{
Subject: "Stuart",
Strength: [4,5,6,7]
},
{
Subject: "Stuart",
Strength: [6,5,6,7]
},
{
Subject: "Kevin",
Strength: [1,2,3,4]
},
{
Subject: "Kevin",
Strength: [9,4,3,4]
}
Desired output:
{
Subject: "Dave",
mean_strength: [1,2,3,5]
},
{
Subject: "Stuart",
mean_strength: [5,5,6,7]
},
{
Subject: "Kevin",
mean_strength: [5,3,3,4]
}
I tried this approach, but MongoDB seems to interpret the arrays as null?
pipe = [{'$group': {'_id': 'Subject', 'mean_strength': {'$avg': '$Strength'}}}]
results = db.Walk.aggregate(pipeline=pipe)
Out: [{'_id': 'SubjectID', 'total': None}]
I have looked through the MongoDB documentation, but I can't find or understand whether there is a way to do this.
Answer 0 (score: 3)
You can use $unwind with includeArrayIndex. As the name suggests, includeArrayIndex adds the array index to the output. This allows grouping by Subject and by array position within Strength. After calculating the averages, the results need to be sorted so that the second $group, with $push, adds the results back in the right order. Finally, a $project stage includes and renames the relevant fields.
db.test.aggregate([
  {
    "$unwind": {
      "path": "$Strength",
      "includeArrayIndex": "rownum"
    }
  },
  {
    "$group": {
      "_id": {
        "Subject": "$Subject",
        "rownum": "$rownum"
      },
      "mean_strength": {
        "$avg": "$Strength"
      }
    }
  },
  {
    "$sort": {
      "_id.Subject": 1,
      "_id.rownum": 1
    }
  },
  {
    "$group": {
      "_id": "$_id.Subject",
      "mean_strength": {
        "$push": "$mean_strength"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "Subject": "$_id",
      "mean_strength": 1
    }
  }
])
For the test input, this returns:
{ "mean_strength" : [ 5, 5, 6, 7 ], "Subject" : "Stuart" }
{ "mean_strength" : [ 5, 3, 3, 4 ], "Subject" : "Kevin" }
{ "mean_strength" : [ 1, 2, 3, 5 ], "Subject" : "Dave" }
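To make the stages in this answer easier to follow, here is a minimal client-side sketch in plain Python that imitates them on the sample data from the question. It does not talk to MongoDB; the function name mean_strength_by_subject and the variable docs are made up for illustration, and the comments mark which pipeline stage each step stands in for.

```python
from collections import defaultdict

# Sample documents copied from the question.
docs = [
    {"Subject": "Dave", "Strength": [1, 2, 3, 4]},
    {"Subject": "Dave", "Strength": [1, 2, 3, 5]},
    {"Subject": "Dave", "Strength": [1, 2, 3, 6]},
    {"Subject": "Stuart", "Strength": [4, 5, 6, 7]},
    {"Subject": "Stuart", "Strength": [6, 5, 6, 7]},
    {"Subject": "Kevin", "Strength": [1, 2, 3, 4]},
    {"Subject": "Kevin", "Strength": [9, 4, 3, 4]},
]


def mean_strength_by_subject(documents):
    """Imitate unwind -> group -> sort -> group/push in plain Python."""
    # Like $unwind with includeArrayIndex: one row per array element,
    # carrying the element's position alongside its value.
    rows = [
        (doc["Subject"], index, value)
        for doc in documents
        for index, value in enumerate(doc["Strength"])
    ]

    # Like the first $group: collect values per (Subject, array position).
    grouped = defaultdict(list)
    for subject, index, value in rows:
        grouped[(subject, index)].append(value)

    # Like $avg, then $sort on (Subject, position), then $push:
    # iterating in sorted key order appends each average in index order.
    result = defaultdict(list)
    for subject, index in sorted(grouped):
        values = grouped[(subject, index)]
        result[subject].append(sum(values) / len(values))
    return dict(result)


means = mean_strength_by_subject(docs)
```

Here means maps each subject to its list of per-position averages, which for the sample data equal the values in the desired output. Dropping the sorted() call would be the analogue of omitting $sort: the averages would still be correct, but their order within each array would no longer be guaranteed.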
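Answer 1 (score: 1)
You can try the aggregation below.
For example, after the group stage Dave has [[1,2,3,4], [1,2,3,5], [1,2,3,6]].
Here is what happens to this matrix.
Reduce function:
Pass      Current value (c)   Accumulated value (b)        Next value
First:    [1,2,3,5]           [[1],[2],[3],[4]]            [[1,1],[2,2],[3,3],[5,4]]
Second:   [1,2,3,6]           [[1,1],[2,2],[3,3],[5,4]]    [[1,1,1],[2,2,2],[3,3,3],[6,5,4]]
Map function: calculates the average of each inner array from the reduce stage, which outputs [1,2,3,5].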
[
  // Push all of a subject's arrays into one array of arrays.
  {"$group": {"_id": "$Subject", "Strength": {"$push": "$Strength"}}},
  {"$project": {"mean_strength": {
    "$map": {  // Calculate the average of each group of same-indexed values.
      "input": {
        "$reduce": {
          // Start from the second array.
          "input": {"$slice": ["$Strength", 1, {"$subtract": [{"$size": "$Strength"}, 1]}]},
          // Initialize with the first array, each element wrapped in a single-value array.
          "initialValue": {
            "$map": {
              "input": {"$range": [0, {"$size": {"$arrayElemAt": ["$Strength", 0]}}]},
              "as": "a",
              "in": [{"$arrayElemAt": [{"$arrayElemAt": ["$Strength", 0]}, "$$a"]}]
            }
          },
          "in": {
            // Variables for the current array (c) and the accumulated value (b).
            "$let": {"vars": {"c": "$$this", "b": "$$value"},
              "in": {"$map": {  // Combine same-indexed values on each iteration.
                "input": {"$range": [0, {"$size": "$$b"}]},
                "as": "d",
                "in": {
                  "$concatArrays": [  // Concatenate the values at the same index.
                    [{"$arrayElemAt": ["$$c", "$$d"]}],  // current: a scalar, so wrap it in an array
                    {"$arrayElemAt": ["$$b", "$$d"]}     // accumulated: already an array
                  ]
                }
              }}
            }
          }
        }
      },
      "as": "e",
      "in": {"$avg": "$$e"}
    }
  }}}
]
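The reduce and map steps described in this answer can be mirrored in plain Python, which may make the nested operators easier to read. This is only an illustrative sketch, not part of the aggregation: collect_columns and elementwise_mean are hypothetical helper names, and the code assumes all arrays have the same length, as in the sample data.

```python
from functools import reduce


def collect_columns(arrays):
    """Group same-indexed values, like the $reduce stage in the answer."""
    first, rest = arrays[0], arrays[1:]

    # Like initialValue: wrap each element of the first array in its own list.
    initial = [[value] for value in first]

    def step(accumulated, current):
        # Like $concatArrays: put the current array's value at index i
        # in front of the values already accumulated for index i.
        return [[current[i]] + accumulated[i] for i in range(len(accumulated))]

    return reduce(step, rest, initial)


def elementwise_mean(arrays):
    """Average each group of same-indexed values, like the outer $map/$avg."""
    return [sum(column) / len(column) for column in collect_columns(arrays)]


dave = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6]]
dave_mean = elementwise_mean(dave)
```

For Dave's arrays, collect_columns reproduces the "Next value" column of the table after each pass, and elementwise_mean gives the averages 1, 2, 3 and 5.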
Answer 2 (score: 0)
Based on the description in the question, try executing the following aggregation query as a solution.
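db.collection.aggregate(
  // Pipeline
  [
    // Stage 1: emit one document per array element, keeping its index
    {
      $unwind: { path: "$Strength", includeArrayIndex: "arrayIndex" }
    },
    // Stage 2: average the values per subject and array position
    {
      $group: {
        _id: { Subject: "$Subject", arrayIndex: "$arrayIndex" },
        mean_strength: { $avg: "$Strength" }
      }
    },
    // Stage 3: sort by array position so that $push rebuilds the array in order
    // ($group does not guarantee the order of its output documents)
    {
      $sort: { "_id.Subject": 1, "_id.arrayIndex": 1 }
    },
    // Stage 4: collect the averages back into one array per subject
    {
      $group: {
        _id: { Subject: "$_id.Subject" },
        mean_strength: { $push: "$mean_strength" }
      }
    },
    // Stage 5: rename and keep only the relevant fields
    {
      $project: {
        Subject: "$_id.Subject",
        mean_strength: "$mean_strength",
        _id: 0
      }
    }
  ]
);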
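Since the question asks about PyMongo, here is a hedged sketch of how an unwind/group/sort/push pipeline of this kind might be written as Python dictionaries. The helper names build_pipeline and mean_strength are made up for illustration, and collection is assumed to be a PyMongo collection such as the db.Walk from the question. Two details differ from the attempt in the question: field paths carry the $ prefix ('$Subject' rather than 'Subject', which would group every document under one literal string), and $avg is applied to unwound numbers, because inside $group it treats an array-valued operand as non-numeric and returns null.

```python
def build_pipeline():
    """Return the aggregation stages as plain Python dicts."""
    return [
        # One document per array element, with its position in "arrayIndex".
        {"$unwind": {"path": "$Strength", "includeArrayIndex": "arrayIndex"}},
        # Average per subject and array position.
        {"$group": {
            "_id": {"Subject": "$Subject", "arrayIndex": "$arrayIndex"},
            "mean_strength": {"$avg": "$Strength"},
        }},
        # Sort so that $push below sees the averages in index order.
        {"$sort": {"_id.Subject": 1, "_id.arrayIndex": 1}},
        # Rebuild one array of averages per subject.
        {"$group": {
            "_id": "$_id.Subject",
            "mean_strength": {"$push": "$mean_strength"},
        }},
        # Rename _id back to Subject.
        {"$project": {"_id": 0, "Subject": "$_id", "mean_strength": 1}},
    ]


def mean_strength(collection):
    """Run the pipeline; `collection` is assumed to be a PyMongo collection."""
    return list(collection.aggregate(build_pipeline()))
```

With a live connection this would be called as mean_strength(db.Walk). The pipeline is built in a separate function so it can be inspected or printed without a running server.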