I have a dataset like this:
{
"id": 1230239,
"group_name": "A",
"confidence": 0.14333882876354542,
},
{
"id": 1230240,
"group_name": "B",
"confidence": 0.4434535,
},
and so on.
Computing the buckets and the number of items per confidence level is straightforward with $bucketAuto:
{
"$bucketAuto": {
"groupBy": "$confidence",
"buckets": 4
}
}
But how can I do the same for each group separately?
I tried this:
{"$group": {
"_id": "group",
"data": {
"$push": {
"confidence": "$confidence",
}
}
}
},
{
"$bucketAuto": {
"groupBy": "$data.confidence",
"buckets": 4
}
}
But this does not work.
The output I need is roughly:
{ 'groupA':
{
"_id": {
"min": 0.0005225352581638143,
"max": 0.2905137273072962
},
"count": 67
},
{"_id": {
"min": 0.2905137273072962,
"max":0.5531611756507283,
},
"count": 43
},
},
{ 'groupB':
{
"_id": {
"min": 0.0005225352581638143,
"max": 0.2905137273072962
},
"count": 67
},
{"_id": {
"min": 0.2905137273072962,
"max":0.5531611756507283,
},
"count": 43
},
}
Any suggestions or hints would be appreciated.
Answer 0 (score: 1)
db.foo.aggregate([
{$facet: {
"groupA": [
{$match: {"group_name": "A"}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
]
,"groupB": [
{$match: {"group_name": "B"}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
]
}}
]);
$facet, the "multi-group" operator, comes to the rescue. The pipeline above produces the output you want:
{
"groupA" : [
{
"_id" : {
"min" : 0.14333882876354542,
"max" : 0.34333882876354543
},
"count" : 2
},
{
"_id" : {
"min" : 0.34333882876354543,
"max" : 0.5433388287635454
},
"count" : 2
},
{
"_id" : {
"min" : 0.5433388287635454,
"max" : 0.5433388287635454
},
"count" : 1
}
],
"groupB" : [
{
"_id" : {
"min" : 0.5433388287635454,
"max" : 0.7433388287635454
// etc. etc.
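Because $facet returns a single document keyed by facet name, it can be convenient to flatten that document into one row per bucket before further processing. A minimal sketch in plain JavaScript, assuming a hypothetical helper name flattenFacetResult (not part of MongoDB) and using the first two groupA buckets from the output above as sample input:

```javascript
// Hypothetical helper (not a MongoDB API): turns the single document
// returned by $facet into a flat array with one row per bucket.
function flattenFacetResult(facetDoc) {
  var rows = [];
  Object.keys(facetDoc).forEach(function (group) {
    facetDoc[group].forEach(function (bucket) {
      rows.push({
        group: group,
        min: bucket._id.min,
        max: bucket._id.max,
        count: bucket.count
      });
    });
  });
  return rows;
}

// Sample input copied from the first two groupA buckets shown above.
var sample = {
  groupA: [
    { _id: { min: 0.14333882876354542, max: 0.34333882876354543 }, count: 2 },
    { _id: { min: 0.34333882876354543, max: 0.5433388287635454 }, count: 2 }
  ]
};
var rows = flattenFacetResult(sample);
```

In the shell, the input document would typically be the first element of db.foo.aggregate([...]).toArray().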
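If you want this to be fully dynamic, you need two steps: first get the distinct group names, then build the $facet expression from those names:
// Start with an empty $facet specification and add one sub-pipeline per group.
var fct_stage = {};
db.foo.distinct("group_name").forEach(function(name) {
fct_stage["group" + name] = [
{$match: {"group_name": name}}
,{$bucketAuto: {
"groupBy": "$confidence",
"buckets": 3
}}
];
});
db.foo.aggregate([ {$facet: fct_stage} ]);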
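The two-step approach can also be wrapped in a small function so that the group names and the bucket count become parameters. This is only a sketch: buildFacetStage is a hypothetical name, and the function just builds the stage object without touching the database, so the group names are hard-coded in the example call.

```javascript
// Hypothetical helper: builds a $facet stage containing one
// $match + $bucketAuto sub-pipeline per group name.
function buildFacetStage(groupNames, buckets) {
  var facet = {};
  groupNames.forEach(function (name) {
    facet["group" + name] = [
      { $match: { group_name: name } },
      { $bucketAuto: { groupBy: "$confidence", buckets: buckets } }
    ];
  });
  return { $facet: facet };
}

// Names are hard-coded here; in the shell they would come from
// db.foo.distinct("group_name").
var stage = buildFacetStage(["A", "B"], 4);
```

In the shell this would be used as db.foo.aggregate([ buildFacetStage(db.foo.distinct("group_name"), 4) ]). Since $facet emits a single output document, a very large number of groups could run into the 16 MB BSON document limit.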