Group first, then $bucketAuto, in a MongoDB aggregation

Time: 2019-10-31 18:24:16

Tags: mongodb aggregation-framework

I have a dataset like this:

{
    "id": 1230239,
    "group_name": "A",
    "confidence": 0.14333882876354542
},
{
    "id": 1230240,
    "group_name": "B",
    "confidence": 0.4434535
},

and so on.

Using $bucketAuto like this, it is easy to compute the buckets and the number of items per confidence level:

{
  "$bucketAuto": {
     "groupBy": "$confidence",
     "buckets": 4
 }
}

But how do I do the same thing for each group separately?

I tried this:

    {"$group": {
                    "_id": "group",
                    "data": {
                        "$push": {
                            "confidence": "$confidence",
                        }
                    }
                }
                },
                {
                    "$bucketAuto": {
                        "groupBy": "$data.confidence",
                        "buckets": 4
                    }
                }

But this does not work.

Roughly, the output I need is:

{ 'groupA': 
     {
            "_id": {
                "min": 0.0005225352581638143,
                "max": 0.2905137273072962
            },
            "count": 67
        },
        {"_id": {
                "min": 0.2905137273072962,
                "max":0.5531611756507283,
            },
            "count": 43
        },
}, 
{ 'groupB': 
     {
       "_id": {
                "min": 0.0005225352581638143,
                "max": 0.2905137273072962
            },
            "count": 67
        },
        {"_id": {
                "min": 0.2905137273072962,
                "max":0.5531611756507283,
            },
            "count": 43
        },
}

Any suggestions or hints would be much appreciated.

1 answer:

Answer 0: (score: 1)

$facet to the rescue: it is the "multi-group" operator. This pipeline:

db.foo.aggregate([
    {$facet: {
        "groupA": [
            {$match: {"group_name": "A"}},
            {$bucketAuto: {"groupBy": "$confidence", "buckets": 3}}
        ],
        "groupB": [
            {$match: {"group_name": "B"}},
            {$bucketAuto: {"groupBy": "$confidence", "buckets": 3}}
        ]
    }}
]);

produces the output you want:

{
    "groupA" : [
        {
            "_id" : {
                "min" : 0.14333882876354542,
                "max" : 0.34333882876354543
            },
            "count" : 2
        },
        {
            "_id" : {
                "min" : 0.34333882876354543,
                "max" : 0.5433388287635454
            },
            "count" : 2
        },
        {
            "_id" : {
                "min" : 0.5433388287635454,
                "max" : 0.5433388287635454
            },
            "count" : 1
        }
    ],
    "groupB" : [
        {
            "_id" : {
                "min" : 0.5433388287635454,
                "max" : 0.7433388287635454
    // etc. etc.

If you want to go totally dynamic, you will need to do it in two steps: first get the distinct group names, then build the $facet expression from those names:

// fct_stage must exist before it is filled in; the snippet as posted used it without a declaration
var fct_stage = {};
db.foo.distinct("group_name").forEach(function(name) {
    // One sub-pipeline per group: filter to that group, then bucket its confidence values
    fct_stage["group" + name] = [
        {$match: {"group_name": name}},
        {$bucketAuto: {"groupBy": "$confidence", "buckets": 3}}
    ];
});
db.foo.aggregate([ {$facet: fct_stage} ]);
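
If the dynamic version is needed in more than one place, the stage-building part can be pulled into a plain function that does not touch the database at all. This is only a sketch under the same assumptions as above (fields named `group_name` and `confidence`); `buildFacetPipeline` is a made-up helper name, not something MongoDB provides, and the name list is hard-coded here so the snippet runs on its own.

```javascript
// Hypothetical helper (not a MongoDB API): builds a one-stage pipeline whose
// $facet has one sub-pipeline per group name.
function buildFacetPipeline(groupNames, buckets) {
  const facet = {};
  for (const name of groupNames) {
    // Each sub-pipeline filters to a single group, then buckets its confidence values.
    facet["group" + name] = [
      { $match: { group_name: name } },
      { $bucketAuto: { groupBy: "$confidence", buckets: buckets } }
    ];
  }
  return [{ $facet: facet }];
}

// Hard-coded names for illustration; in the shell they would come from
// db.foo.distinct("group_name").
const pipeline = buildFacetPipeline(["A", "B"], 4);
console.log(JSON.stringify(pipeline, null, 2));
```

In the shell this would be used as `db.foo.aggregate(buildFacetPipeline(db.foo.distinct("group_name"), 4))`. Two things to keep in mind: group names containing a `.` would probably need sanitizing before being used as an output field name, and the whole $facet result is returned as a single document, so it is subject to the 16 MB document size limit.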