MongoDB聚合AVG()

时间:2016-05-31 08:39:13

标签: mongodb mongodb-query aggregation-framework

嘿伙计们,如果涉及聚合,我真的是一个新人,所以请帮助我解决这个问题。

我们说我有多个文件(随着时间的推移):

{
  "_id": ObjectId("574d6175da461e77030041b7"),
  "hostname": "VPS",
  "timestamp": NumberLong(1460040691),
  "cpuCores": NumberLong(2),
  "cpuList": [
    {
      "name": "cpu1",
      "load": 3.4
    },
    {
      "name": "cpu2",
      "load": 0.7
    }
  ]
},
{
  "_id": ObjectId("574d6175da461e77030041b7"),
  "hostname": "VPS",
  "timestamp": NumberLong(1460040700),
  "cpuCores": NumberLong(2),
  "cpuList": [
    {
      "name": "cpu1",
      "load": 0.4
    },
    {
      "name": "cpu2",
      "load": 6.7
    }
  ]
},
{
  "_id": ObjectId("574d6175da461e77030041b7"),
  "hostname": "VPS",
  "timestamp": NumberLong(1460041000),
  "cpuCores": NumberLong(2),
  "cpuList": [
    {
      "name": "cpu1",
      "load": 25.4
    },
    {
      "name": "cpu2",
      "load": 1.7
    }
  ]
}

我想在X时间内获得平均cpu负载。其中X等于300秒。

因此,通过上面的示例,我们将获得一个如下所示的结果集:

{
    "avgCPULoad": "2.8",
    "timestamp": NumberLong(1460040700)
},
{
    "avgCPULoad": "13.55",
    "timestamp": NumberLong(1460041000)
}

avgCpuLoad的计算如下:

  1. 在彼此的300秒内抓取所有文件
  2. 计算平均值:
    1. (((3.4+0.7)/2)+((0.4+6.7)/2))/2 = 2.8
    2. ((25.4+1.7)/2) = 13.55
  3. 从所选文档添加上次时间戳。
  4. 我知道我每隔x次获取每个文档的方式。这就是这样做的:

    db.Pizza.aggregate(
    [
        {
            $group: 
            {
                _id:
                {
                    $subtract: [
                        '$timestamp',   
                        {
                            $mod: ['$timestamp', 300]
                        }
                    ]
                },
                'timestamp': {$last:'$timestamp'}
            },
        {
            $project: {_id: 0, timestamp:'$timestamp'}
        }
    ])
    

    但是如何获得如上所述的平均值呢? 我已经尝试了$unwind,但没有给出我喜欢的结果..

2 个答案:

答案 0 :(得分:1)

解决方法是在数组(cpulist)上使用unwind。我为你做了一个例子查询:

db.CpuInfo.aggregate([
    {
        $unwind: '$cpuList'
    },
    {
        $group: {
            _id:{
                $subtract:[
                    '$timestamp', 
                    {$mod: ['$timestamp', 300]}
                ]
            }, 
            'timestamp':{$last:'$timestamp'},
            'cpuList':{$avg:'$cpuList.load'}
        }
    }
])

答案 1 :(得分:1)

您需要运行以下聚合操作才能获得所需的结果:

db.collection.aggregate([ 
    { "$unwind": "$cpuList" },
    {
        "$group": {
            "_id": {                
                "interval": {
                    "$subtract": [ 
                        "$timestamp",                        
                        { "$mod": [ "$timestamp", 60 * 5 ] }
                    ]
                }
            },             
            "avgCPULoad": { "$avg": "$cpuList.load" },
            "timestamp": { "$max": "$timestamp" }
        } 
    },
    {
        "$project": { "_id": 0, "avgCPULoad": 1, "timestamp": 1 }
    }
])

以上将展平的文件分组5分钟(以秒为单位);通过从实际时间戳除以5分钟间隔(以秒为单位)得到的余数中减去时间戳(以秒为单位)推导出间隔键

示例输出

/* 1 */
{
    "avgCPULoad" : 13.55,
    "timestamp" : NumberLong(1460041000)
}

/* 2 */
{
    "avgCPULoad" : 2.8,
    "timestamp" : NumberLong(1460040700)
}