MongoDB: aggregate in two-hour intervals

Time: 2016-08-03 04:24:30

Tags: mongodb mongodb-query aggregation-framework

I have an aggregation query of the following form:

db.mycollection.aggregate([  
  {
    "$match": 
    {
      "Time": { $gte: ISODate("2016-01-30T00:00:00.000+0000") }
    }
  },  
  { 
    "$group": 
    {
      "_id": 
      {  
        "day": { "$dayOfYear": "$Time" },  
        "hour": { "$hour": "$Time" } 
      },  
      "Dishes": { "$addToSet": "$Dish" }  
    }
  },  
  { 
    "$group": 
    {  
      "_id": "$_id.hour",  
      "Food": 
      {   
        "$push": 
        {  
          "Day": "$_id.day",  
          "NumberOfDishes": { "$size":"$Dishes" }  
        }  
      }  
    }
  },  
  {
    "$project":
      {
        "Hour": "$_id",
        "Food": "$Food",
        "_id" : 0
      }
  },  
  { 
    "$sort": { "Hour": 1 }
  }  
]);

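The query above groups by one-hour intervals, e.g. 0-1, 1-2, 2-3, 3-4, 4-5, ..., 23-24. Instead, I would like to group by two-hour intervals, e.g. 0-2, 2-4, 4-6, ..., 22-24. Is there a way to do this?

1 answer:

Answer 0: (score: 4)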

Hint: use $project with the arithmetic aggregation operators.

Let's say H=floor(hour/2), where hour is the actual hour of the document's date. Then you can apply the $floor and $divide operators to the hour:

"H": { $floor: { $divide: [ { "$hour": "$Time" }, 2 ] } }

Here H corresponds to a pair of hours (Hours=[0,2) => H=0, Hours=[2,4) => H=1, and so on up to Hours=[22,24) => H=11), which you can pass to the $group stage with

$group: { "_id": { "day": { $dayOfYear: "$Time" }, "H": "$H" } }

Then you can output the hours for a specific H using

"Hours": [ { $multiply: [ "$H", 2 ] }, { $sum: [ { $multiply: [ "$H", 2 ] }, 2 ] } ]

Given the collection of documents

{ "Time" : ISODate("2016-01-30T01:00:00Z"), "Dish" : "dish1" }
{ "Time" : ISODate("2016-01-30T02:00:00Z"), "Dish" : "dish2" }
{ "Time" : ISODate("2016-01-30T03:00:00Z"), "Dish" : "dish3" }
{ "Time" : ISODate("2016-01-30T04:00:00Z"), "Dish" : "dish4" }
{ "Time" : ISODate("2016-01-30T05:00:00Z"), "Dish" : "dish5" }
{ "Time" : ISODate("2016-01-30T06:00:00Z"), "Dish" : "dish6" }
{ "Time" : ISODate("2016-01-30T07:00:00Z"), "Dish" : "dish7" }
{ "Time" : ISODate("2016-01-30T08:00:00Z"), "Dish" : "dish8" }
{ "Time" : ISODate("2016-01-30T09:00:00Z"), "Dish" : "dish9" }

you can use the following aggregation:

db.mycollection.aggregate([  
  {
    "$match": 
    {
      "Time": { $gte: ISODate("2016-01-30T00:00:00.000+0000") }
    }
  },
  {
    "$project":
    {
      "Dish": 1,
      "Time": 1,
      "H": { $floor: { $divide: [ { $hour: "$Time" }, 2 ] } }
    }
  },
  { 
    "$group": 
    {
      "_id": 
      {  
        "day": { $dayOfYear: "$Time" },  
        "H": "$H" 
      },
      "Dishes": { $addToSet: "$Dish" }  
    }
  },  
  { 
    "$group": 
    {  
      "_id": "$_id.H",  
      "Food": 
      {   
        "$push": 
        {  
          "Day": "$_id.day",  
          "NumberOfDishes": { $size: "$Dishes" }  
        }  
      }  
    }
  },
  { 
    "$sort": { "_id": 1 }
  },
  {
    "$project":
      {
        "Hours": [ { $multiply: [ "$_id", 2 ] }, { $sum: [ { $multiply: [ "$_id", 2 ] }, 2 ] } ],
        "Food": "$Food",
        "_id": 0
      }
  }
]);
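
To check the bucketing logic without a running MongoDB instance, the same steps can be mirrored in plain JavaScript. This is only a rough sketch, not output captured from the server: the helper names dayOfYearUTC and twoHourBuckets are made up for illustration, ISODate values are replaced by ordinary Date objects, and hours are read in UTC on the assumption that $hour is evaluated without a timezone option.

```javascript
// Nine sample documents, 01:00Z through 09:00Z on 2016-01-30,
// with plain JS Dates standing in for ISODate(...).
const docs = Array.from({ length: 9 }, (_, i) => ({
  Time: new Date(Date.UTC(2016, 0, 30, i + 1)),
  Dish: `dish${i + 1}`,
}));

// Hypothetical helper: 1-based day of the year in UTC, like $dayOfYear.
function dayOfYearUTC(d) {
  const startOfYear = Date.UTC(d.getUTCFullYear(), 0, 1);
  return Math.floor((d.getTime() - startOfYear) / 86400000) + 1;
}

// Hypothetical helper that imitates the pipeline stage by stage.
function twoHourBuckets(documents) {
  // $project + first $group: distinct dishes per (day, H), H = floor(hour / 2).
  const perDayAndBucket = new Map();
  for (const { Time, Dish } of documents) {
    const H = Math.floor(Time.getUTCHours() / 2);
    const key = `${dayOfYearUTC(Time)}|${H}`;
    if (!perDayAndBucket.has(key)) perDayAndBucket.set(key, new Set());
    perDayAndBucket.get(key).add(Dish);
  }

  // Second $group: collect the per-day dish counts under each H.
  const perBucket = new Map();
  for (const [key, dishes] of perDayAndBucket) {
    const [day, H] = key.split("|").map(Number);
    if (!perBucket.has(H)) perBucket.set(H, []);
    perBucket.get(H).push({ Day: day, NumberOfDishes: dishes.size });
  }

  // $sort on H, then final $project: turn H back into [2H, 2H + 2].
  return [...perBucket.entries()]
    .sort(([a], [b]) => a - b)
    .map(([H, Food]) => ({ Hours: [2 * H, 2 * H + 2], Food }));
}

const result = twoHourBuckets(docs);
console.log(JSON.stringify(result, null, 2));
```

For the nine sample documents the sketch yields five buckets, [0,2], [2,4], [4,6], [6,8] and [8,10], all on day 30. The first bucket holds one dish (only the 01:00 document falls into it) and each of the others holds two.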