MongoDB aggregate: group by project, then accumulate by week

Time: 2016-03-08 16:28:47

Tags: mongodb mongodb-query aggregation-framework

I have this input structure:

{
    "_id" : ObjectId("56de0178cf7970ac2a86fb23"),
    "createdAt" : ISODate("2016-03-07T16:32:24.681-06:00"),
    "updatedAt" : ISODate("2016-03-07T16:32:24.681-06:00"),
    "yearTask" : 2016,
    "startWeek" : 10,
    "task" : "31231321",
    "hours" : 312,
    "project" : [ 
        {
            "Project" : "1000G",
            "_id" : "565f193cea6493ce0acc9730"
        }
    ],
    "plannedWeeks" : [ 
        {
            "yearTask" : 2016,
            "hours" : 3,
            "weekNumber" : 10
        }, 
        {
            "yearTask" : 2016,
            "hours" : 3,
            "weekNumber" : 11
        }, 
        {
            "yearTask" : 2016,
            "hours" : 3,
            "weekNumber" : 12
        }, 
        {
            "yearTask" : 2016,
            "hours" : 3,
            "weekNumber" : 13
        }, 
        {
            "yearTask" : 2016,
            "hours" : 3,
            "weekNumber" : 14
        }
    ],
}

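So imagine I have other entries like this one. I need the total hours for each week (weekNumber), and I need that information grouped by project (in this case the project name is stored in the "Project" field). The number of weeks is variable. The project field is an array, but it only ever contains one project.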

The output should look like this:

{
   _id : {
           "name" : "1000G",
            "yearTask" : 2016,
            "weeks" : [ 
                    {
                        "yearTask" : 2016,
                        "hours" : 34, <--Total sum for this project and week
                        "weekNumber" : 10
                    }

                 .... etc.

             ]

        },
   _id : {
           "name" : "Project2",
            "yearTask" : 2016,
            "weeks" : [ 
                    {
                        "yearTask" : 2016,
                        "hours" : 584,<--Total sum for this project and week
                        "weekNumber" : 10
                    }

                 .... etc.

             ]

        } 

}

My current query only groups the planned weeks by project, without summing them:

db.tasks.aggregate(
   [
        { "$unwind": "$project" },
        {$group : {
           _id : { 
               name : "$project.Project", 
               yearTask : "$yearTask",  
               weeks : "$plannedWeeks",

            },
            /*"matches" : { "$sum" : "$plannedWeeks.hours" },*/
        }},
        { $match : { "_id.yearTask": { $eq: 2016 } } },

   ]
)

I tried using { "$unwind": "$plannedWeeks" }, but I don't know how to sum the totals for each week and then group them by project.

Edited - my solution is:

   [
    { "$match" : { "yearTask": 2016 } },
    { "$unwind": "$project" },
    { "$unwind": "$plannedWeeks" },
    /*{ "$match" : { "yearTask": 2016 } },*/
    {
        "$group": {
            "_id": {
                "name": "$project.Project",
                /*"yearTask": "$plannedWeeks.yearTask",*/
                "weekYear": "$plannedWeeks.yearTask",
                "weekNumber": "$plannedWeeks.weekNumber"
            },
            "weeks": {
                "$push": {
                    "yearTask": "$plannedWeeks.yearTask",                   
                    "weekNumber": "$plannedWeeks.weekNumber"
                }
            },
            "hours": { "$sum": "$plannedWeeks.hours" },            
        }
    },
    { $sort : { "_id.weekYear" : 1,"_id.weekNumber" : 1, } },
    { "$group": {
        "_id": {
            "name": "$_id.name",
            /*"yearTask": "$_id.yearTask",*/
        },
        "weeks": {
            "$push": {
                 "yearTask": "$_id.weekYear",
                 "hours": "$hours",
                 "weekNumber": "$_id.weekNumber"
            }
        }
    }},


] 

1 answer:

Answer 0: (score: 1)

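You want "two" $group stages: the first computes the sum per "week", and the second uses $push to collect those results into an array under each project's grouping key.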

Ideally, use $arrayElemAt, available from MongoDB 3.2:

db.tasks.aggregate([
    { "$unwind": "$plannedWeeks" },
    { "$group": {
        "_id": {
            "name": { "$arrayElemAt": [ "$project.Project", 0 ] },
            "yearTask": "$yearTask",
            "weekNumber": "$plannedWeeks.weekNumber"
        },
        "hours": { "$sum": "$plannedWeeks.hours" }
    }},
    { "$group": {
        "_id": {
            "name": "$_id.name",
            "yearTask": "$_id.yearTask",
        },
        "weeks": {
            "$push": {
                 "yearTask": "$_id.yearTask",
                 "hours": "$hours",
                 "weekNumber": "$_id.weekNumber"
            }
        }
    }}
])

Of course, since "project" is an array holding only one item, there is also no problem with using $unwind in earlier versions:

db.tasks.aggregate([
    { "$unwind": "$plannedWeeks" },
    { "$unwind": "$project" },
    { "$group": {
        "_id": {
            "name": "$project.Project",
            "yearTask": "$yearTask",
            "weekNumber": "$plannedWeeks.weekNumber"
        },
        "hours": { "$sum": "$plannedWeeks.hours" }
    }},
    { "$group": {
        "_id": {
            "name": "$_id.name",
            "yearTask": "$_id.yearTask",
        },
        "weeks": {
            "$push": {
                 "yearTask": "$_id.yearTask",
                 "hours": "$hours",
                 "weekNumber": "$_id.weekNumber"
            }
        }
    }}
])

Either way it is two $group stages: the first computes the sums and the next builds the array.
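To make the two stages concrete, here is a minimal sketch in plain JavaScript that mimics them on an in-memory array. It is only an illustration of the logic, not how MongoDB executes the pipeline; the abbreviated `tasks` sample and the helper names `sumPerWeek` and `collectWeeks` are assumptions made up for this example.

```javascript
// Hypothetical in-memory stand-in for the "tasks" collection (abbreviated documents).
const tasks = [
  { yearTask: 2016, project: [{ Project: "1000G" }],
    plannedWeeks: [{ hours: 3, weekNumber: 10 }, { hours: 3, weekNumber: 11 }] },
  { yearTask: 2016, project: [{ Project: "1000G" }],
    plannedWeeks: [{ hours: 5, weekNumber: 10 }] },
];

// Stage 1: sum hours per (project name, year, week),
// roughly what $unwind followed by the first $group does.
function sumPerWeek(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const name = doc.project[0].Project; // same idea as $arrayElemAt with index 0
    for (const week of doc.plannedWeeks) {
      const key = JSON.stringify([name, doc.yearTask, week.weekNumber]);
      const entry = totals.get(key) ||
        { name, yearTask: doc.yearTask, weekNumber: week.weekNumber, hours: 0 };
      entry.hours += week.hours;
      totals.set(key, entry);
    }
  }
  return [...totals.values()];
}

// Stage 2: collect the weekly totals into one array per (project name, year),
// roughly what the second $group with $push does.
function collectWeeks(rows) {
  const groups = new Map();
  for (const row of rows) {
    const key = JSON.stringify([row.name, row.yearTask]);
    const group = groups.get(key) ||
      { _id: { name: row.name, yearTask: row.yearTask }, weeks: [] };
    group.weeks.push({ yearTask: row.yearTask, hours: row.hours, weekNumber: row.weekNumber });
    groups.set(key, group);
  }
  return [...groups.values()];
}

const result = collectWeeks(sumPerWeek(tasks));
console.log(JSON.stringify(result, null, 2));
```

With this sample, both documents contribute to week 10 of "1000G", so that week ends up with 8 hours while week 11 keeps 3.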

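If it only ever contains one element, it is probably a good idea to reconsider storing "project" as an array. Multiple arrays in a document can lead to problems if you expect some sort of correlation between the data they contain; that is generally better represented in a single array, or just as base properties, or even as a nested document.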
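As a rough illustration of that restructuring, the sketch below turns the one-element "project" array into a plain embedded object. The helper name `flattenProject` is hypothetical and the sample document is abbreviated from the question; how you would actually migrate stored data is a separate matter not covered here.

```javascript
// Hypothetical helper: replace a one-element "project" array with an embedded object.
function flattenProject(doc) {
  const { project, ...rest } = doc;
  return { ...rest, project: project[0] };
}

// Abbreviated sample document taken from the question.
const before = {
  task: "31231321",
  project: [{ Project: "1000G", _id: "565f193cea6493ce0acc9730" }]
};

const after = flattenProject(before);
console.log(after.project.Project);
```

With documents shaped like `after`, a grouping key such as "$project.Project" refers to a single string, so neither $unwind nor $arrayElemAt would be needed for that field.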

As always, put $match first in the aggregation pipeline if you really intend to filter the documents by some condition in the results.
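A sketch of how that might look for this question, assuming the filter is the yearTask value of 2016 taken from the asker's own query and that MongoDB 3.2 or later is available for $arrayElemAt. The pipeline is built as a plain array so it can be inspected before being passed to the shell:

```javascript
const pipeline = [
  // Filter first, so the later stages only process matching documents.
  { "$match": { "yearTask": 2016 } },
  { "$unwind": "$plannedWeeks" },
  // First $group: total hours per project, year and week.
  { "$group": {
    "_id": {
      "name": { "$arrayElemAt": [ "$project.Project", 0 ] },
      "yearTask": "$yearTask",
      "weekNumber": "$plannedWeeks.weekNumber"
    },
    "hours": { "$sum": "$plannedWeeks.hours" }
  }},
  // Second $group: one document per project and year, with the weekly totals in an array.
  { "$group": {
    "_id": {
      "name": "$_id.name",
      "yearTask": "$_id.yearTask"
    },
    "weeks": {
      "$push": {
        "yearTask": "$_id.yearTask",
        "hours": "$hours",
        "weekNumber": "$_id.weekNumber"
      }
    }
  }}
];

// In the mongo shell this would be run as: db.tasks.aggregate(pipeline)
console.log(pipeline.map(stage => Object.keys(stage)[0]).join(" -> "));
```

If the order of the entries in "weeks" matters, a $sort on the week between the two $group stages, as in the asker's own solution, is one way to handle it.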