MongoDB aggregation - unwind/group/project query combination

Time: 2016-10-22 21:38:04

Tags: mongodb mongodb-query aggregation-framework mongodb-aggregation

I have records in a collection in the following format.

//One parent record
{
    "_id" : "someDocID",
    "title" : "some title",
    "analytics" : [
            {
                    "_id" : "analyticsID1", 
                    "timeSpent" : [
                            {
                                    "time" : 14,
                                    "pageNo" : 1
                            },
                            {
                                    "time" : 4,
                                    "pageNo" : 2
                            },
                            {
                                    "time" : 3,
                                    "pageNo" : 1
                            },
                            {
                                    "time" : 1,
                                    "pageNo" : 2
                            }
                    ]                       

            },
            {                        
                    "_id" : "analyticsID2",                        
                    "timeSpent" : [
                            {
                                    "time" : 12,
                                    "pageNo" : 10
                            },
                            {
                                    "time" : 15,
                                    "pageNo" : 11
                            },
                            {
                                    "time" : 26,
                                    "pageNo" : 12
                            },
                            {
                                    "time" : 13,
                                    "pageNo" : 11
                            },
                            {
                                    "time" : 17,
                                    "pageNo" : 10
                            },
                            {
                                    "time" : 30,
                                    "pageNo" : 11
                            }
                    ]
            }
    ]               
}

The "pageNo" field contains duplicate values. I need to group by the pageNo field and sum the corresponding "time" values.

This is the output I need (after a "$unwind" on analytics).

//Two records after "$unwind" on analytics
{
    "_id" : "someDocID",
    "title" : "some title",
    "analytics" : {
                    "_id" : "analyticsID1", 
                    "timeSpent" : [
                            {
                                    "time" : 17,   //14+3
                                    "pageNo" : 1
                            },
                            {
                                    "time" : 5,    //4+1
                                    "pageNo" : 2
                            }
                    ]
            }
}

{
    "_id" : "someDocID",
    "title" : "some title",
    "analytics" : {
                    "_id" : "analyticsID2", 
                    "timeSpent" : [
                            {
                                    "time" : 29,    //12+17
                                    "pageNo" : 10
                            },
                            {
                                    "time" : 58,    //15+13+30
                                    "pageNo" : 11
                            },
                            {
                                    "time" : 26,
                                    "pageNo" : 12
                            }                                
                    ]      
            }
}

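I have tried various combinations of aggregate, $group, $unwind and $project, but I still can't get this result. Any suggestions would be greatly appreciated.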

1 answer:

Answer 0: (score: 0)

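Below is the aggregation I came up with to produce the output you mentioned in your comment above. As an FYI, the more elements there are in the arrays you unwind, the more memory the pipeline uses, and the processing time grows with the number of unwound elements (every timeSpent entry of every analytics entry becomes its own document). If your arrays can grow without bound, I strongly suggest structuring your data differently.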

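var aggregate = [{
    // One document per element of the outer "analytics" array.
    $unwind: '$analytics'
}, {
    // One document per element of the inner "timeSpent" array.
    $unwind: '$analytics.timeSpent'
}, {
    // Sum "time" for each (analytics entry, pageNo) pair.
    $group: {
        _id: {
            analytics_id: '$analytics._id',
            pageNo: '$analytics.timeSpent.pageNo'
        },
        title: {$first: '$title'},
        time: {
            $sum: '$analytics.timeSpent.time'
        }
    }
}, {
    // Collect the per-page totals back into one array per analytics entry.
    $group: {
        _id: '$_id.analytics_id',
        title: {$first: '$title'},
        timeSpent: {
            $push: {
                time: '$time',
                pageNo: '$_id.pageNo'
            }
        }
    }
}];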
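
To sanity-check what the two $group stages are meant to compute, the same reduction can be sketched in plain JavaScript against the sample document from the question. This is only an in-memory illustration, not how the server evaluates the pipeline; `sumByPage` is a hypothetical helper name, and `doc` is the question's sample record copied by hand.

```javascript
// The sample record from the question.
const doc = {
  _id: 'someDocID',
  title: 'some title',
  analytics: [
    { _id: 'analyticsID1', timeSpent: [
      { time: 14, pageNo: 1 }, { time: 4, pageNo: 2 },
      { time: 3, pageNo: 1 }, { time: 1, pageNo: 2 }
    ] },
    { _id: 'analyticsID2', timeSpent: [
      { time: 12, pageNo: 10 }, { time: 15, pageNo: 11 }, { time: 26, pageNo: 12 },
      { time: 13, pageNo: 11 }, { time: 17, pageNo: 10 }, { time: 30, pageNo: 11 }
    ] }
  ]
};

// Hypothetical helper (not a MongoDB API): for each analytics entry,
// sum "time" per pageNo, mirroring the $group / $sum / $push steps.
function sumByPage(document) {
  return document.analytics.map(function (entry) {
    const totals = new Map(); // pageNo -> summed time, in first-seen order
    entry.timeSpent.forEach(function (t) {
      totals.set(t.pageNo, (totals.get(t.pageNo) || 0) + t.time);
    });
    return {
      _id: entry._id,
      title: document.title,
      timeSpent: Array.from(totals, function (pair) {
        return { time: pair[1], pageNo: pair[0] };
      })
    };
  });
}

const result = sumByPage(doc);
console.log(JSON.stringify(result, null, 2));
```

For the sample data this gives totals of 17 and 5 for pages 1 and 2 of analyticsID1, and 29, 58 and 26 for pages 10, 11 and 12 of analyticsID2, which matches the sums requested in the question.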

Running this aggregation outputs:

[{
    "_id": "analyticsID1",
    "title" : "some title", 
    "timeSpent": [{
        "time": NumberInt(17),
        "pageNo": NumberInt(1)
    }, {
        "time": NumberInt(5),
        "pageNo": NumberInt(2)
    }]
}, {
    "_id": "analyticsID2",
    "title" : "some title", 
    "timeSpent": [{
        "time": NumberInt(26),
        "pageNo": NumberInt(12)
    }, {
        "time": NumberInt(29),
        "pageNo": NumberInt(10)
    }, {
        "time": NumberInt(58),
        "pageNo": NumberInt(11)
    }]
}]
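
Note that this result is keyed by the analytics `_id` and that the order of the pages is not fixed, since $group does not guarantee output order. If the exact shape from the question is needed (the parent `_id` and `title` at the top level, with one `analytics` object per result), one possible variation is sketched below. It assumes the field names from the sample document; the collection name `docs` in the trailing comment is hypothetical. Relying on $push to keep the order produced by a preceding $sort is common practice, but it is worth verifying on your server version.

```javascript
// A sketch of a variant pipeline, not the only way to do this.
const pipeline = [
  { $unwind: '$analytics' },
  { $unwind: '$analytics.timeSpent' },
  // Sum "time" per (parent document, analytics entry, page).
  { $group: {
      _id: {
        docId: '$_id',
        analyticsId: '$analytics._id',
        pageNo: '$analytics.timeSpent.pageNo'
      },
      title: { $first: '$title' },
      time: { $sum: '$analytics.timeSpent.time' }
  } },
  // Sort so that the pages are pushed in ascending pageNo order.
  { $sort: { '_id.docId': 1, '_id.analyticsId': 1, '_id.pageNo': 1 } },
  // Rebuild one timeSpent array per (parent document, analytics entry).
  { $group: {
      _id: { docId: '$_id.docId', analyticsId: '$_id.analyticsId' },
      title: { $first: '$title' },
      timeSpent: { $push: { time: '$time', pageNo: '$_id.pageNo' } }
  } },
  // Reshape into the layout requested in the question.
  { $project: {
      _id: '$_id.docId',
      title: 1,
      analytics: { _id: '$_id.analyticsId', timeSpent: '$timeSpent' }
  } }
];

// In the mongo shell, with a hypothetical collection named "docs":
// db.docs.aggregate(pipeline)
```

Grouping on the parent `_id` as well as the analytics `_id` also keeps entries from different parent documents apart if the same analytics `_id` ever appears in more than one of them.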