我有一组(~35k)文档,如下所示:
{
"_id" : ObjectId("583dabfc7572394f93ac6ef2"),
"updatedAt" : ISODate("2016-11-29T16:25:32.130Z"),
"createdAt" : ISODate("2016-11-29T16:25:32.130Z"),
"sourceType" : "report",
"sourceRef" : ObjectId("583da865686e3dfbd977f059"),
"type" : "video",
"caption" : "lorem ipsum",
"timestamps" : {
"postedAt" : ISODate("2016-08-26T15:09:35.000Z"),
"monthOfYear" : 7, // 0-based
"dayOfWeek" : 5, // 0-based
"hourOfDay" : 16 // 0-based
},
"stats" : {
"comments" : 0,
"likes" : 8
},
"user" : {
"id" : "123456",
"username" : "johndoe",
"fullname" : "John",
"picture" : ""
},
"images" : {
"thumbnail" : "",
"low" : "",
"standard" : ""
},
"mentions" : [
"janedoe"
],
"tags" : [
"holiday",
"party"
],
"__v" : 0
}
我想生成一个汇总报告,该报告将用于按一周中每周/每天的一小时来计算文档的频率,以及提及/标记的计数。
{
// Each frequency is independant from the others,
// e.g. the total count for each frequency should
// be ~35k.
dayFrequency: [
{ day: 0, count: 1400 }, // Monday
{ day: 1, count: 1700 }, // Tuesday
{ day: 2, count: 1800 }, // Wednesday
{ /* etc */ },
{ day: 6, count: 1200 } // Sunday
],
monthFrequency: [
{ month: 0, count: 200 }, // January
{ month: 1, count: 250 }, // February
{ month: 2, count: 300 }, // March
{ /* etc */ },
{ month: 11, count: 150 } // December
],
hourFrequency: [
{ hour: 0, count: 150 }, // 0am
{ hour: 1, count: 200 }, // 1am
{ hour: 2, count: 275 }, // 2am
{ /* etc */ },
{ hour: 23, count: 150 }, // 11pm
],
mentions: {
janedoe: 12,
johnsmith: 11,
peter: 54,
/* and so on */
},
tags: {
holiday: 872,
party: 1029,
/* and so on */
}
}
这是可能的,如果可以的话,我该如何写呢?根据我的理解,当我执行所有匹配文档的汇总时,它实际上是一个组?
到目前为止,我的代码只是将所有匹配的记录分组到一个组中,但我不确定如何继续前进。
Model.aggregate([
{ $match: { sourceType: 'report', sourceRef: '583da865686e3dfbd977f059' } },
{ $group: {
_id: '$sourceRef'
}}
], (err, res) => {
console.log(err);
console.log(res);
})
同样可以接受的是将频率计为一个计数数组(例如[ 1400, 1700, 1800, /* etc */ 1200 ]
),这会让我看看$count
和其他几个运算符,但是我再也不是明确用法。
答案 0 :(得分:1)
目前不可能(在撰写本文时)在单个管道中使用MongoDB 3.2执行此操作。但是,从MongoDB 3.4及更高版本开始,您可以使用 $facet
运算符,该运算符允许在同一组输入文档的单个阶段中处理多个聚合管道。每个子管道在输出文档中都有自己的字段,其结果存储为文档数组。
例如,可以通过运行以下聚合管道来实现上述内容:
Model.aggregate([
{ "$match": { "sourceType": "report", "sourceRef": "583da865686e3dfbd977f059" } },
{
"$facet": {
"dayFrequency": [
{
"$group": {
"_id": "$timestamps.dayOfWeek",
"count": { "$sum": 1 }
}
}
],
"monthFrequency": [
{
"$group": {
"_id": "$timestamps.monthOfYear",
"count": { "$sum": 1 }
}
}
],
"hourFrequency": [
{
"$group": {
"_id": "$timestamps.hourOfDay",
"count": { "$sum": 1 }
}
}
],
"mentions": [
{ "$unwind": "$mentions" },
{
"$group": {
"_id": "$mentions",
"count": { "$sum": 1 }
}
}
],
"tags": [
{ "$unwind": "$tags" },
{
"$group": {
"_id": "$tags",
"count": { "$sum": 1 }
}
}
]
}
}
], (err, res) => {
console.log(err);
console.log(res);
})