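So I have a set of data with timestamps associated with it. I want mongo to aggregate the entries that are duplicates within a 3-minute timestamp window. Here is an example of what I mean:
Raw data: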
[{"fruit" : "apple", "timestamp": "2014-07-17T06:45:18Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:47:18Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:55:18Z"}]
After the query, it would be:
[{"fruit" : "apple", "timestamp": "2014-07-17T06:45:18Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:55:18Z"}]
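Because the second entry falls inside the 3-minute bubble created by the first entry. I already have code that aggregates and removes dupes with the same fruit, but now I only want to combine the ones that fall inside the same timestamp bubble.
Answer 0 (score: 1)
We should be able to do this! First, let's split an hour into 3-minute "bubbles":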
[0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57]
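The list above can be generated rather than typed out. This is only a small sketch in plain JavaScript; the names BUBBLE_MINUTES and bubbleStarts are made up for illustration.

```javascript
// Start minute of every 3-minute bubble within one hour.
// BUBBLE_MINUTES and bubbleStarts are illustrative names, not part of any API.
const BUBBLE_MINUTES = 3; // window size taken from the question
const bubbleStarts = Array.from(
  { length: 60 / BUBBLE_MINUTES },
  (_, i) => i * BUBBLE_MINUTES
);
console.log(bubbleStarts.join(", "));
```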
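Now, to group these documents, we need to modify the timestamp slightly. As far as I know, this is not currently possible with the aggregation framework, so I will use the group() method instead.
To group fruit that fall in the same time period, we need to round each timestamp down to the start of its 3-minute "bubble". We can do this with timestamp.minutes -= (timestamp.minutes % 3).
Here is the resulting query: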
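To make the rounding concrete, here is a minimal sketch in plain JavaScript. The helper name floorToBubble and its bubbleMinutes parameter are hypothetical, and it uses the standard Date object so it also runs outside the mongo shell.

```javascript
// Hypothetical helper: round a timestamp down to the start of its bubble.
// Assumes bubbleMinutes divides 60 evenly, so bubbles never straddle an hour.
function floorToBubble(timestamp, bubbleMinutes = 3) {
  const d = new Date(timestamp); // copies a Date or parses an ISO string
  d.setUTCSeconds(0, 0);         // zero out seconds and milliseconds
  d.setUTCMinutes(d.getUTCMinutes() - (d.getUTCMinutes() % bubbleMinutes));
  return d;
}

console.log(floorToBubble("2014-07-17T06:47:18Z").toISOString()); // 2014-07-17T06:45:00.000Z
console.log(floorToBubble("2014-07-17T06:55:18Z").toISOString()); // 2014-07-17T06:54:00.000Z
```

Note that these are fixed, clock-aligned buckets rather than a window that starts at the first entry: 06:47:50 and 06:48:10 land in different bubbles even though they are only 20 seconds apart.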
db.collection.group({
    // Build the grouping key: the fruit plus the start of its 3-minute bubble.
    keyf: function (doc) {
        var timestamp = new ISODate(doc.timestamp);
        // seconds (and milliseconds) must be equal across a 'bubble'
        timestamp.setUTCSeconds(0, 0);
        // round down to the nearest 3 minute 'bubble'
        var remainder = timestamp.getUTCMinutes() % 3;
        var bubbleMinute = timestamp.getUTCMinutes() - remainder;
        timestamp.setUTCMinutes(bubbleMinute);
        return { fruit: doc.fruit, 'timestamp': timestamp };
    },
    // Count how many documents fall into each (fruit, bubble) group.
    reduce: function (curr, result) {
        result.sum += 1;
    },
    initial: {
        sum : 0
    }
});
Sample result:
[
{
"fruit" : "apple",
"timestamp" : ISODate("2014-07-17T06:45:00Z"),
"sum" : 2
},
{
"fruit" : "apple",
"timestamp" : ISODate("2014-07-17T06:54:00Z"),
"sum" : 1
},
{
"fruit" : "banana",
"timestamp" : ISODate("2014-07-17T09:03:00Z"),
"sum" : 1
},
{
"fruit" : "orange",
"timestamp" : ISODate("2014-07-17T14:24:00Z"),
"sum" : 2
}
]
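To see how the key function and the reducer fit together without a running server, the grouping can be emulated in memory. This is only a sketch: bubbleKey and groupByBubble are hypothetical names, and the input is the three documents from the question.

```javascript
// The three documents from the question.
const docs = [
  { fruit: "apple", timestamp: "2014-07-17T06:45:18Z" },
  { fruit: "apple", timestamp: "2014-07-17T06:47:18Z" },
  { fruit: "apple", timestamp: "2014-07-17T06:55:18Z" },
];

// Same role as keyf above: fruit plus the start of the 3-minute bubble.
function bubbleKey(doc) {
  const d = new Date(doc.timestamp);
  d.setUTCSeconds(0, 0);
  d.setUTCMinutes(d.getUTCMinutes() - (d.getUTCMinutes() % 3));
  return { fruit: doc.fruit, timestamp: d.toISOString() };
}

// Same role as initial + reduce above: start each group at 0 and count.
function groupByBubble(documents) {
  const groups = new Map();
  for (const doc of documents) {
    const key = bubbleKey(doc);
    const id = key.fruit + "|" + key.timestamp;
    if (!groups.has(id)) groups.set(id, { ...key, sum: 0 });
    groups.get(id).sum += 1;
  }
  return [...groups.values()];
}

const grouped = groupByBubble(docs);
console.log(grouped);
```

For the question's data this yields two groups: the 06:45 bubble with a sum of 2 and the 06:54 bubble with a sum of 1.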
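To simplify this, you can precompute the "bubble" timestamp and insert it into each document as a separate field. The documents you create would then look like this: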
[
{"fruit" : "apple", "timestamp": "2014-07-17T06:45:18Z", "bubble": "2014-07-17T06:45:00Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:47:18Z", "bubble": "2014-07-17T06:45:00Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:55:18Z", "bubble": "2014-07-17T06:54:00Z"}
]
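One way the bubble field might be filled in before a document is written is sketched below. The helper name withBubble is hypothetical, and the string format simply mirrors the timestamp field shown above.

```javascript
// Hypothetical helper: return a copy of the document with a "bubble" field added.
function withBubble(doc) {
  const bubble = new Date(doc.timestamp);
  bubble.setUTCSeconds(0, 0);
  bubble.setUTCMinutes(bubble.getUTCMinutes() - (bubble.getUTCMinutes() % 3));
  // Drop the ".000" milliseconds so the format matches the timestamp field.
  return { ...doc, bubble: bubble.toISOString().replace(".000Z", "Z") };
}

const prepared = withBubble({ fruit: "apple", timestamp: "2014-07-17T06:55:18Z" });
console.log(prepared);
// In the mongo shell, the prepared document would then be inserted as usual,
// for example with db.collection.insert(prepared).
```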
Of course, this takes up more storage. With this document structure, however, you can use the aggregation framework [0]:
db.collection.aggregate(
    [
        { $group: { _id: { fruit: "$fruit", bubble: "$bubble" }, sum: { $sum: 1 } } }
    ]
)
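On newer servers the rounding may be possible inside the pipeline itself, without a stored bubble field. The sketch below assumes MongoDB 5.0 or later, where the $dateTrunc operator is available, and assumes the timestamps are stored as strings that $toDate can convert; neither assumption comes from the original answer.

```javascript
// Assumption: MongoDB 5.0+ ($dateTrunc) and string timestamps ($toDate).
const pipeline = [
  {
    $group: {
      _id: {
        fruit: "$fruit",
        // Truncate each timestamp to the start of its 3-minute bin.
        bubble: {
          $dateTrunc: {
            date: { $toDate: "$timestamp" },
            unit: "minute",
            binSize: 3,
          },
        },
      },
      sum: { $sum: 1 },
    },
  },
];

// In the shell this would be run as: db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```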
Hope that helps!
[0] MongoDB aggregation comparison: group(), $group and MapReduce