Note: I have included only a few documents in the output to keep the post short and easy to follow.
Source collection:
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Channel_Id" : 2,
"TweetId" : 15
},
"PostDate" : ISODate("2013-10-31T18:30:00Z")
}
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Channel_Id" : 2,
"TweetId" : 16
},
"PostDate" : ISODate("2013-10-31T18:30:00Z")
}
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Channel_Id" : 2,
"TweetId" : 17
},
"PostDate" : ISODate("2013-10-30T18:30:00Z")
}
Step 1: Group by PostDate
Query:
db.Twitter_Processed.aggregate(
  { $match : {
    "_id.SpId" : 840,
    "_id.Scheduler_Id" : 1
  }},
  { $project : {
    SpId : "$_id.SpId",
    Scheduler_Id : "$_id.Scheduler_Id",
    day : { $dayOfMonth : '$PostDate' },
    month : { $month : '$PostDate' },
    year : { $year : '$PostDate' },
    senti : "$Sentiment"
  }},
  { $group : {
    _id : {
      SpId : "$SpId",
      Scheduler_Id : "$Scheduler_Id",
      day : '$day',
      month : '$month',
      year : '$year'
    },
    sentiment : { $sum : "$senti" }
  }},
  { $group : {
    _id : "$_id",
    avgSentiment : { $avg : "$sentiment" }
  }}
)
Output:
{
"result" : [
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"day" : 31,
"month" : 10,
"year" : 2013
},
"avgSentiment" : 2.2700000000000005
},
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"day" : 30,
"month" : 10,
"year" : 2013
},
"avgSentiment" : 4.96
}
]
}
Step 2: This is the output I am trying to achieve:
{
"result" : [
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Date" : ISODate("2013-10-31T18:30:00Z")
},
"avgSentiment" : 2.2700000000000005
},
{
"_id" : {
"SpId" : 840,
"Scheduler_Id" : 1,
"Date" : ISODate("2013-10-30T18:30:00Z")
},
"avgSentiment" : 4.96
}
]
}
The query I tried:
db.Twitter_Processed.aggregate(
  { $match : {
    "_id.SpId" : 840,
    "_id.Scheduler_Id" : 1
  }},
  { $project : {
    SpId : "$_id.SpId",
    Scheduler_Id : "$_id.Scheduler_Id",
    day : { $dayOfMonth : '$PostDate' },
    month : { $month : '$PostDate' },
    year : { $year : '$PostDate' },
    senti : "$Sentiment"
  }},
  { $group : {
    _id : {
      SpId : "$SpId",
      Scheduler_Id : "$Scheduler_Id",
      day : '$day',
      month : '$month',
      year : '$year'
    },
    sentiment : { $sum : "$senti" }
  }},
  { $group : {
    _id : "$_id",
    avgSentiment : { $avg : "$sentiment" }
  }},
  { $project : {
    _id : {
      SpId : "$_id.SpId",
      Scheduler_Id : "$_id.Scheduler_Id",
      date : new Date("$_id.year", "$_id.month", "$_id.day")
    },
    avgSentiment : "$avgSentiment"
  }}
)
Output (error):
Error: Printing Stack Trace
at printStackTrace (src/mongo/shell/utils.js:37:15)
at DBCollection.aggregate (src/mongo/shell/collection.js:897:9)
at (shell):1:22
Tue Dec 31 09:41:42.916 JavaScript execution failed: aggregate failed: {
"errmsg" : "exception: disallowed field type Date in object expression (at 'date')",
"code" : 15992,
"ok" : 0
} at src/mongo/shell/collection.js:L898
How can I achieve Step 2?
Answer 0 (score: 3)
As you have noticed, the Aggregation Framework (in MongoDB 2.4) has operators to extract parts of a date, but it cannot easily create a date field.
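A likely reading of the error in the question: the mongo shell is a JavaScript interpreter, so `new Date(...)` is evaluated on the client while the pipeline is being built, before anything is sent to the server. The strings passed to it are plain JavaScript strings at that point, not field references. The sketch below is ordinary JavaScript (written for Node.js, no MongoDB required) and only illustrates that client-side evaluation; the variable names are made up for the example.

```javascript
// Plain JavaScript, no MongoDB involved: this is what the shell evaluates
// while building the pipeline object, before the server sees any document.
const attempted = new Date("$_id.year", "$_id.month", "$_id.day");

// Each argument is coerced to a number. "$_id.year" is not numeric, so it
// becomes NaN and the whole Date is invalid.
const isInvalid = Number.isNaN(attempted.getTime());

console.log(isInvalid); // true
```

So the server never receives an expression that builds a date per document; it receives a single literal Date value, which the 2.4 `$project` stage rejects.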
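There is a great blog post, Stupid date tricks with Aggregation Framework, that offers a creative workaround: use $project stages before the $group to truncate the date to the granularity you need: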
db.Twitter_Processed.aggregate(
// Match (can take advantage of suitable index)
{ $match : {
"_id.SpId" : 840,
"_id.Scheduler_Id" : 1
}},
// Extract h/m/s/ms values from PostDate for rounding
{ $project: {
SpId : "$_id.SpId",
Scheduler_Id : "$_id.Scheduler_Id",
PostDate : "$PostDate",
h : { "$hour" : "$PostDate" },
m : { "$minute" : "$PostDate" },
s : { "$second" : "$PostDate" },
ms : { "$millisecond" : "$PostDate" },
senti : "$Sentiment"
}},
// Subtract the h/m/s/ms values to round the date off to yyyy-mm-dd
{ $project: {
SpId : "$_id.SpId",
Scheduler_Id : "$_id.Scheduler_Id",
// PostDate will end up truncated to yyyy-mm-dd granularity
PostDate: {
"$subtract" : [
"$PostDate",
{
"$add" : [
"$ms",
{ "$multiply" : [ "$s", 1000 ] },
{ "$multiply" : [ "$m", 60, 1000 ] },
{ "$multiply" : [ "$h", 60, 60, 1000 ]}
]
}
]
},
// The previous $project renamed Sentiment to senti, so carry senti forward
senti: "$senti"
}},
{ $group : {
_id : {
SpId : "$SpId",
Scheduler_Id : "$Scheduler_Id",
PostDate: "$PostDate"
},
sentiment : { $sum : "$senti"}
}},
{ $group : {
_id : "$_id" ,
avgSentiment : {$avg : "$sentiment"}
}}
)
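The subtraction in the second $project can be checked outside MongoDB. The following is a minimal sketch in plain JavaScript (Node.js), assuming the date operators work in UTC as they do in MongoDB 2.4; `truncateToDay` is a hypothetical helper name used only for this illustration.

```javascript
// Mirrors the pipeline arithmetic: subtract the hour, minute, second and
// millisecond parts (expressed in milliseconds) from the original date.
function truncateToDay(postDate) {
  const h = postDate.getUTCHours();
  const m = postDate.getUTCMinutes();
  const s = postDate.getUTCSeconds();
  const ms = postDate.getUTCMilliseconds();

  // Same terms as the $add expression in the pipeline.
  const offsetMs = ms + s * 1000 + m * 60 * 1000 + h * 60 * 60 * 1000;

  return new Date(postDate.getTime() - offsetMs);
}

// One of the PostDate values from the source collection.
const truncated = truncateToDay(new Date("2013-10-31T18:30:00Z"));
console.log(truncated.toISOString()); // 2013-10-31T00:00:00.000Z
```

Note that the grouped dates therefore come out at midnight UTC rather than at 18:30 as written in the desired output of Step 2.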