I have a collection in MongoDB whose documents look like this:
{
    "_id" : ObjectId("54901212f315dce7077204af"),
    "Date" : ISODate("2014-10-20T04:00:00.000Z"),
    "Type" : "Twitter",
    "Entities" : [
        {
            "ID" : 2,
            "Name" : "test1",
            "Sentiment" : {
                "Value" : 20,
                "Neutral" : 1
            }
        },
        {
            "ID" : 1,
            "Name" : "test1",
            "Sentiment" : {
                "Value" : 1,
                "Neutral" : 1
            }
        },
        {
            "ID" : 3,
            "Name" : "test1",
            "Sentiment" : {
                "Value" : 2,
                "Neutral" : 1
            }
        }
    ]
}
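I have several such documents. For example, on 2014-10-20 you may find 5 tweets, each with a different sentiment value. What I want to do is group by date, then take the sum of the sentiment values for each date and multiply it by the number of documents for that date. For example, suppose there are 2 documents on 2014-10-20: one with sentiment values 20, 1, 2, as in the document shown above, and another with only the value 5. The value for 2014-10-20 would then be (20 + 1 + 2 + 5) * 3 (because this tweet is repeated 3 times) * 2 (because we have 2 tweet documents on this date) = 168. If I ignore the number of documents, my code works fine, as follows: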
DBCollection collectionG = db.getCollection("GraphDataCollection");
List<DBObject> stages = new ArrayList<DBObject>();
// $unwind emits one copy of the document per element of the Entities array
DBObject unwind = new BasicDBObject("$unwind", "$Entities");
DBObject groupFields = new BasicDBObject( "_id", "$Date");
groupFields.put("value", new BasicDBObject( "$sum", "$Entities.Sentiment.Value"));
DBObject groupBy = new BasicDBObject("$group", groupFields );
DBObject sort = new BasicDBObject("$sort", new BasicDBObject("Date", 1));
stages.add(unwind);
stages.add(groupBy);
DBObject project = new BasicDBObject("_id",0);
project.put("Date","$_id");
project.put("value",1);
stages.add(new BasicDBObject("$project",project));
stages.add(sort);
AggregationOutput output = collectionG.aggregate(stages);
For 2014-10-20, for example, this currently returns 28, but I want 168. Can anyone help?
Update: the latest version of the code I am using is as follows:
DBCollection collectionG;
collectionG = db.getCollection("GraphDataCollection");
List<DBObject> stages = new ArrayList<DBObject>();
ArrayList<DBObject> andArray = null;
DBObject groupFields = new BasicDBObject( "_id", "$_id");
groupFields.put("value", new BasicDBObject( "$sum", "$Entities.Sentiment.Value"));
groupFields.put("date", new BasicDBObject( "$first", "$Date"));
DBObject groupBy = new BasicDBObject("$group", groupFields );
stages.add(groupBy);
DBObject groupByDate = new BasicDBObject( "_id", "$date");
groupByDate.put("value",new BasicDBObject("$sum","$value"));
groupByDate.put("count",new BasicDBObject("$sum",1));
DBObject dtGrp = new BasicDBObject("$group", groupByDate );
stages.add(dtGrp);
DBObject project = new BasicDBObject("_id",1);
project.put("value",new BasicDBObject("$multiply",
new Object[]{"$value","$count"}));
stages.add(new BasicDBObject("$project",project));
AggregationOutput output = collectionG.aggregate(stages);
System.out.println(output.results());
Answer 0 (score: 2)
Unwind Entities:
DBObject unwind = new BasicDBObject("$unwind", "$Entities");
stages.add(unwind);
Group by _id to find the sum of all entity sentiment values per document:
DBObject groupFields = new BasicDBObject( "_id", "$_id");
groupFields.put("value", new BasicDBObject( "$sum", "$Entities.Sentiment.Value"));
groupFields.put("date", new BasicDBObject( "$first", "$Date"));
DBObject groupBy = new BasicDBObject("$group", groupFields );
stages.add(groupBy);
Group by Date now, to get the sum of the per-document totals and the count of documents in each group:
DBObject groupByDate = new BasicDBObject( "_id", "$date");
groupByDate.put("value",new BasicDBObject("$sum","$value"));
groupByDate.put("count",new BasicDBObject("$sum",1));
DBObject dtGrp = new BasicDBObject("$group", groupByDate );
stages.add(dtGrp);
Project the product of count and value as value, for each group:
DBObject project = new BasicDBObject("_id",1);
project.put("value",new BasicDBObject("$multiply",
new Object[]{"$value","$count"}));
stages.add(new BasicDBObject("$project",project));
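To sanity-check what the unwind, group, group, and project stages compute, the following sketch replays them in memory with plain Java collections instead of MongoDB. It is only an illustration: the names `PipelineSketch`, `Doc`, and `weightedTotals` are hypothetical, and the sample data assume the document from the question plus a second document on the same date with a single entity of value 5.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PipelineSketch {
    // Hypothetical stand-in for one stored document: a date and its entity sentiment values.
    public static final class Doc {
        final String date;
        final int[] sentimentValues;

        public Doc(String date, int... sentimentValues) {
            this.date = date;
            this.sentimentValues = sentimentValues;
        }
    }

    // Returns date -> (sum of all sentiment values on that date) * (number of documents on that date).
    public static Map<String, Long> weightedTotals(List<Doc> docs) {
        Map<String, Long> valueByDate = new LinkedHashMap<>();
        Map<String, Long> countByDate = new LinkedHashMap<>();
        for (Doc doc : docs) {
            // Unwind + first group: sum the entity values of this one document.
            long perDocument = 0;
            for (int v : doc.sentimentValues) {
                perDocument += v;
            }
            // Second group: accumulate the per-date total and the per-date document count.
            valueByDate.merge(doc.date, perDocument, Long::sum);
            countByDate.merge(doc.date, 1L, Long::sum);
        }
        // Project: multiply value by count for each date.
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : valueByDate.entrySet()) {
            result.put(e.getKey(), e.getValue() * countByDate.get(e.getKey()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("2014-10-20", 20, 1, 2)); // the document from the question
        docs.add(new Doc("2014-10-20", 5));        // assumed second document
        System.out.println(weightedTotals(docs));
    }
}
```

With these two documents the per-date sum is 28 and the document count is 2, so the result for 2014-10-20 is 56. The extra factor of 3 in the question's example is not applied by these stages.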
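If your dates differ by a few milliseconds, you need to group by day, month, and year in the second group stage, and add a sort stage if necessary.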
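One way to build such a key is with the `$year`, `$month`, and `$dayOfMonth` aggregation operators, applied to the `date` field produced by the first group stage. The sketch below only shows the shape of that stage. It uses plain `LinkedHashMap` objects as stand-ins for `BasicDBObject` so that it runs without the driver, and the names `DayGroupKey`, `op`, `dayKey`, and `groupByDay` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DayGroupKey {
    // Builds {"<operator>": "<fieldPath>"}, e.g. {"$year": "$date"}.
    public static Map<String, Object> op(String operator, String fieldPath) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(operator, fieldPath);
        return m;
    }

    // Compound _id that keeps only the calendar day of the "date" field,
    // so timestamps that differ by milliseconds fall into the same group.
    public static Map<String, Object> dayKey() {
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("year", op("$year", "$date"));
        id.put("month", op("$month", "$date"));
        id.put("day", op("$dayOfMonth", "$date"));
        return id;
    }

    // Second $group stage: per-day sum of the document totals plus the document count.
    public static Map<String, Object> groupByDay() {
        Map<String, Object> count = new LinkedHashMap<>();
        count.put("$sum", 1);

        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("_id", dayKey());
        fields.put("value", op("$sum", "$value"));
        fields.put("count", count);

        Map<String, Object> stage = new LinkedHashMap<>();
        stage.put("$group", fields);
        return stage;
    }

    public static void main(String[] args) {
        System.out.println(groupByDay());
    }
}
```

In the driver code above, the same nesting would be expressed with `BasicDBObject` in place of the maps, and the resulting stage would take the place of `dtGrp`.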