I have data in the following format:
Type C_ID Assitor CollectionDate Granulity Counter
A a 08 08-08-2012 00:00 15 0.9912378
B a 5 08-08-2012 00:00 15 0.1860929
C b 4 08-08-2012 00:00 15 0.5345317
D c 1 08-08-2012 00:15 15 0.8656529
E b 1 08-08-2012 00:15 15 0.3249502
A a 08 08-08-2012 00:15 15 0.3743117
B a 5 08-08-2012 00:30 15 0.2608622
C b 4 08-08-2012 00:30 15 0.0079308
D c 1 08-08-2012 00:30 15 0.7094781
E b 1 08-08-2012 00:45 15 0.6133461
A a 08 08-08-2012 00:45 15 0.3035875
B a 5 08-08-2012 00:45 15 0.6093015
C b 4 08-08-2012 01:00 15 0.4104008
D c 1 08-08-2012 01:00 15 0.1687753
E b 1 08-08-2012 01:00 15 0.6627076
A a 08 08-08-2012 01:15 15 0.1901386 .....
and so on...
I want to run an incremental map-reduce on this table on an hourly basis. CollectionDate is the field that shows when a record arrived. I want all of this code in C#.NET.
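I have already written the mapReduce, but the problem is this: I receive 3 records every 15 minutes, which makes 12 records per hour. After 1 hour those 12 records should be reduced, and after the next hour the following records should be reduced on the same basis.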
Can I get some help with this in C#.NET? I have been stuck on it for the past 20 days.
The source is a redundant CSV file. I read the records from it and insert them into MongoDB using C#. Once inserted, the documents look something like this:
{"_id": a324b2f89d2e98fa21f, "Type": A, "C_ID": a, "assitor": 10, "CollectionDate": 08-08-2012 00:00, "Granulity": 15, "Counter": 0.1901386}
{"_id": a324b2f89d2e98a216f, "Type": B, "C_ID": a, "assitor": 10, "CollectionDate": 08-08-2012 00:00, "Granulity": 15, "Counter": 0.1233542}
{"_id": a324b2f89d2e98a3f2c, "Type": A, "C_ID": b, "assitor": 12, "CollectionDate": 08-08-2012 00:15, "Granulity": 12, "Counter": 0.8134552}
{"_id": a324b2f89d2e98b4e2d, "Type": B, "C_ID": b, "assitor": 12, "CollectionDate": 08-08-2012 00:15, "Granulity": 12, "Counter": 0.3218547}
OUTPUTFILE:
{"_id": a8f3e231d456a675b23c, "CollectionDate": 08-08-2012 00:00, "AvgCounter": }
{"_id": a8f3e232456a675a42cd, "CollectionDate": 08-08-2012 01:00, "AvgCounter": }
{"_id": a8f3e231d46a67a0b4d2, "CollectionDate": 08-08-2012 02:00, "AvgCounter": }
That is, an hourly aggregation.
This is what I have done so far:
private static void MapReduce(MongoDatabase db, String collName, BsonValue bsonValue, DateTime oldDateTime, DateTime newDateTime)
{
    var collection = db.GetCollection<BsonDocument>(collName);
    Console.WriteLine(TotalReduction++);
    String map = @"function() {
        var sample = this;
        emit(sample.CollectionDate, {CID: sample.C_ID, count:1, CollectionTime: sample.CollectionDate});
    }";
    String reduce = @"function(key, values) {
        var result = {CID: '', count:0};
        values.forEach(function(value){
            result.CID += value.CID;
            result.count += value.count;
            result.CollectionTime = value.CollectionTime;
        });
        return result;
    }";
    var options = new MapReduceOptionsBuilder();
    IMongoQuery[] queries = { Query.EQ("CollectionTime", bsonValue) };
    options.SetOutput(MapReduceOutput.Inline);
    IMongoQuery query = Query.And(queries);
    var results = collection.MapReduce(queries[0], map, reduce);
    collection = db.GetCollection<BsonDocument>("MSS_REDUCE");
    IEnumerable<BsonDocument> bdoc = results.GetResultsAs<BsonDocument>();
    collection.InsertBatch<BsonDocument>(bdoc);
}
Thanks, Ravi Sharma
Answer 0 (score: 0)
I'm still a bit unclear on your dataset, but I can point out one thing that will hopefully help.
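In your map function, you are not emitting the date you actually want to group by. You are emitting the full date, including the minutes. Instead, you should emit a key with the minutes, seconds, and milliseconds removed. Also, if you intend to group by C_ID, you need to emit that as part of the key as well.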
String map = @"function() {
    var sample = this;
    var d = sample.CollectionDate;
    var newCollectionDate = new Date(d.getFullYear(), d.getMonth(), d.getDate(), d.getHours(), 0, 0, 0);
    emit({C_ID: sample.C_ID, CollectionDate: newCollectionDate}, {C_ID: sample.C_ID, Count: 1, CounterSum: sample.Counter, CounterAverage: 0, CollectionDate: newCollectionDate});
}";
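The truncation used in the map function can be tried outside MongoDB. The following is only a minimal sketch in plain JavaScript: `truncateToHour` is a hypothetical helper name, the sample timestamps are made up to match the 15-minute spacing in the question, and it assumes CollectionDate is stored as a real date rather than a string.

```javascript
// Hypothetical helper mirroring the truncation done inside map().
// Assumes its argument is a JavaScript Date, as it would be if
// CollectionDate were stored as a BSON date rather than a string.
function truncateToHour(d) {
  return new Date(d.getFullYear(), d.getMonth(), d.getDate(), d.getHours(), 0, 0, 0);
}

// Four 15-minute samples from the same hour (months are 0-based, so 7 = August).
const samples = [
  new Date(2012, 7, 8, 0, 0),
  new Date(2012, 7, 8, 0, 15),
  new Date(2012, 7, 8, 0, 30),
  new Date(2012, 7, 8, 0, 45),
];

// All four collapse to the same hour bucket, so they would share one key.
const buckets = new Set(samples.map(d => truncateToHour(d).getTime()));
console.log(buckets.size); // prints 1
```

If CollectionDate were stored as a string such as the ones shown in the question, calls like `d.getFullYear()` would fail, so the value would have to be converted to a date first.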
Then your reduce function needs to keep track of both the count and the sum of the values.
String reduce = @"function(key, values) {
    var result = {C_ID: key.C_ID, Count:0, CounterSum: 0, CounterAverage: 0, CollectionDate: key.CollectionDate};
    values.forEach(function(value){
        result.Count += value.Count;
        result.CounterSum += value.CounterSum;
    });
    return result;
}";
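One reason to carry a count and a sum instead of a running average: MongoDB may call reduce more than once for the same key, feeding an earlier reduce output back in as one of the values. The sketch below runs the same reduce logic in plain JavaScript to illustrate this. The key and the counter values 1, 2, 3, 6 are made up for the illustration and do not come from the question's data.

```javascript
// Same logic as the reduce function above, written as plain JavaScript.
function reduce(key, values) {
  const result = { C_ID: key.C_ID, Count: 0, CounterSum: 0, CounterAverage: 0, CollectionDate: key.CollectionDate };
  values.forEach(function (value) {
    result.Count += value.Count;
    result.CounterSum += value.CounterSum;
  });
  return result;
}

// Illustrative key and emitted values (made-up counters: 1, 2, 3, 6).
const key = { C_ID: "a", CollectionDate: "2012-08-08T00:00" };
const emitted = [1, 2, 3, 6].map(c => ({
  C_ID: "a", Count: 1, CounterSum: c, CounterAverage: 0, CollectionDate: key.CollectionDate,
}));

// Reducing everything at once...
const allAtOnce = reduce(key, emitted);
// ...gives the same totals as reducing a partial result together with the rest.
const partial = reduce(key, emitted.slice(0, 3));
const twoStage = reduce(key, [partial, emitted[3]]);
console.log(allAtOnce.Count, allAtOnce.CounterSum); // prints 4 12
console.log(twoStage.Count, twoStage.CounterSum);   // prints 4 12

// Averaging partial averages would not survive a second reduce pass.
const averageOfAverages = ((1 + 2 + 3) / 3 + 6) / 2;        // 4
const trueAverage = twoStage.CounterSum / twoStage.Count;   // 3
console.log(averageOfAverages, trueAverage); // prints 4 3
```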
You will also need a finalize method to compute the average:
String finalize = @"function(key, value) {
    if(value.Count > 0) {
        value.CounterAverage = value.CounterSum / value.Count;
    }
    return value;
}";
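To check how the three functions fit together without a database, they can be run against a small in-memory stand-in. This is only a sketch under stated assumptions: the `emit` function and the `groups` map are a hypothetical harness, not MongoDB's implementation, and the documents are five rows from the question with CollectionDate assumed to be a real date (read as 8 August 2012).

```javascript
// Tiny in-memory stand-in for the map-reduce pipeline (illustration only).
const groups = new Map();
function emit(key, value) {
  const k = JSON.stringify(key); // Dates serialize to ISO strings, good enough for grouping
  if (!groups.has(k)) groups.set(k, { key: key, values: [] });
  groups.get(k).values.push(value);
}

// The three functions from above, as plain JavaScript.
function map() {
  const d = this.CollectionDate;
  const hour = new Date(d.getFullYear(), d.getMonth(), d.getDate(), d.getHours(), 0, 0, 0);
  emit({ C_ID: this.C_ID, CollectionDate: hour },
       { C_ID: this.C_ID, Count: 1, CounterSum: this.Counter, CounterAverage: 0, CollectionDate: hour });
}
function reduce(key, values) {
  const result = { C_ID: key.C_ID, Count: 0, CounterSum: 0, CounterAverage: 0, CollectionDate: key.CollectionDate };
  values.forEach(function (value) {
    result.Count += value.Count;
    result.CounterSum += value.CounterSum;
  });
  return result;
}
function finalize(key, value) {
  if (value.Count > 0) {
    value.CounterAverage = value.CounterSum / value.Count;
  }
  return value;
}

// Five rows taken from the question's table (months are 0-based, so 7 = August).
const docs = [
  { C_ID: "a", CollectionDate: new Date(2012, 7, 8, 0, 0),  Counter: 0.9912378 },
  { C_ID: "a", CollectionDate: new Date(2012, 7, 8, 0, 0),  Counter: 0.1860929 },
  { C_ID: "b", CollectionDate: new Date(2012, 7, 8, 0, 0),  Counter: 0.5345317 },
  { C_ID: "a", CollectionDate: new Date(2012, 7, 8, 0, 15), Counter: 0.3743117 },
  { C_ID: "b", CollectionDate: new Date(2012, 7, 8, 1, 0),  Counter: 0.4104008 },
];

// Run map on every document, then reduce and finalize each group.
docs.forEach(doc => map.call(doc));
const results = Array.from(groups.values())
  .map(g => finalize(g.key, reduce(g.key, g.values)));

// One line per (C_ID, hour) group: C_ID, hour, record count, average counter.
results.forEach(r => console.log(r.C_ID, r.CollectionDate.getHours(), r.Count, r.CounterAverage));
```

With these rows the harness yields three groups: C_ID a in hour 0 (three records), C_ID b in hour 0 (one record), and C_ID b in hour 1 (one record). Dropping C_ID from the emitted key would instead give one group per hour, which is closer to the output shown in the question.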
Hopefully this gets you where you need to go.