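I am running a map-reduce job to count the population by state, and the raw output contains extra documents. When I checked, I found that the mappers generate far more intermediate data than there is original data in MongoDB. How can I fix this? The source collection contains 29468 documents in total.
Sample documents from the dataset: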
{ "city" : "SPLENDORA", "loc" : [ -95.199308, 30.232609 ], "pop" : 11287, "state" : "TX", "_id" : "77372" }
{ "city" : "SPRING", "loc" : [ -95.377329, 30.053241 ], "pop" : 33118, "state" : "TX", "_id" : "77373" }
{ "city" : "TOMBALL", "loc" : [ -95.62006, 30.073923 ], "pop" : 19801, "state" : "TX", "_id" : "77375" }
{ "city" : "WILLIS", "loc" : [ -95.497583, 30.432025 ], "pop" : 9988, "state" : "TX", "_id" : "77378" }
{ "city" : "KLEIN", "loc" : [ -95.528481, 30.023377 ], "pop" : 35275, "state" : "TX", "_id" : "77379" }
{ "city" : "CONROE", "loc" : [ -95.492392, 30.225725 ], "pop" : 1635, "state" : "TX", "_id" : "77384" }
Map function:
var m=function(){ emit(this.city,this.pop);}
Reduce function:
var r=function(c,p){ return p;}
Map-reduce output to a new collection:
{ "_id" : "81080", "value" : 172 }
{ "_id" : "81250", "value" : 467 }
{ "_id" : "82057", "value" : 60 }
{ "_id" : "95411", "value" : 133 }
{ "_id" : "95414", "value" : 226 }
{ "_id" : "95440", "value" : 2876 }
{ "_id" : "95455", "value" : 843 }
{ "_id" : "95467", "value" : 328 }
{ "_id" : "95489", "value" : 358 }
{ "_id" : "95495", "value" : 367 }
{ "_id" : "98791", "value" : 5345 }
{ "_id" : "PLEASANT GROVE", "value" : [ 8458, 15703, 80, 772,
{ "_id" : "POINTBLANK", "value" : 2911 }
{ "_id" : "PORTER", "value" : [ 13541, 19024, 985, 425, 2705 ]
{ "_id" : "SHEPHERD", "value" : [ 9604, 17397, 2078 ] }
{ "_id" : "SPLENDORA", "value" : 11287 }
{ "_id" : "SPRING", "value" : [ 33118, 8379, 21805, 8540 ] }
{ "_id" : "TOMBALL", "value" : 19801 }
{ "_id" : "WILLIS", "value" : [ 9988, 2769, 2574 ] }
{ "_id" : "KLEIN", "value" : 35275 }
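Answer 0 (score: 0)
Your output is not what you expect because your reduce function is incorrect. The prototype of a reduce function is function(key, values) {...}, where values is the array of values associated with key.
Your reduce function returns the values array unchanged instead of reducing it. That is why keys with several emitted values end up with an array in the output, while keys with a single value keep the bare number: MongoDB does not call reduce for a key that has only one value.
To sum the values for a given key, your reduce() function should look like this: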
var r = function(key, values) {
    return Array.sum(values);
};
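To make the difference concrete, here is a minimal sketch that imitates the grouping step in plain JavaScript, outside MongoDB. The names simulateMapReduce, groups and the local emit are hypothetical stand-ins and not MongoDB APIs, and Array.sum (a mongo shell helper) is replaced by Array.prototype.reduce so the code runs anywhere. The SPRING and TOMBALL populations are the ones shown in the question's output.

```javascript
// Intermediate data: key -> array of emitted values (stand-in for MongoDB's internals).
let groups;

// Hypothetical local emit(), mimicking the one MongoDB provides to map functions.
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// Hypothetical helper: run map over every document, then reduce each key.
function simulateMapReduce(docs, map, reduce) {
  groups = new Map();
  docs.forEach((doc) => map.call(doc)); // `this` is the current document
  const out = {};
  for (const [key, values] of groups) {
    // Like MongoDB, skip reduce when a key has only one value.
    out[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return out;
}

const docs = [
  { city: "SPRING", pop: 33118 },
  { city: "SPRING", pop: 8379 },
  { city: "SPRING", pop: 21805 },
  { city: "SPRING", pop: 8540 },
  { city: "TOMBALL", pop: 19801 },
];

const mapByCity = function () { emit(this.city, this.pop); };

// The reduce from the question: returns the array unchanged.
const brokenReduce = function (key, values) { return values; };

// A summing reduce: collapses the array into one number.
const summingReduce = function (key, values) {
  return values.reduce((total, v) => total + v, 0);
};

const broken = simulateMapReduce(docs, mapByCity, brokenReduce);
const fixed = simulateMapReduce(docs, mapByCity, summingReduce);

console.log(broken); // SPRING is still an array, TOMBALL is a bare number
console.log(fixed);  // SPRING: 71842, TOMBALL: 19801
```

With the broken reduce, SPRING keeps all four values, which matches the array seen in the question's output. With the summing reduce, each key ends up with a single number.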
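If you want to compute the population by state, your map() function is also wrong: you should emit the state and population rather than the city and population: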
var m = function() {
    emit(this.state, this.pop);
};
Putting it all together, your output should look like:
{
"_id" : "AK",
"value" : 550043
},
{
"_id" : "AL",
"value" : 4040587
},
{
"_id" : "AR",
"value" : 2350725
}
...
The MongoDB manual provides more details on writing and testing reduce functions.
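As a rough illustration of that kind of testing, the sketch below checks three properties a reduce function is expected to have, because MongoDB may call it more than once for the same key. The result must have the same type as the emitted values, re-reducing a partial result must give the same answer, and the order of the values must not matter. This is plain JavaScript run outside the shell, so Array.sum is replaced by Array.prototype.reduce, and the name reduceSum is an assumption made for this example. The sample values are the WILLIS populations from the question's output.

```javascript
// Summing reduce, written without the mongo shell's Array.sum helper.
const reduceSum = function (key, values) {
  return values.reduce((total, v) => total + v, 0);
};

// Populations listed for WILLIS in the question's output.
const values = [9988, 2769, 2574];

// 1. The result is a number, the same type as each emitted value.
const total = reduceSum("WILLIS", values);

// 2. Re-reducing: feed a partial result back in with the remaining value.
const partial = reduceSum("WILLIS", values.slice(0, 2));
const rereduced = reduceSum("WILLIS", [partial, values[2]]);

// 3. Order independence: reversing the input gives the same total.
const reversed = reduceSum("WILLIS", [...values].reverse());

console.log(typeof total, total === rereduced, total === reversed);
```

The reduce from the question fails the first check, since it returns an array where a number was emitted.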