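I am trying to convert my aggregation query into Map-Reduce in MongoDB (using the Ruby driver). In my original query, I search for elements within a certain distance of a point and then group them by their NEIGHBOURHOOD value.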
Sample document
{
"_id":"5b01a2c77b61e58732920f86",
"YEAR":2004,
"NEIGHBOURHOOD":"Grandview-Woodland",
"LOC":{"type":"Point","coordinates":[-123.067654,49.26773386]}
}
Aggregation query
crime.aggregate([
{ "$geoNear": {
"near": {
"type": "Point",
"coordinates": [ -123.0837633, 49.26980201 ]
},
"query": { "YEAR": 2004 },
"distanceField": "distance",
"minDistance": 10,
"maxDistance": 10000,
"num": 100000,
"spherical": true
}},
{ "$group": {
"_id": "$NEIGHBOURHOOD",
"count": { "$sum": 1 }
}}
])
A fragment of the output looks like this:
Output
{"_id"=>"Musqueam", "count"=>80}
{"_id"=>"West Point Grey", "count"=>651}
{"_id"=>"Marpole", "count"=>1367}
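Now I am trying to do the same thing with MapReduce. In my map function, I check whether a document is within the right distance (based on the answer to this question) and, if it is, emit it to the reduce function, where the documents are counted. But something is wrong and I am not getting the expected result: the count values are too large. What am I doing wrong?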
Map function
map = "function() {" +
"var rad_per_deg = Math.PI/180;" +
"var rm = 6371 * 1000;" +
"var dlat_rad = (this.LOC.coordinates[0] - (-123.0837633)) * rad_per_deg;" +
"var dlon_rad = (this.LOC.coordinates[1] - (49.26980201)) * rad_per_deg;" +
"var lat1_rad = -123.0837633 * rad_per_deg;" +
"var lon1_rad = 49.26980201 * rad_per_deg;" +
"var lat2_rad = this.LOC.coordinates[0] * rad_per_deg;" +
"var lon2_rad = this.LOC.coordinates[1] * rad_per_deg;" +
"var a = Math.pow(Math.sin(dlat_rad/2), 2) + Math.cos(lat1_rad) * Math.cos(lat2_rad) * Math.pow(Math.sin(dlon_rad/2), 2);" +
"var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));" +
"if( rm * c < 10000) { " +
" emit(this.NEIGHBOURHOOD, {count: 1});" +
"}" +
"};"
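As a sanity check for the distance test, here is a minimal standalone sketch of the haversine formula that can be run outside MongoDB. It assumes the GeoJSON convention that `coordinates` is stored as `[longitude, latitude]`. The helper name `haversineMeters` and the lower bound of 10 m (mirroring `minDistance` in the aggregation) are illustrative assumptions, not part of the map function above.

```javascript
// Hypothetical helper: great-circle distance in meters between two
// GeoJSON positions. Assumes each position is [longitude, latitude].
function haversineMeters(a, b) {
  var radPerDeg = Math.PI / 180;
  var earthRadiusM = 6371 * 1000; // same mean radius as in the map function
  var lon1 = a[0] * radPerDeg, lat1 = a[1] * radPerDeg;
  var lon2 = b[0] * radPerDeg, lat2 = b[1] * radPerDeg;
  var dLat = lat2 - lat1;
  var dLon = lon2 - lon1;
  var h = Math.pow(Math.sin(dLat / 2), 2) +
          Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLon / 2), 2);
  return earthRadiusM * 2 * Math.atan2(Math.sqrt(h), Math.sqrt(1 - h));
}

// Query point from the aggregation and LOC from the sample document.
var center = [-123.0837633, 49.26980201];
var sample = [-123.067654, 49.26773386];

var d = haversineMeters(center, sample);
// Mirror both bounds of the $geoNear stage (assumed: 10 m to 10000 m).
var inRange = d >= 10 && d < 10000;
console.log(Math.round(d), inRange);
```

Comparing the value this prints with the `distance` field that `$geoNear` reports for the same document is one way to see whether the two distance calculations agree.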
Reduce function
reduce = "function(key, values) { " +
"var sum = 0; " +
"values.forEach(function(f) { " +
" sum += f.count; " +
"}); " +
"return {count: sum};" +
"};"
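MongoDB may call the reduce function more than once for the same key, feeding earlier outputs back in as inputs, so the returned value has to have the same shape as the emitted values. The sketch below checks that property for the reduce logic above outside the database. The name `reduceCounts` and the sample emits are hypothetical.

```javascript
// Same logic as the reduce function above, written as a plain function.
function reduceCounts(key, values) {
  var sum = 0;
  values.forEach(function (v) {
    sum += v.count;
  });
  return { count: sum };
}

// Five hypothetical emits for a single key.
var emitted = [
  { count: 1 }, { count: 1 }, { count: 1 }, { count: 1 }, { count: 1 }
];

// Reduce everything in one call.
var oneShot = reduceCounts("Marpole", emitted);

// Reduce in two batches, then re-reduce the partial results.
var partialA = reduceCounts("Marpole", emitted.slice(0, 2));
var partialB = reduceCounts("Marpole", emitted.slice(2));
var reReduced = reduceCounts("Marpole", [partialA, partialB]);

console.log(oneShot.count, reReduced.count); // prints "5 5"
```

Because both paths give the same total, re-reduction alone should not inflate the counts for this reduce function.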
Query
opts = {
query:{ "YEAR": 2004 },
:out => "results",
:raw => true
}
crime.find().map_reduce(map, reduce, opts)
Output
{"_id"=>"", "value"=>{"count"=>2257.0}}
{"_id"=>"Arbutus Ridge", "value"=>{"count"=>6066.0}}
{"_id"=>"Central Business District", "value"=>{"count"=>110947.0}}