Mongo Map-Reduce: group documents within a given distance by a field value

Date: 2018-05-21 15:51:47

Tags: ruby mongodb mapreduce

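I'm trying to convert my aggregation query into a Map-Reduce in MongoDB (using the Ruby driver). In my original query, I search for documents within a certain distance of a point and then group them by their NEIGHBOURHOOD value.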

Sample document

{
 "_id":"5b01a2c77b61e58732920f86",
 "YEAR":2004,
 "NEIGHBOURHOOD":"Grandview-Woodland",
 "LOC":{"type":"Point","coordinates":[-123.067654,49.26773386]}
}

Aggregation query

crime.aggregate([
  { "$geoNear": {
    "near": {
      "type": "Point",
      "coordinates": [ -123.0837633, 49.26980201 ]
    },
    "query": { "YEAR": 2004 },
    "distanceField": "distance",
    "minDistance": 10,
    "maxDistance": 10000,
    "num": 100000,
    "spherical": true
  }},
  { "$group": {
    "_id": "$NEIGHBOURHOOD",
    "count": { "$sum": 1 }
  }}
])

A fragment of the output looks like this:

Output

{"_id"=>"Musqueam", "count"=>80}
{"_id"=>"West Point Grey", "count"=>651}
{"_id"=>"Marpole", "count"=>1367}

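Now I'm trying to build the same thing with MapReduce. In my map function I check whether a document is within the right distance (based on the answer to this question) and, if so, emit it to the reduce function, where those documents are counted. But something is wrong and I'm not getting the expected results: the count values are far too large. What am I doing wrong?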
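
As a reference point for the distance check described above, here is a minimal plain-Ruby sketch of the haversine formula. It is not the code from the question: the names `haversine_m` and `within_range?` are hypothetical, and it assumes GeoJSON order, where `coordinates[0]` is the longitude and `coordinates[1]` is the latitude. The range check mirrors `minDistance` and `maxDistance` from the `$geoNear` stage; treating both bounds as inclusive is an assumption.

```ruby
# Mean Earth radius in metres (same constant as in the map function).
EARTH_RADIUS_M = 6371 * 1000

# Hypothetical helper: great-circle distance in metres between two
# GeoJSON positions, each given as [longitude, latitude] in degrees.
def haversine_m(pos1, pos2)
  lon1, lat1 = pos1.map { |deg| deg * Math::PI / 180 }
  lon2, lat2 = pos2.map { |deg| deg * Math::PI / 180 }
  dlat = lat2 - lat1
  dlon = lon2 - lon1
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin(dlon / 2)**2
  c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
  EARTH_RADIUS_M * c
end

# Hypothetical helper: true when pos lies between min_m and max_m
# metres from center (bounds assumed inclusive).
def within_range?(pos, center, min_m: 10, max_m: 10_000)
  d = haversine_m(center, pos)
  d >= min_m && d <= max_m
end

center = [-123.0837633, 49.26980201] # "near" point from the aggregation
sample = [-123.067654, 49.26773386]  # LOC of the sample document

puts haversine_m(center, sample).round # distance in metres
puts within_range?(sample, center)
```

Comparing the argument order here with the map function, which reads `coordinates[0]` into its `lat` variables and `coordinates[1]` into its `lon` variables, is one way to check whether both compute the same distance.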

Map function

 map = "function() {" +
  "var rad_per_deg = Math.PI/180;" +
  "var rm = 6371 * 1000;" +
  "var dlat_rad = (this.LOC.coordinates[0] - (-123.0837633)) * rad_per_deg;" +
  "var dlon_rad = (this.LOC.coordinates[1] - (49.26980201)) * rad_per_deg;" +
  "var lat1_rad = -123.0837633 * rad_per_deg;" +
  "var lon1_rad = 49.26980201 * rad_per_deg;" +
  "var lat2_rad = this.LOC.coordinates[0] * rad_per_deg;" +
  "var lon2_rad = this.LOC.coordinates[1] * rad_per_deg;" +
  "var a = Math.pow(Math.sin(dlat_rad/2), 2) + Math.cos(lat1_rad) * Math.cos(lat2_rad) * Math.pow(Math.sin(dlon_rad/2), 2);" +
  "var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));" +
  "if( rm * c < 10000) { " + 
  " emit(this.NEIGHBOURHOOD, {count: 1});" +
  "}" +
  "};"

Reduce function

reduce = "function(key, values) { " +
  "var sum = 0; " +
  "values.forEach(function(f) { " +
  " sum += f.count; " +
  "}); " +
  "return {count: sum};" +
  "};"

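MongoDB may call the reduce function more than once for the same key, feeding earlier results back in as values, so the returned value has to have the same shape as the emitted ones. The following plain-Ruby sketch simulates that property for the reduce function above; `reduce_counts` is a hypothetical stand-in for the JavaScript function, not part of the Ruby driver.

```ruby
# Ruby stand-in for the JavaScript reduce function: sums the count
# fields and returns a value shaped like the emitted ones.
def reduce_counts(_key, values)
  { count: values.sum { |v| v[:count] } }
end

# Five emitted values for one key, as the map function would produce.
emitted = Array.new(5) { { count: 1 } }

# Reduce everything in a single call.
all_at_once = reduce_counts("Marpole", emitted)

# Reduce in two batches, then reduce the partial results again.
partial_a = reduce_counts("Marpole", emitted[0, 2])
partial_b = reduce_counts("Marpole", emitted[2, 3])
re_reduced = reduce_counts("Marpole", [partial_a, partial_b])

puts all_at_once == re_reduced
```

Both paths give the same total here, which suggests the reduce step itself is not what inflates the counts.
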
Query

 opts =  {
    query:{ "YEAR": 2004 },
    :out => "results", 
    :raw => true
  } 

Output

 crime.find().map_reduce(map, reduce, opts)

 {"_id"=>"", "value"=>{"count"=>2257.0}}
 {"_id"=>"Arbutus Ridge", "value"=>{"count"=>6066.0}}
 {"_id"=>"Central Business District", "value"=>{"count"=>110947.0}}

0 answers:

No answers yet