How can I compute ratios of an additive attribute with MongoDB?

Time: 2012-10-30 06:11:41

Tags: mongodb mapreduce aggregation-framework

Using the sample MongoDB aggregation collection (http://media.mongodb.org/zips.json), I would like to output each city's share of the total population of California.

In SQL, it might look like this:

-- The inner query sums per city; the window function divides by the statewide total.
SELECT city, population * 1.0 / SUM(population) OVER () AS poppct
FROM (
    SELECT city, SUM(population) AS population
    FROM zipcodes
    WHERE state = 'CA'
    GROUP BY city
) agg;

This can be done with MongoDB map/reduce:

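db.runCommand({
   mapreduce : "zipcodes"
   , out : { inline : 1}
   , query : {state: "CA"}
   , map : function() {
       emit(this.city, this.pop);
       // Side effect: accumulate the statewide total in the shared scope object.
       cache.totalpop = cache.totalpop || 0;
       cache.totalpop += this.pop;
     }
   , reduce : function(key, values) {
       // Sum the populations emitted for one city, skipping non-positive or non-numeric values.
       var pop = 0;
       values.forEach(function(value) {
          if (value && typeof value == 'number' && value > 0) pop += value;
       });
       return pop;
     }
   , finalize: function(key, reduced) {
       // Divide the city total by the statewide total gathered during the map phase.
       return reduced/cache.totalpop;
     }
   // `cache` acts as a global shared by map and finalize.
   , scope: { cache: { } }
});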

Can the same result be achieved with the new aggregation framework (v2.2)? It would seem to require some form of global scope, as in the map/reduce case.

Thanks.

2 answers:

Answer 0: (score: 0)

Is this what you are after?

db.zipcodes.remove({}); // empty the collection first; an explicit query also works in newer shells
db.zipcodes.insert([
    { city:"birmingham", population:1500000, state:"AL" },
    { city:"London", population:10000, state:"ON" },
    { city:"New York", population:1000, state:"NY" },
    { city:"Denver", population:100, state:"CO" },
    { city:"Los Angeles", population:1000000, state:"CA" },
    { city:"San Francisco", population:2000000, state:"CA" },
]);

db.zipcodes.runCommand("aggregate", { pipeline: [
    { $match: { state: "CA" } }, // WHERE state='CA'
    { $group: {
        _id: "$city",            // GROUP BY city
        population: { $sum: "$population" }, // SUM(population) as population
    }},
]});

which produces

{
    "result" : [
            {
                "_id" : "San Francisco",
                "population" : 2000000
            },
            {
                "_id" : "Los Angeles",
                "population" : 1000000
            }
    ],
    "ok" : 1
}
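
The pipeline above stops at per-city totals and does not yet compute the ratio. One possible way to finish it, as a rough sketch: add a second `$group` that gathers every city into a single document together with the grand total, then `$unwind` and divide. The field names `total`, `cities`, and `poppct` are made up for illustration, the sketch assumes the `population` field used in this answer's sample data, and it has not been checked against v2.2 specifically.

```javascript
// Hypothetical pipeline; `total`, `cities` and `poppct` are illustrative names.
const pipeline = [
    { $match: { state: "CA" } },                      // WHERE state='CA'
    { $group: {                                       // GROUP BY city
        _id: "$city",
        population: { $sum: "$population" }
    }},
    { $group: {                                       // one document for the whole state
        _id: null,
        total: { $sum: "$population" },               // statewide total
        cities: { $push: { city: "$_id", population: "$population" } }
    }},
    { $unwind: "$cities" },                           // back to one document per city
    { $project: {
        _id: 0,
        city: "$cities.city",
        poppct: { $divide: ["$cities.population", "$total"] }
    }}
];

// In the mongo shell this would be run as: db.zipcodes.aggregate(pipeline)
```

The second `$group` plays the role of the global `cache` in the map/reduce version. Because it collects all cities into one document, the approach is only reasonable while that document stays within the BSON document size limit.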

Answer 1: (score: 0)

You could try:

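db.zipcodes.group({
    key: { state: 1 },
    initial: { total: 0, city: [] },
    // Runs once per document: add to the state total and remember the entry.
    reduce: function(curr, result) {
        result.total += curr.pop;
        result.city.push({ _id: curr.city, pop: curr.pop });
    },
    // Runs once per state: turn each stored population into a share of the total.
    finalize: function(result) {
        for (var idx = 0; idx < result.city.length; idx++) {
            result.city[idx].ratio = result.city[idx].pop / result.total;
        }
    }
});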
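
Note that the `group` call above pushes one entry per document, so a city that spans several zip codes shows up several times. The logic the question asks for (sum per city, then divide by the state total) can be sketched in plain JavaScript without a database. The helper name `populationShares` and the sample figures below are invented for illustration and are not taken from zips.json.

```javascript
// Plain JavaScript sketch; `populationShares` is a hypothetical helper.
function populationShares(docs, state) {
    const perCity = new Map();   // city name -> summed population
    let total = 0;               // population of the whole state
    for (const doc of docs) {
        if (doc.state !== state) continue;
        perCity.set(doc.city, (perCity.get(doc.city) || 0) + doc.pop);
        total += doc.pop;
    }
    // One entry per city, in first-seen order, with its share of the total.
    return Array.from(perCity, ([city, pop]) => ({ city, pop, ratio: pop / total }));
}

// Made-up sample documents shaped like zips.json entries.
const sample = [
    { city: "LOS ANGELES", pop: 300, state: "CA" },
    { city: "LOS ANGELES", pop: 100, state: "CA" },
    { city: "SAN FRANCISCO", pop: 400, state: "CA" },
    { city: "DENVER", pop: 200, state: "CO" }
];

console.log(populationShares(sample, "CA"));
// LOS ANGELES: pop 400, ratio 0.5; SAN FRANCISCO: pop 400, ratio 0.5
```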