Is it possible to group and sum multiple columns with MongoDB's aggregation framework?

Asked: 2013-05-19 10:27:53

Tags: mongodb aggregation-framework

Given this MongoDB collection:

[
  { character: 'broquaint', race: 'Halfling', class: 'Hunter' },
  { character: 'broquaint', race: 'Halfling', class: 'Hunter' },
  { character: 'broquaint', race: 'Halfling', class: 'Rogue' },
  { character: 'broquaint', race: 'Naga',     class: 'Fighter' },
  { character: 'broquaint', race: 'Naga',     class: 'Hunter' }
]

I would like to get a count for each race and each class, i.e.

{
  race:  { 'Halfling': 3, 'Naga': 2 },
  class: { 'Hunter': 3, 'Rogue': 1, 'Fighter': 1 }
}

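I have been trying to use the aggregation framework (to replace an existing map/reduce), but have only got as far as counts for the combinations, i.e.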

{ '_id': { race: 'Halfling', class: 'Hunter' },  count: 2 }
{ '_id': { race: 'Halfling', class: 'Rogue' },   count: 1 }
{ '_id': { race: 'Naga',     class: 'Fighter' }, count: 1 }
{ '_id': { race: 'Naga',     class: 'Hunter' },  count: 1 }

This is simple enough to reduce programmatically to the desired result, but I was hoping to be able to leave that to MongoDB.
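For comparison, the programmatic reduction mentioned above could look roughly like the sketch below. It is plain JavaScript run on the client rather than anything from the asker's code; the helper name `tallyByField` is made up for illustration, and the input rows are copied from the combined counts shown above.

```javascript
// Rows shaped like the combined counts above (copied from the question).
const combos = [
  { _id: { race: 'Halfling', class: 'Hunter' },  count: 2 },
  { _id: { race: 'Halfling', class: 'Rogue' },   count: 1 },
  { _id: { race: 'Naga',     class: 'Fighter' }, count: 1 },
  { _id: { race: 'Naga',     class: 'Hunter' },  count: 1 },
];

// Hypothetical helper: for every key found in `_id`, sum `count` per value.
function tallyByField(rows) {
  const totals = {};
  for (const { _id, count } of rows) {
    for (const [field, value] of Object.entries(_id)) {
      totals[field] = totals[field] || {};
      totals[field][value] = (totals[field][value] || 0) + count;
    }
  }
  return totals;
}

const totals = tallyByField(combos);
console.log(JSON.stringify(totals));
// {"race":{"Halfling":3,"Naga":2},"class":{"Hunter":3,"Rogue":1,"Fighter":1}}
```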

For reference, here is my code so far:

db.games.aggregate(
  { '$match': { character: 'broquaint' } },
  {
    '$group': {
      _id:   { race: '$race', class: '$class' },
      count: { '$sum': 1 }
    }
  }
)

So the question is: given the example collection, can I arrive at my desired output purely through MongoDB's aggregation framework?

Thanks in advance for any help!

2 answers:

Answer 0: (score: 2)

Yes, you can do this with the aggregation framework. It won't be pretty, but it will still be much faster than doing it with mapreduce...

Here it is in a nutshell (the output is not in the same format as the one you gave, but the content is the same):

> group1 = {
    "$group" : {
        "_id" : "$race",
        "class" : {
            "$push" : "$class"
        },
        "count" : {
            "$sum" : 1
        }
    }
};
> unwind = { "$unwind" : "$class" };
> group2 = {
    "$group" : {
        "_id" : "$class",
        "classCount" : {
            "$sum" : 1
        },
        "races" : {
            "$push" : {
                "race" : "$_id",
                "raceCount" : "$count"
            }
        }
    }
};
> unwind2 = { "$unwind" : "$races" };
> group3 = {
    "$group" : {
        "_id" : 1,
        "classes" : {
            "$addToSet" : {
                "class" : "$_id",
                "classCount" : "$classCount"
            }
        },
        "races" : {
            "$addToSet" : "$races"
        }
    }
};
> db.races.aggregate(group1, unwind, group2, unwind2, group3);
{
    "result" : [
        {
            "_id" : 1,
            "classes" : [
                {
                    "class" : "Fighter",
                    "classCount" : 1
                },
                {
                    "class" : "Hunter",
                    "classCount" : 3
                },
                {
                    "class" : "Rogue",
                    "classCount" : 1
                }
            ],
            "races" : [
                {
                    "race" : "Naga",
                    "raceCount" : 2
                },
                {
                    "race" : "Halfling",
                    "raceCount" : 3
                }
            ]
        }
    ],
    "ok" : 1
}
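If the exact shape from the question is needed, the single result document above can be reshaped on the client. This is only a sketch in plain JavaScript, not part of the pipeline; `toCountMap` is a made-up helper name, and `doc` is copied from the shell output above.

```javascript
// One result document, copied from the shell output above.
const doc = {
  _id: 1,
  classes: [
    { class: 'Fighter', classCount: 1 },
    { class: 'Hunter', classCount: 3 },
    { class: 'Rogue', classCount: 1 },
  ],
  races: [
    { race: 'Naga', raceCount: 2 },
    { race: 'Halfling', raceCount: 3 },
  ],
};

// Hypothetical helper: turn [{ <nameKey>: n, <countKey>: c }, ...] into { n: c, ... }.
function toCountMap(items, nameKey, countKey) {
  return Object.fromEntries(items.map((item) => [item[nameKey], item[countKey]]));
}

const reshaped = {
  race: toCountMap(doc.races, 'race', 'raceCount'),
  class: toCountMap(doc.classes, 'class', 'classCount'),
};
console.log(JSON.stringify(reshaped));
```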

Answer 1: (score: 2)

As of MongoDB 3.4, this can be achieved a bit more simply by running multiple aggregation pipelines with $facet.

Taken from the docs:

  

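$facet

Processes multiple aggregation pipelines within a single stage on the same set of input documents. Each sub-pipeline has its own field in the output document where its results are stored as an array of documents.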

So, for your use case, this could be achieved as follows:

const aggregatorOpts = [
    { $match: { character: 'broquaint' } }, // Match the character
    {
        // Separate into 2 or more pipes that will count class and
        // race separately
        $facet: {
            race: [
                // Group by race and get the count:
                // [
                //   {
                //     _id: 'Halfling',
                //     count: 3
                //   }
                //   {
                //     _id: 'Naga',
                //     count: 2
                //   }
                // ]

                // $sortByCount is the same as
                // { $group: { _id: <expression>, count: { $sum: 1 } } },
                // { $sort: { count: -1 } }

                { $sortByCount: '$race' },

                // Now we want to transform the array in to 1 document,
                // where the '_id' field is the key, and the 'count' is the value.
                // To achieve this we will use $arrayToObject. According to the
                // docs, we have to first rename the fields to 'k' for the key,
                // and 'v' for the value. We use $project for this:
                {
                    $project: {
                        _id: 0,
                        k: '$_id',
                        v: '$count',
                    },
                },
            ],
            // Same as above but for class instead
            class: [
                { $sortByCount: '$class' },
                {
                    $project: {
                        _id: 0,
                        k: '$_id',
                        v: '$count',
                    },
                },
            ],
        },
    },
    {
        // Now apply the $arrayToObject for both class and race.
        $addFields: {
            // Will override the existing class and race arrays
            // with their respective object representation instead.
            class: { $arrayToObject: '$class' },
            race: { $arrayToObject: '$race' },
        },
    },
];

db.races.aggregate(aggregatorOpts)

This produces the following:

[
  {
    "race": {
      "Halfling": 3,
      "Naga": 2
    },
    "class": {
      "Hunter": 3,
      "Rogue": 1,
      "Fighter": 1
    }
  }
]

If you are happy with the output format provided by @Asya, you can remove the $project and $addFields stages and leave just the $sortByCount part in each sub-pipeline.
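A minimal sketch of that shorter variant, assuming the same `races` collection and field names as above (`simplerOpts` is just an illustrative variable name):

```javascript
// Same $match as before, but each sub-pipeline stops after $sortByCount,
// so every facet stays an array of { _id: <value>, count: <n> } documents.
const simplerOpts = [
  { $match: { character: 'broquaint' } },
  {
    $facet: {
      race: [{ $sortByCount: '$race' }],
      class: [{ $sortByCount: '$class' }],
    },
  },
];

// In the mongo shell this would be run as:
// db.races.aggregate(simplerOpts)
```

Each facet then stays a list of `_id`/`count` pairs rather than a single object keyed by value, which is close to the arrays in the other answer.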

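With these new features, the aggregation is much easier to extend with additional counts: just add another aggregation pipeline to $facet. Counting subgroups is even a bit easier, but that would be a separate question.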
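To illustrate the "just add another pipeline" point, here is a rough sketch that builds the same stages from a list of field names. The helpers `countBy` and `buildCountPipeline` are made up for this example, and `alignment` is a hypothetical third field that does not exist in the sample collection. As far as I know, $arrayToObject expects every `k` to be a string, so this assumes each counted field holds a string value in every matched document.

```javascript
// Hypothetical helper: the count sub-pipeline from above, for one field.
function countBy(field) {
  return [
    { $sortByCount: `$${field}` },
    { $project: { _id: 0, k: '$_id', v: '$count' } },
  ];
}

// Hypothetical helper: assemble $match, $facet and $addFields for any fields.
function buildCountPipeline(match, fields) {
  const facets = {};
  const objects = {};
  for (const field of fields) {
    facets[field] = countBy(field);
    objects[field] = { $arrayToObject: `$${field}` };
  }
  return [{ $match: match }, { $facet: facets }, { $addFields: objects }];
}

// 'alignment' is an assumed extra field, used only to show the extension.
const pipeline = buildCountPipeline(
  { character: 'broquaint' },
  ['race', 'class', 'alignment']
);

// In the mongo shell this would be run as:
// db.races.aggregate(pipeline)
```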