Question

I have this sample collection of items:

{
  "_id": "1",
  "field1": "value1",
  "field2": "value2",
  "category": "phones",
  "user": "1",
  "tags": [
    "tag1",
    "tag3"
  ]
},
{
  "_id": "2",
  "field1": "value1",
  "field2": "value2",
  "category": "phones",
  "user": "1",
  "tags": [
    "tag2",
    "tag3"
  ]
},
{
  "_id": "3",
  "field1": "value1",
  "field2": "value2",
  "category": "bikes",
  "user": "1",
  "tags": [
    "tag3",
    "tag4"
  ]
},
{
  "_id": "4",
  "field1": "value1",
  "field2": "value2",
  "category": "cars",
  "user": "2",
  "tags": [
    "tag1",
    "tag2"
  ]
}

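I would like to find the items created by a specific user (i.e. user: 1) and display them grouped by the category field. The desired result: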

{
  "phones": [
      {
        "_id": "1",
        "field1": "value1",
        "field2": "value2",
        "tags": [
          "tag1",
          "tag3"
         ]
      },
      {
        "_id": "2",
        "field1": "value1",
        "field2": "value2",
        "tags": [
          "tag2",
          "tag3"
         ]
      }
  ],
  "bikes" : [
      {
        "_id": "3",
        "field1": "value1",
        "field2": "value2",
        "tags": [
          "tag3",
          "tag4"
         ]
      }
  ]

}

Is it possible to get this result with the aggregation group functionality? Thank you.

Answer 1

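It is possible to group by category, but not in the way you have presented it. That is actually a good thing, because your "category" is really data, and you really should not represent "data" as a "key", either in your storage or in your output.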

So a transformation like this is what is really recommended:

db.collection.aggregate([
    { "$match": { "user": "1" } },
    { "$group": {
        "_id": "$category",
        "items": { 
            "$push": {
                "_id": "$_id",
                "field1": "$field1",
                "field2": "$field2",
                "tags": "$tags"
            }
        }
    }},
    { "$group": {
        "_id": null,
        "categories": { 
            "$push": {
                "_id": "$_id",
                "items": "$items"
            }
        }
    }}
])

And you get output like this:

{
    "_id" : null,
    "categories" : [
        {
            "_id" : "bikes",
            "items" : [
                {
                    "_id" : "3",
                    "field1" : "value1",
                    "field2" : "value2",
                    "tags" : [
                        "tag3",
                        "tag4"
                    ]
                }
            ]
        },
        {
            "_id" : "phones",
            "items" : [
                {
                    "_id" : "1",
                    "field1" : "value1",
                    "field2" : "value2",
                    "tags" : [
                        "tag1",
                        "tag3"
                    ]
                },
                {
                    "_id" : "2",
                    "field1" : "value1",
                    "field2" : "value2",
                    "tags" : [
                        "tag2",
                        "tag3"
                    ]
                }
            ]
        }
    ]
}

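It really is better to use generic key names that do not change as the data changes. This is actually what object-oriented patterns are all about.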

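If you really think you need "data as keys" here, then with the aggregation framework you either need to know in advance the "categories" you are expecting, or be prepared to generate the pipeline stages: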

db.collection.aggregate([
    { "$match": { "user": "1" } },
    { "$group": {
        "_id": null,
        "phones": {
            "$push": {
                "$cond": [
                    { "$eq": ["$category","phones"] },
                    {
                        "_id": "$_id",
                        "field1": "$field1",
                        "field2": "$field2",
                        "tags": "$tags"
                    },
                    false
                ]
            }
        },
        "bikes": {
            "$push": {
                "$cond": [
                    { "$eq": ["$category","bikes"] },
                    {
                        "_id": "$_id",
                        "field1": "$field1",
                        "field2": "$field2",
                        "tags": "$tags"
                    },
                    false
                ]
            }
        }           
    }},
    { "$unwind": "$phones" },
    { "$match": { "phones": { "$ne": false } }},
    { "$group": {
        "_id": "$_id",
        "phones": { "$push": "$phones" },
        "bikes": { "$first": "$bikes" }
    }},
    { "$unwind": "$bikes" },
    { "$match": { "bikes": { "$ne": false } }},
    { "$group": {
        "_id": "$_id",
        "phones": { "$first": "$phones" },
        "bikes": { "$push": "$bikes" }
    }},
    { "$project": {
        "_id": 0,
        "phones": 1,
        "bikes": 1
    }}
])

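You can shorten this a little with MongoDB 2.6, since you can simply filter out the false values with the $setDifference operator (a set operation, which is safe here because every pushed item carries a unique _id):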

db.collection.aggregate([
    { "$match": { "user": "1" } },
    { "$group": {
        "_id": null,
        "phones": {
            "$push": {
                "$cond": [
                    { "$eq": ["$category","phones"] },
                    {
                        "_id": "$_id",
                        "field1": "$field1",
                        "field2": "$field2",
                        "tags": "$tags"
                    },
                    false
                ]
            }
        },
        "bikes": {
            "$push": {
                "$cond": [
                    { "$eq": ["$category","bikes"] },
                    {
                        "_id": "$_id",
                        "field1": "$field1",
                        "field2": "$field2",
                        "tags": "$tags"
                    },
                    false
                ]
            }
        }           
    }},
    { "$project": {
        "_id": 0,
        "phones": { "$setDifference": ["$phones",[false]] },
        "bikes": { "$setDifference": ["$bikes",[false]] }
    }}
])
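
When the category names are not hard-coded, the same pipeline can be generated rather than written out by hand. The following is only a sketch in plain JavaScript: the helper name `buildCategoryPipeline` and its parameters are made up for illustration, and it assumes the field names used in the question.

```javascript
// Hypothetical helper (not part of MongoDB): builds the $match/$group/$project
// pipeline shown above for an arbitrary list of known category names.
function buildCategoryPipeline(user, categories) {
  const group = { _id: null };
  const project = { _id: 0 };

  for (const name of categories) {
    // Push the item when its category matches, otherwise push false.
    group[name] = {
      $push: {
        $cond: [
          { $eq: ["$category", name] },
          { _id: "$_id", field1: "$field1", field2: "$field2", tags: "$tags" },
          false
        ]
      }
    };
    // Strip the false placeholders again in the final projection.
    project[name] = { $setDifference: ["$" + name, [false]] };
  }

  return [{ $match: { user: user } }, { $group: group }, { $project: project }];
}

const pipeline = buildCategoryPipeline("1", ["phones", "bikes"]);
console.log(JSON.stringify(pipeline, null, 2));
```

In the shell the generated array would be passed as `db.collection.aggregate(pipeline)`. The list of names could come from something like `db.collection.distinct("category", { "user": "1" })`, with the caveat that this only works when every category value is also a valid field name.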

Both produce the output just as you want it:

{
    "phones" : [
        {
            "_id" : "1",
            "field1" : "value1",
            "field2" : "value2",
            "tags" : [
                "tag1",
                "tag3"
            ]
        },
        {
            "_id" : "2",
            "field1" : "value1",
            "field2" : "value2",
            "tags" : [
                "tag2",
                "tag3"
            ]
        }
    ],
    "bikes" : [
        {
            "_id" : "3",
            "field1" : "value1",
            "field2" : "value2",
            "tags" : [
                "tag3",
                "tag4"
            ]
        }
    ]
}

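The general case here is that the aggregation framework just won't allow field data to be used as a key, so you either need to group on the data only or specify the key names yourself.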

The only way you get "dynamic" key names is to use mapReduce instead:

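db.collection.mapReduce(
    // map: wrap each document in an object keyed by its own category
    function () {
      var obj = { };
      var category = this.category;
      // drop the fields that should not appear in the output
      delete this.user;
      delete this.category;

      obj[category] = [this];

      // a single null key sends every document to the same reduce
      emit(null,obj);
    },
    // reduce: merge the per-category arrays from all emitted values
    function (key,values) {

      var reduced = {};

      values.forEach(function(value) {
        Object.keys(value).forEach(function(key) {
          if ( !reduced.hasOwnProperty(key) )
            reduced[key] = [];
          value[key].forEach(function(item) {
            reduced[key].push(item);
          });
        });
      });

      return reduced;

    },
    {
        "query": { "user": "1" },
        "out": { "inline": 1 }
    }
)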

So now the key generation is dynamic, but the output comes back in the mapReduce form:

{
    "_id" : null,
    "value" : {
        "phones" : [
            {
                "_id" : "1",
                "field1" : "value1",
                "field2" : "value2",
                "tags" : [
                    "tag1",
                    "tag3"
                ]
            },
            {
                "_id" : "2",
                "field1" : "value1",
                "field2" : "value2",
                "tags" : [
                    "tag2",
                    "tag3"
                ]
            }
        ],
        "bikes" : [
            {
                "_id" : "3",
                "field1" : "value1",
                "field2" : "value2",
                "tags" : [
                    "tag3",
                    "tag4"
                ]
            }
        ]
    }
}
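
The reducer above returns the same shape that the mapper emits, which matters because mapReduce may call reduce more than once for a key and feed earlier results back in. The sketch below runs the same merge logic locally in plain JavaScript, with no server involved. The name `mergeByKey` and the sample `emitted` values are illustrative only; the values mimic what the mapper would emit for the three documents of user "1".

```javascript
// Same logic as the reduce function above, given a name for local testing.
function mergeByKey(key, values) {
  const reduced = {};
  values.forEach(function (value) {
    Object.keys(value).forEach(function (category) {
      if (!reduced.hasOwnProperty(category)) reduced[category] = [];
      value[category].forEach(function (item) {
        reduced[category].push(item);
      });
    });
  });
  return reduced;
}

// What the mapper would emit for the documents belonging to user "1".
const emitted = [
  { phones: [{ _id: "1", field1: "value1", field2: "value2", tags: ["tag1", "tag3"] }] },
  { phones: [{ _id: "2", field1: "value1", field2: "value2", tags: ["tag2", "tag3"] }] },
  { bikes: [{ _id: "3", field1: "value1", field2: "value2", tags: ["tag3", "tag4"] }] }
];

// Reducing everything at once ...
const oneStep = mergeByKey(null, emitted);

// ... gives the same result as reducing a partial result again later.
const partial = mergeByKey(null, emitted.slice(0, 2));
const twoSteps = mergeByKey(null, [partial, emitted[2]]);

console.log(JSON.stringify(oneStep) === JSON.stringify(twoSteps)); // prints true
```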

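So the output is constrained by the way mapReduce directs output, and evaluating JavaScript here is going to be slower than the native operations of the aggregation framework. You get more power to manipulate the data, but that is the trade-off.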

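In conclusion, if you stick with the pattern, then the first approach with the aggregation framework is the fastest and best way to do this, and you can always restructure the results returned from the server. If you insist on breaking the pattern and need dynamic keys from the server, then mapReduce will do this where the aggregation framework is otherwise considered impractical.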
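
As a sketch of that client-side restructuring, the generic result of the first pipeline can be turned into category keys with a few lines of plain JavaScript. The helper name `keyByCategory` is hypothetical, and the sample `result` simply repeats the output shown for the first pipeline.

```javascript
// Hypothetical helper: converts { categories: [{ _id, items }] } into an
// object keyed by category name.
function keyByCategory(result) {
  const keyed = {};
  for (const category of result.categories) {
    keyed[category._id] = category.items;
  }
  return keyed;
}

// Sample document in the shape returned by the first aggregation pipeline.
const result = {
  _id: null,
  categories: [
    {
      _id: "bikes",
      items: [
        { _id: "3", field1: "value1", field2: "value2", tags: ["tag3", "tag4"] }
      ]
    },
    {
      _id: "phones",
      items: [
        { _id: "1", field1: "value1", field2: "value2", tags: ["tag1", "tag3"] },
        { _id: "2", field1: "value1", field2: "value2", tags: ["tag2", "tag3"] }
      ]
    }
  ]
};

const keyed = keyByCategory(result);
console.log(JSON.stringify(keyed, null, 2));
```

This keeps the stored data and the server response on generic key names, and moves the "data as keys" shape to the one place that wants it.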
