I have this sample collection of items:
    {
        "_id": "1",
        "field1": "value1",
        "field2": "value2",
        "category": "phones",
        "user": "1",
        "tags": [
            "tag1",
            "tag3"
        ]
    },
    {
        "_id": "2",
        "field1": "value1",
        "field2": "value2",
        "category": "phones",
        "user": "1",
        "tags": [
            "tag2",
            "tag3"
        ]
    },
    {
        "_id": "3",
        "field1": "value1",
        "field2": "value2",
        "category": "bikes",
        "user": "1",
        "tags": [
            "tag3",
            "tag4"
        ]
    },
    {
        "_id": "4",
        "field1": "value1",
        "field2": "value2",
        "category": "cars",
        "user": "2",
        "tags": [
            "tag1",
            "tag2"
        ]
    }
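I want to find the items created by a specific user (i.e. user: 1) and display them grouped by the category field. The desired result: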
    {
        "phones": [
            {
                "_id": "1",
                "field1": "value1",
                "field2": "value2",
                "tags": [
                    "tag1",
                    "tag3"
                ]
            },
            {
                "_id": "2",
                "field1": "value1",
                "field2": "value2",
                "tags": [
                    "tag2",
                    "tag3"
                ]
            }
        ],
        "bikes" : [
            {
                "_id": "3",
                "field1": "value1",
                "field2": "value2",
                "tags": [
                    "tag3",
                    "tag4"
                ]
            }
        ]
    }
Is it possible to get this shape with the aggregation framework's $group operator? Thank you.
Answer 0 (score: 1)
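It is possible to group by category, but not in the shape you are presenting. That is actually a good thing, because your "category" is data, and you really should not represent data as a key, either in your storage or in your output.
So the recommended transformation looks like this: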
    db.collection.aggregate([
        { "$match": { "user": "1" } },
        { "$group": {
            "_id": "$category",
            "items": {
                "$push": {
                    "_id": "$_id",
                    "field1": "$field1",
                    "field2": "$field2",
                    "tags": "$tags"
                }
            }
        }},
        { "$group": {
            "_id": null,
            "categories": {
                "$push": {
                    "_id": "$_id",
                    "items": "$items"
                }
            }
        }}
    ])
You get output like this:
    {
        "_id" : null,
        "categories" : [
            {
                "_id" : "bikes",
                "items" : [
                    {
                        "_id" : "3",
                        "field1" : "value1",
                        "field2" : "value2",
                        "tags" : [
                            "tag3",
                            "tag4"
                        ]
                    }
                ]
            },
            {
                "_id" : "phones",
                "items" : [
                    {
                        "_id" : "1",
                        "field1" : "value1",
                        "field2" : "value2",
                        "tags" : [
                            "tag1",
                            "tag3"
                        ]
                    },
                    {
                        "_id" : "2",
                        "field1" : "value1",
                        "field2" : "value2",
                        "tags" : [
                            "tag2",
                            "tag3"
                        ]
                    }
                ]
            }
        ]
    }
It really is better to use generic key names that do not change as the data changes. This is actually the object-oriented pattern.
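If application code still prefers category names as keys, the generic shape is easy to convert once the result has come back from the server. The following is a minimal sketch in plain JavaScript, not part of the original answer: `keyByCategory` is a hypothetical helper name, and `result` is an abbreviated, hand-written document in the same shape as the grouped output.

```javascript
// Abbreviated result document in the "categories" / "items" shape.
const result = {
  _id: null,
  categories: [
    { _id: "bikes", items: [{ _id: "3", tags: ["tag3", "tag4"] }] },
    {
      _id: "phones",
      items: [
        { _id: "1", tags: ["tag1", "tag3"] },
        { _id: "2", tags: ["tag2", "tag3"] }
      ]
    }
  ]
};

// Hypothetical helper: turn the generic "categories" array into an
// object keyed by category name, entirely on the client.
function keyByCategory(doc) {
  const keyed = {};
  doc.categories.forEach(function (category) {
    keyed[category._id] = category.items;
  });
  return keyed;
}

const keyed = keyByCategory(result);
```

Here `keyed.phones` holds the two phone items and `keyed.bikes` the single bike item, while the stored and transmitted data keep their stable key names.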
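If you really think you need "data as keys" here, then with the aggregation framework you either need to know the categories you expect in advance or be prepared to generate the pipeline stages: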
    db.collection.aggregate([
        { "$match": { "user": "1" } },
        { "$group": {
            "_id": null,
            "phones": {
                "$push": {
                    "$cond": [
                        { "$eq": ["$category","phones"] },
                        {
                            "_id": "$_id",
                            "field1": "$field1",
                            "field2": "$field2",
                            "tags": "$tags"
                        },
                        false
                    ]
                }
            },
            "bikes": {
                "$push": {
                    "$cond": [
                        { "$eq": ["$category","bikes"] },
                        {
                            "_id": "$_id",
                            "field1": "$field1",
                            "field2": "$field2",
                            "tags": "$tags"
                        },
                        false
                    ]
                }
            }
        }},
        { "$unwind": "$phones" },
        { "$match": { "phones": { "$ne": false } }},
        { "$group": {
            "_id": "$_id",
            "phones": { "$push": "$phones" },
            "bikes": { "$first": "$bikes" }
        }},
        { "$unwind": "$bikes" },
        { "$match": { "bikes": { "$ne": false } }},
        { "$group": {
            "_id": "$_id",
            "phones": { "$first": "$phones" },
            "bikes": { "$push": "$bikes" }
        }},
        { "$project": {
            "_id": 0,
            "phones": 1,
            "bikes": 1
        }}
    ])
You can shorten that a bit with MongoDB 2.6, since you can simply filter out the false values with the $setDifference operator:
    db.collection.aggregate([
        { "$match": { "user": "1" } },
        { "$group": {
            "_id": null,
            "phones": {
                "$push": {
                    "$cond": [
                        { "$eq": ["$category","phones"] },
                        {
                            "_id": "$_id",
                            "field1": "$field1",
                            "field2": "$field2",
                            "tags": "$tags"
                        },
                        false
                    ]
                }
            },
            "bikes": {
                "$push": {
                    "$cond": [
                        { "$eq": ["$category","bikes"] },
                        {
                            "_id": "$_id",
                            "field1": "$field1",
                            "field2": "$field2",
                            "tags": "$tags"
                        },
                        false
                    ]
                }
            }
        }},
        { "$project": {
            "_id": 0,
            "phones": { "$setDifference": ["$phones",[false]] },
            "bikes": { "$setDifference": ["$bikes",[false]] }
        }}
    ])
Both produce the output just the way you want it:
    {
        "phones" : [
            {
                "_id" : "1",
                "field1" : "value1",
                "field2" : "value2",
                "tags" : [
                    "tag1",
                    "tag3"
                ]
            },
            {
                "_id" : "2",
                "field1" : "value1",
                "field2" : "value2",
                "tags" : [
                    "tag2",
                    "tag3"
                ]
            }
        ],
        "bikes" : [
            {
                "_id" : "3",
                "field1" : "value1",
                "field2" : "value2",
                "tags" : [
                    "tag3",
                    "tag4"
                ]
            }
        ]
    }
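Since the key names have to be spelled out in the pipeline, "generating the pipeline stages" can be done in code from a list of category names. The following is a rough sketch of that idea for the $setDifference variant, not something from the original answer: `buildCategoryPipeline` is a hypothetical helper name, and it assumes every category name is a valid field name (no dots, no leading `$`).

```javascript
// Hypothetical helper (not part of MongoDB): builds the $setDifference
// variant of the pipeline for a known list of categories.
function buildCategoryPipeline(user, categories) {
  const group = { _id: null };
  const project = { _id: 0 };

  categories.forEach(function (category) {
    // Push the item when its category matches, otherwise push `false`.
    group[category] = {
      $push: {
        $cond: [
          { $eq: ["$category", category] },
          { _id: "$_id", field1: "$field1", field2: "$field2", tags: "$tags" },
          false
        ]
      }
    };
    // Strip the `false` placeholders (needs MongoDB 2.6 or later).
    project[category] = { $setDifference: ["$" + category, [false]] };
  });

  return [
    { $match: { user: user } },
    { $group: group },
    { $project: project }
  ];
}

const pipeline = buildCategoryPipeline("1", ["phones", "bikes"]);
```

The resulting array can be passed straight to `db.collection.aggregate(pipeline)`. One way to obtain the category list would be a prior `db.collection.distinct("category", { "user": "1" })` call, at the cost of an extra round trip.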
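The general case here is that the aggregation framework (as of the MongoDB 2.6 release discussed here) simply won't allow field data to be used as a key, so you need to either group on the data or specify the key names yourself.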
The only way to get "dynamic" key names is to use mapReduce instead:
    db.collection.mapReduce(
        // map: emit one object per document, keyed by its category
        function () {
            var obj = { };
            var category = this.category;
            delete this.user;
            delete this.category;
            obj[category] = [this];
            emit(null,obj);
        },
        // reduce: merge the per-category arrays into a single object
        function (key,values) {
            var reduced = {};
            values.forEach(function(value) {
                Object.keys(value).forEach(function(key) {
                    if ( !reduced.hasOwnProperty(key) )
                        reduced[key] = [];
                    value[key].forEach(function(item) {
                        reduced[key].push(item);
                    });
                });
            });
            return reduced;
        },
        {
            "query": { "user": "1" },
            "out": { "inline": 1 }
        }
    )
So now the key generation is dynamic, but the output comes back in a very mapReduce way:
    {
        "_id" : null,
        "value" : {
            "phones" : [
                {
                    "_id" : "1",
                    "field1" : "value1",
                    "field2" : "value2",
                    "tags" : [
                        "tag1",
                        "tag3"
                    ]
                },
                {
                    "_id" : "2",
                    "field1" : "value1",
                    "field2" : "value2",
                    "tags" : [
                        "tag2",
                        "tag3"
                    ]
                }
            ],
            "bikes" : [
                {
                    "_id" : "3",
                    "field1" : "value1",
                    "field2" : "value2",
                    "tags" : [
                        "tag3",
                        "tag4"
                    ]
                }
            ]
        }
    }
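So the output is constrained by how mapReduce dictates the output, and evaluating JavaScript here will be slower than the native operations of the aggregation framework. You get more power to manipulate the data, but that is the trade-off.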
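Because the map and reduce steps are plain JavaScript, their logic can be exercised without a server. The sketch below is an illustration rather than part of the original answer: `mapDoc` and `reduceValues` are hypothetical local names, and `mapDoc` returns its value instead of calling `emit()`. It also checks one property worth keeping in mind: MongoDB may call reduce more than once for the same key and feed earlier output back in, so the reducer must return the same shape that the map step emits.

```javascript
// Sample documents for user "1", copied from the question.
const docs = [
  { _id: "1", field1: "value1", field2: "value2", category: "phones", user: "1", tags: ["tag1", "tag3"] },
  { _id: "2", field1: "value1", field2: "value2", category: "phones", user: "1", tags: ["tag2", "tag3"] },
  { _id: "3", field1: "value1", field2: "value2", category: "bikes", user: "1", tags: ["tag3", "tag4"] }
];

// Same logic as the map function, but returning the value so that it
// can run outside the server.
function mapDoc(doc) {
  const item = Object.assign({}, doc); // copy so the input is not mutated
  const category = item.category;
  delete item.user;
  delete item.category;
  const obj = {};
  obj[category] = [item];
  return obj;
}

// Same logic as the reduce function: merge the per-category arrays.
function reduceValues(key, values) {
  const reduced = {};
  values.forEach(function (value) {
    Object.keys(value).forEach(function (k) {
      if (!reduced.hasOwnProperty(k)) reduced[k] = [];
      value[k].forEach(function (item) {
        reduced[k].push(item);
      });
    });
  });
  return reduced;
}

const mapped = docs.map(mapDoc);

// Reduce everything in one pass.
const all = reduceValues(null, mapped);

// Re-reduce: fold a partial result together with the remaining value.
const partial = reduceValues(null, mapped.slice(0, 2));
const rereduced = reduceValues(null, [partial, mapped[2]]);
```

With this sample data, `all` and `rereduced` contain the same items under the same keys, which is the behavior the reducer needs when the server reduces in several passes.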
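To conclude: if you stick with the pattern, the first approach with the aggregation framework is the fastest and best way to do this, and you can always reshape the results returned from the server. If you insist on breaking the pattern and need dynamic keys from the server, mapReduce will do it where the aggregation framework is impractical.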