我的收藏和文件的格式如下:
[
{
"brand" : "Toshiba",
"title" : "Toshiba Pors 7220CT / NW2",
"category" : "notebooks",
"code" : "ABCDTESTASD12",
"pid" : "45790"
},
{
"brand" : "Toshiba",
"title" : "Toshiba Satellite Pro 4600 PIII800",
"category" : "notebooks",
"ean" : "PATDSRESSSN12",
"pid" : "12345"
}
]
您能否建议我查询具有相同品牌,标题,类别和代码的独特文档,以便我可以在集合中看到独特的文档。
答案 0 :(得分:1)
您可以使用聚合框架中的group运算符:
db.computers.aggregate(
[
{
$group : {
_id : { brand: "$brand", title: "$title", category: "$category", code: "$code" },
count: { $sum: 1 }
}
}
]
)
答案 1 :(得分:0)
由于汇总管道阶段具有 maximum memory use limit ,因此请使用以下管道,通过将allowDiskUse
选项设置为true来处理大型数据集,从而可以将数据写入临时文件。
在管道中,使用$match
阶段过滤出欺骗行为,以便您只保留_id
可以查询的唯一文档:
var pipeline = [
{
"$group": { /* Group by fields to match on brand, title, category and code */
"_id": {
"brand": "$brand",
"title": "$title",
"category": "$category",
"code": "$code"
},
"count": { "$sum": 1 }, /* Count number of matching docs for the group */
"docs": { "$push": "$_id" }, /* Save the _id for matching docs */
"pids": { "$push": "$pid" } /* Save the matching pids to list */
}
},
{ "$match": { "count": 1 } }, /* filter out dupes */
{ "$out": "result" } /* Output aggregation results to another collection */
],
options = { "allowDiskUse": true, cursor: {} };
db.products.aggregate(pipeline, options); // Run the aggregation operation
db.result.find(); // Get the unique documents