Mongodb组通过多个键来获取集合中的唯一文档

时间:2015-11-27 10:05:07

标签: mongodb mongodb-query

我的收藏和文件的格式如下:

[
 {

    "brand" : "Toshiba",
    "title" : "Toshiba Pors 7220CT / NW2",
    "category" : "notebooks",
    "code" : "ABCDTESTASD12",
    "pid" : "45790"
 }, 
 {
    "brand" : "Toshiba",
    "title" : "Toshiba Satellite Pro 4600 PIII800",
    "category" : "notebooks",
    "ean" : "PATDSRESSSN12",
    "pid" : "12345"
 }
]

您能否建议我查询具有相同品牌,标题,类别和代码的独特文档,以便我可以在集合中看到独特的文档。

2 个答案:

答案 0 :(得分:1)

您可以使用聚合框架中的group运算符:

db.computers.aggregate(
    [
        {
            $group : {
                _id : { brand: "$brand", title: "$title", category: "$category", code: "$code" },
                count: { $sum: 1 }
            }
        }
    ]
)

答案 1 :(得分:0)

由于汇总管道阶段具有 maximum memory use limit ,因此请使用以下管道,通过将allowDiskUse选项设置为true来处理大型数据集,从而可以将数据写入临时文件。 在管道中,使用$match阶段过滤出欺骗行为,以便您只保留_id可以查询的唯一文档:

var pipeline = [
    {
        "$group": { /* Group by fields to match on brand, title, category and code */
            "_id": { 
                "brand": "$brand", 
                "title": "$title", 
                "category": "$category", 
                "code": "$code" 
            },
            "count": { "$sum": 1 }, /* Count number of matching docs for the group */
            "docs": { "$push": "$_id" }, /* Save the _id for matching docs */
            "pids": { "$push": "$pid" } /* Save the matching pids to list */
        }
    },
    { "$match": { "count": 1 } }, /* filter out dupes */
    { "$out": "result" } /* Output aggregation results to another collection */
],
options = { "allowDiskUse": true, cursor: {} };

db.products.aggregate(pipeline, options); // Run the aggregation operation

db.result.find(); // Get the unique documents