Question

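I want to filter by subdocument, but what I actually get is the parent document repeated once for each matching subdocument. What I want instead is one document with a list of its matching subdocuments.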

My data looks like this:

{
    "_id" : ObjectId("582eeb5f75f58055246bd22d"),
    "filename" : "file1",
    "cod" : NumberLong(90),
    "subdocs" : [
        {
            "length" : NumberLong(10),
            "desc" : "000"
        },
        {
            "length" : NumberLong(15),
            "desc" : "011"
        },
        {
            "length" : NumberLong(30),
            "desc" : "038"
        }
    ]
}
{
    "_id" : ObjectId("582eeb5f75f58055246bd22e"),
    "filename" : "file2",
    "cod" : NumberLong(95),
    "subdocs" : [
        {
            "length" : NumberLong(11),
            "desc" : "000"
        },
        {
            "length" : NumberLong(21),
            "desc" : "018"
        },
        {
            "length" : NumberLong(41),
            "desc" : "008"
        }
    ]
}

I am filtering on desc (000, 011) inside subdocs with this query:

db.ftmp.aggregate( 
    { $match: 
        { "subdocs.desc": 
            { $in: ["000", "011"] } 
        }
    }, 
    { $unwind : "$subdocs" }, 
    { $match : 
        { "subdocs.desc" : 
            { $in:["000", "011"] } 
        }
    }
)

But the result contains 3 documents, one for each subdocument that matches the query.

{
    "_id" : ObjectId("582eeb5f75f58055246bd22d"),
    "filename" : "file1",
    "cod" : NumberLong(90),
    "subdocs" : {
        "length" : NumberLong(10),
        "desc" : "000"
    }
}
{
    "_id" : ObjectId("582eeb5f75f58055246bd22d"),
    "filename" : "file1",
    "cod" : NumberLong(90),
    "subdocs" : {
        "length" : NumberLong(15),
        "desc" : "011"
    }
}
{
    "_id" : ObjectId("582eeb5f75f58055246bd22e"),
    "filename" : "file2",
    "cod" : NumberLong(95),
    "subdocs" : {
        "length" : NumberLong(11),
        "desc" : "000"
    }
}

What I want to get is file1 with the subdocuments whose desc is 000 and 011, and file2 with the subdocument whose desc is 000:

{
    "_id" : ObjectId("582eeb5f75f58055246bd22d"),
    "filename" : "file1",
    "cod" : NumberLong(90),
    "subdocs" : [
        {
            "length" : NumberLong(10),
            "desc" : "000"
        },
        {
            "length" : NumberLong(15),
            "desc" : "011"
        }
    ]
}
{
    "_id" : ObjectId("582eeb5f75f58055246bd22e"),
    "filename" : "file2",
    "cod" : NumberLong(95),
    "subdocs" : {
        "length" : NumberLong(11),
        "desc" : "000"
    }
}

What is the correct way to do this? Any ideas?

Answer 1

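First of all, using the $unwind operator as suggested in the other answer will hurt the performance of your application, because unwinding the array produces more documents to process further down the pipeline. There has been a better way to achieve this since MongoDB 2.6.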

That being said, this is a perfect job for the $filter operator, which is new in MongoDB 3.2.

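The most efficient way to do this is in MongoDB 3.4. MongoDB 3.4 introduces the $in array operator for the aggregation framework. It can be used in the $filter conditional expression (cond): when it evaluates to true, the subdocument is included in the resulting array.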

let values = [ '000', '011' ];

db.collection.aggregate([ 
    { "$project": { 
        "filename": 1, 
        "cod": 1, 
        "subdocs": { 
            "$filter": { 
                "input": "$subdocs", 
                "as": "s", 
                "cond": { "$in": [ "$$s.desc", values ] }
            } 
        } 
    }} 
])
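
To make the semantics of $filter combined with $in concrete, here is a rough plain-JavaScript emulation run against the sample data from the question. It does not talk to MongoDB: the helper name filterSubdocs is made up for illustration, and the sample documents are simplified (no ObjectId, NumberLong values written as plain numbers).

```javascript
// Plain-JavaScript emulation of the $project/$filter stage above.
// Not MongoDB code: filterSubdocs is a hypothetical helper, and the
// sample documents are simplified (no _id, plain numbers).
const docs = [
  { filename: "file1", cod: 90, subdocs: [
    { length: 10, desc: "000" },
    { length: 15, desc: "011" },
    { length: 30, desc: "038" }
  ]},
  { filename: "file2", cod: 95, subdocs: [
    { length: 11, desc: "000" },
    { length: 21, desc: "018" },
    { length: 41, desc: "008" }
  ]}
];

function filterSubdocs(documents, values) {
  return documents.map(doc => ({
    filename: doc.filename,
    cod: doc.cod,
    // Keep only the subdocuments whose desc is one of the values,
    // mirroring "cond": { "$in": [ "$$s.desc", values ] }.
    subdocs: doc.subdocs.filter(s => values.includes(s.desc))
  }));
}

const result = filterSubdocs(docs, ["000", "011"]);
console.log(JSON.stringify(result, null, 2));
```

Each input document yields exactly one output document, and subdocs stays an array even when only one element matches. When nothing matches, the array is empty rather than the document being dropped.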

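In MongoDB 3.2 we need a slightly different approach, because we cannot use the $in operator there. Luckily we have the $setIsSubset operator which, as you might have guessed, performs a set operation on two arrays and returns true if the first array is a subset of the second. Because the first expression of $setIsSubset must be an array, we need to turn the desc field into an array in our pipeline. To do this, we simply use [] brackets to create an array field, which is new in MongoDB 3.2.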

db.collection.aggregate([ 
    { "$project": { 
        "filename": 1, 
        "cod": 1, 
        "subdocs": { 
            "$filter": { 
                "input": "$subdocs", 
                "as": "s", 
                "cond": { "$setIsSubset": [ [ "$$s.desc" ], values ] }
            } 
        } 
    }} 
])

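MongoDB 3.0 is dead to me, but if for some reason you are running that version, you can use the $literal operator to return the one-element array needed for the set operation, together with the $setDifference operator. This is left as an exercise for the reader.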
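
For readers stuck on a server without $filter, one possible shape of such a pipeline is sketched below. This is an assumption-laden variant, not the $literal approach mentioned above: it marks non-matching elements with false via $map and $cond, compares desc against each value with $or and $eq, and then strips the false placeholders with $setDifference. The function name buildLegacyPipeline is made up for illustration, and the sketch has not been run against a 3.0 server.

```javascript
// Sketch only: builds a pipeline for servers that lack $filter.
// buildLegacyPipeline is a hypothetical name; this variant uses
// $or/$eq comparisons instead of a $literal one-element array.
function buildLegacyPipeline(values) {
  return [
    { "$project": {
      "filename": 1,
      "cod": 1,
      "subdocs": {
        "$setDifference": [
          { "$map": {
            "input": "$subdocs",
            "as": "s",
            // Emit the subdocument when desc equals any value, else false.
            "in": {
              "$cond": [
                { "$or": values.map(v => ({ "$eq": [ "$$s.desc", v ] })) },
                "$$s",
                false
              ]
            }
          }},
          // Remove the false placeholders left by non-matching elements.
          [ false ]
        ]
      }
    }}
  ];
}

const legacyPipeline = buildLegacyPipeline([ "000", "011" ]);
// In the shell this would be passed as: db.collection.aggregate(legacyPipeline)
console.log(JSON.stringify(legacyPipeline, null, 2));
```

Because $setDifference is a set operation, identical subdocuments collapse into one and the order of the remaining elements is not guaranteed, so this is only a reasonable substitute when subdocuments are distinct and order does not matter.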

Answer 2

You just need to add $group and $push. First you $unwind the subdocs so that $match can be applied to them, then you $group on _id and $push the matching subdocs back into an array.

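db.ftmp.aggregate([{
    // one document per subdocument
    $unwind: "$subdocs"
}, {
    // keep only the subdocuments with the wanted desc values
    $match: {
        "subdocs.desc": {
            $in: ["000", "011"]
        }
    }
}, {
    // reassemble one document per original _id
    $group: {
        _id: "$_id",
        subdocs: {
            $push: "$subdocs"
        },
        filename: {
            $first: "$filename"
        },
        cod: {
            $first: "$cod"
        }
    }
}])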
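
The pipeline in the question started with a $match before the $unwind. Keeping that stage discards documents with no matching subdocument before they are unwound, and a $match at the very start of a pipeline can use an index on subdocs.desc if one exists. A minimal sketch of the combined pipeline follows; the helper name buildGroupPipeline is made up for illustration.

```javascript
// Sketch only: buildGroupPipeline is a hypothetical helper name.
function buildGroupPipeline(values) {
  const byDesc = { "subdocs.desc": { $in: values } };
  return [
    { $match: byDesc },       // drop documents with no matching subdocument
    { $unwind: "$subdocs" },  // one document per subdocument
    { $match: byDesc },       // keep only the matching subdocuments
    { $group: {               // reassemble one document per original _id
      _id: "$_id",
      subdocs: { $push: "$subdocs" },
      filename: { $first: "$filename" },
      cod: { $first: "$cod" }
    }}
  ];
}

const groupPipeline = buildGroupPipeline(["000", "011"]);
// In the shell this would be passed as: db.ftmp.aggregate(groupPipeline)
console.log(groupPipeline.map(stage => Object.keys(stage)[0]).join(" -> "));
// prints "$match -> $unwind -> $match -> $group"
```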

Select subdocuments whose field value is in a given array
