Query MongoDB to return only documents that have exactly one value matching a condition

Time: 2017-12-05 10:35:41

Tags: mongodb

I want MongoDB to return the whole document if it has exactly one array element that matches a condition. I wrote the following:

db.myCollection.find({ $where: "this.Tags.filter(x => x.indexOf(':') < 0).length === 1" })

It works fine, except that it is slow, because the $where clause does not use indexes.

Is it possible to rewrite this query somehow as a normal find / match operation that can use indexes, or is this the only way to do such a thing? I can definitely add some field like ..., but my question is about a more generic approach that does not require changing how the data is inserted.

2 answers:

Answer 0: (score: 1)

After several hours of googling and searching Stack Overflow, I wrote the following solution:

db.myCollection.aggregate([
    { $match : { "Tags": ":image" } },
    { $unwind : "$Tags" },
    { $match : { "Tags": /^[^:]+$/ } },
    { $group : { _id : "$_id", doc: { "$first": "$$ROOT" }, count: { $sum : 1} }} ,
    { $match : { "count": 1 } },
    { $replaceRoot : {newRoot: "$doc"} },
    { $addFields : { Tags : [ "$Tags" ] } } // we unwound all tags, so we wrap this field back into an array; otherwise we can get a type error
])

It is 10 times faster than the original code: 3 seconds versus 31 seconds on my machine.

Sample input:

{
    "_id" : ObjectId("53396223ec8bd02674b1208c"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.834Z"),
    "Tags" : [ 
        "cars", 
        " car_diler", 
        " autodiler", 
        " auto", 
        " audi", 
        ":image"
    ]
},
{
    "_id" : ObjectId("53396223ec8bd02674b1208d"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        ":image"
    ]
},
{
    "_id" : ObjectId("53396223ec8bd02674b1208e"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "cars", 
        ":image"
    ]
},
{
    "_id" : ObjectId("53396223ec8bd02674b1208f"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "something",
        ":image",
        ":somethingelse"
    ]
},
{
    "_id" : ObjectId("53396223ec8bd02674b120ff"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "something",
        ":somethingelse"
    ]
}

Current output:

{
    "_id" : ObjectId("53396223ec8bd02674b1208e"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "cars"
    ]
},
{
    "_id" : ObjectId("53396223ec8bd02674b1208f"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "something"
    ]
}

Desired output:

{
    "_id" : ObjectId("53396223ec8bd02674b1208e"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "cars", 
        ":image"
    ]
},
{
    "_id" : ObjectId("53396223ec8bd02674b1208f"),
    "UploadDate" : ISODate("2014-03-31T12:40:03.835Z"),
    "Tags" : [ 
        "something",
        ":image",
        ":somethingelse"
    ]
}

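As you can see, I lose all tags that start with : here. In my case that is good enough, but for others it might matter. I could collect the IDs first and then query for them, but doing everything in a single query is essential.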
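
One possible way to keep the complete Tags array within a single query is to stash a copy of the array before $unwind and restore it at the end. This is only a sketch, not something from the answer above: the AllTags field name is a hypothetical helper, and the pipeline assumes MongoDB 3.4 or later (for $addFields and $replaceRoot). It has not been timed against the numbers quoted above.

```javascript
// Sketch: same idea as the pipeline above, but the original array survives.
// "AllTags" is a hypothetical temporary field, not part of the original data.
const pipeline = [
    { $match: { Tags: ":image" } },               // can use an index on Tags, if one exists
    { $addFields: { AllTags: "$Tags" } },         // keep an untouched copy of the array
    { $unwind: "$Tags" },                         // one document per tag
    { $match: { Tags: /^[^:]+$/ } },              // keep only tags without a colon
    { $group: { _id: "$_id", doc: { $first: "$$ROOT" }, count: { $sum: 1 } } },
    { $match: { count: 1 } },                     // exactly one tag without a colon
    { $replaceRoot: { newRoot: "$doc" } },
    { $addFields: { Tags: "$AllTags" } },         // restore the full array
    { $project: { AllTags: 0 } }                  // drop the temporary copy
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
    printjson(db.myCollection.aggregate(pipeline).toArray());
}
```

The copy makes every unwound document larger, so this trades some memory for keeping the result in the desired shape.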

Answer 1: (score: 1)

Here is a more concise version that does not need any $unwind:

db.myCollection.aggregate([
{
    $addFields: { // we want to add new field...
        "NumberOfTagsWithoutSemiColon": {
            $size: { // ...that shall contain the number...
                $filter: {
                    input: "$Tags", // ...of all tags...
                    cond: {
                        $eq: // ...that do not contain a colon
                        [
                            { $indexOfBytes: [ "$$this", ":" ] },
                            -1
                        ]
                    }
                }
            }
        }
    }
}, {
    $match: {
        "NumberOfTagsWithoutSemiColon": 1 // we only keep the documents where exactly one tag contains no colon
    }
    }
}, {
    $project: {
        "NumberOfTagsWithoutSemiColon": 0
    }
}])
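
To see what the $filter / $size combination counts, the same logic can be mirrored in plain JavaScript against the sample input from the other answer. This is an illustration only: the helper name countTagsWithoutColon is made up, and the _id values are shortened to their last characters for readability.

```javascript
// Sample documents from the other answer, reduced to the fields used here.
// The _id values are abbreviated; they are not real ObjectIds.
const docs = [
    { _id: "1208c", Tags: ["cars", " car_diler", " autodiler", " auto", " audi", ":image"] },
    { _id: "1208d", Tags: [":image"] },
    { _id: "1208e", Tags: ["cars", ":image"] },
    { _id: "1208f", Tags: ["something", ":image", ":somethingelse"] },
    { _id: "120ff", Tags: ["something", ":somethingelse"] }
];

// Mirrors $size of $filter with cond { $eq: [ { $indexOfBytes: [tag, ":"] }, -1 ] }.
function countTagsWithoutColon(tags) {
    return tags.filter(tag => tag.indexOf(":") === -1).length;
}

// Mirrors the $match stage: keep documents with exactly one such tag.
const matched = docs.filter(doc => countTagsWithoutColon(doc.Tags) === 1);

console.log(matched.map(doc => doc._id)); // [ '1208e', '1208f', '120ff' ]
```

Note that the last document is matched too, because this pipeline has no stage requiring the ":image" tag. If that restriction is wanted, a leading { $match: { "Tags": ":image" } } stage, as in the other answer, would presumably be needed.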