MongoDB map-reduce for search criteria

Date: 2016-11-23 01:21:26

Tags: mongodb algorithm mapreduce

I have Mongo documents that contain a field called searchTerms. This is an array of words, e.g. ["term1", "term2", "term3", "term4"].

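I want to write a function that returns documents ordered by relevance. That is, the documents whose searchTerms contain the most of the search terms come first, followed by those with the next highest number of matching terms, and so on.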

Example:

Documents:

   {"_id":"1", "searchTerms":["a","b","c","d"]}
   {"_id":"2", "searchTerms":["a","b","x","q"]}
   {"_id":"3", "searchTerms":["a","e","x","n"]}
   {"_id":"4", "searchTerms":["e","f","g","z"]}

For the search terms ["a", "b", "c"], the result should be:

{"_id":"1", "searchTerms":["a","b","c","d"]}
{"_id":"2", "searchTerms":["a","b","x","q"]}
{"_id":"3", "searchTerms":["a","e","x","n"]}

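I have already written a function that does this, but it is very complicated and, I think, inefficient. I have been reading about map-reduce and wonder whether it could help in this case. I have been racking my brain trying to work out how to do it, and I am not sure whether it is even possible. If it is, can someone show me how it would work?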

1 answer:

Answer 0: (score: 1)

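A simple set operator is enough; map-reduce is not needed. Use $setIntersection to compare searchTerms with the input array, and $project the $size of the intersecting array. Then $sort on that size in descending order and project the final response.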

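aggregate([{
    "$project": {
        "_id":0,
        // keep the whole original document under "fields"
        "fields" : "$$ROOT",
        // number of search terms the document shares with the input array
        "matches": {
            "$size": {
                "$setIntersection": [
                    // input array; use ["a", "b", "c"] for the question's example
                    "$searchTerms", ["a", "b"]
                ]
            }
        }
    }
}, {
    // most matching terms first
    "$sort": {
        "matches": -1
    }
}])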
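
The pipeline above also returns documents with no matching terms (with matches equal to 0), whereas the expected result in the question leaves document 4 out. A minimal sketch of one way to handle that is to put a $match with $in in front of the projection, so that only documents sharing at least one term are scored. The function name buildRelevancePipeline and the collection name in the usage comment are hypothetical, and the output shape differs slightly from the answer's (searchTerms is kept at the top level instead of under "fields").

```javascript
// Hypothetical helper: build the aggregation stages for a list of search terms.
function buildRelevancePipeline(terms) {
  return [
    // Keep only documents that contain at least one of the terms.
    { $match: { searchTerms: { $in: terms } } },
    {
      $project: {
        searchTerms: 1,
        // Number of terms shared between the document and the input array.
        matches: { $size: { $setIntersection: ["$searchTerms", terms] } },
      },
    },
    // Most matching terms first.
    { $sort: { matches: -1 } },
  ];
}

const pipeline = buildRelevancePipeline(["a", "b", "c"]);
// Assumed usage in the mongo shell, with a hypothetical collection name:
//   db.documents.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Filtering first also keeps the $project and $sort stages from running over documents that cannot appear in the result.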