MongoDB: query an array and count keywords

Date: 2016-10-19 07:11:17

Tags: arrays mongodb

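I want to query a sub-key of an array, extract the keywords, and count them.

The words I want to extract are in "Title". When I apply a query filter, it shows that the field I need to query is Reviews.0.Title. But I have at least 200 elements in the Reviews array.

How can I do this?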

{
    "_id" : ObjectId("561c3ccc4c97f053753f1a78"),
    "Reviews" : [
        {
            "Ratings" : {
                "Service" : "4",
                "Overall" : "5"
            },
            "Location" : "MIS",
            "Title" : "“Excellent and great”",
            "Author" : "JDoe",
            "ReviewID" : "1",
            "Date" : "March 30, 2015"
        },
        {
            "Ratings" : {
                "Service" : "4",
                "Overall" : "5"
            },
            "Location" : "WIS",
            "Title" : "“Excellent and fantastic!”",
            "Author" : "John Doe",
            "ReviewID" : "2",
            "Date" : "March 27, 2016"
        }
    ],
    "Info" : {
        "Name" : "AA",
        "ID" : "0001"
    }
}

{
    "_id" : ObjectId("561c3ccc4c97f0ytu7289074"),
    "Reviews" : [
        {
            "Ratings" : {
                "Service" : "4",
                "Overall" : "5"
            },
            "Location" : "VEG",
            "Title" : "“Not too bad”",
            "Author" : "JDoe",
            "ReviewID" : "3",
            "Date" : "March 30, 2015"
        },
        {
            "Ratings" : {
                "Service" : "4",
                "Overall" : "5"
            },
            "Location" : "NEV",
            "Title" : "“Outstanding service”",
            "Author" : "John Doe",
            "ReviewID" : "4",
            "Date" : "March 27, 2016"
        }
    ],
    "Info" : {
        "Name" : "BB",
        "ID" : "0002"
    }
}

I would like to get the following output:

{ "_id" : "Excellent", "value" : 1 }
{ "_id" : "Great", "value" : 1 }
{ "_id" : "Location", "value" : 2 }

Edit: output broken down by name:
{ "Name" : AA: "Excellent", "value" : 1 "Great", "value" : 1 }
{ "Name" : BB: "Great", "value" : 1 }

1 answer:

Answer 0 (score: 0)

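I managed to find a mapReduce solution that provides functionality similar to what you are looking for here.

I adapted that solution and came up with the following mapReduce query: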

db.reviews.mapReduce(
    // Map function: emit (word, 1) for every word in every review title
    function () {
        if (this.Reviews) {
            this.Reviews.forEach(function (review) {
                if (review["Title"]) {
                    // Remove all punctuation from the review title
                    var titleNoPunctuation = review["Title"].replace(/[^\w\s]/g, "");

                    // Lower-case the title and split it into word tokens on whitespace
                    var titleTokens = titleNoPunctuation.toLowerCase().split(/\s+/);

                    titleTokens.forEach(function (word) {
                        // Skip empty tokens left by leading or trailing whitespace
                        if (word) {
                            emit(word, 1);
                        }
                    });
                }
            });
        }
    },

    // Reduce function: sum the counts emitted for each word
    function (key, values) {
        var counter = 0;

        values.forEach(function (value) {
            counter += value;
        });

        return counter;
    },

    // Output collection
    { out: "word_count" }
);
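To check the tokenization without a server, the map and reduce steps can be replayed in plain JavaScript. This is only a sketch: the docs array holds just the Title fields copied from the two documents in the question, and tokenize and countWords are hypothetical helper names, not part of MongoDB.

```javascript
// Titles copied from the two sample documents in the question.
const docs = [
  { Reviews: [{ Title: "“Excellent and great”" }, { Title: "“Excellent and fantastic!”" }] },
  { Reviews: [{ Title: "“Not too bad”" }, { Title: "“Outstanding service”" }] }
];

// Tokenizer along the lines of the map function:
// strip punctuation, lower-case, split on whitespace, drop empty tokens.
function tokenize(title) {
  return title
    .replace(/[^\w\s]/g, "")
    .toLowerCase()
    .split(/\s+/)
    .filter(function (word) { return word.length > 0; });
}

// Stand-in for map + reduce: every token adds 1 to the total for its word.
function countWords(documents) {
  const counts = new Map();
  documents.forEach(function (doc) {
    (doc.Reviews || []).forEach(function (review) {
      if (!review.Title) return;
      tokenize(review.Title).forEach(function (word) {
        counts.set(word, (counts.get(word) || 0) + 1);
      });
    });
  });
  return counts;
}

console.log(countWords(docs));
```

For these two documents, "excellent" and "and" are each counted twice and every other word once.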

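The mapReduce query above stores its results in a separate collection named by the out property, word_count in this example.

So, to get the result set returned by the mapReduce job, you need to run the following query against the out collection (word_count). The output has the following form (the counts depend on the data in your collection):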

> db.word_count.find();
{ "_id" : "excellent", "value" : 1 }
{ "_id" : "great", "value" : 1 }
{ "_id" : "location", "value" : 2 }
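As a possible alternative on newer servers (mapReduce is deprecated as of MongoDB 5.0), the same count can be sketched as an aggregation pipeline. The sketch below assumes MongoDB 4.2 or later (for $regexFindAll) and the reviews collection name used in the answer. wordCountPipeline and its byName flag are hypothetical names introduced here for illustration.

```javascript
// Builds an aggregation pipeline that counts the words in Reviews.Title.
// byName = false: one result per word.
// byName = true:  one result per (Info.Name, word) pair.
function wordCountPipeline(byName) {
  return [
    // One document per element of the Reviews array
    { $unwind: "$Reviews" },
    // Lower-case the title and extract runs of word characters
    {
      $project: {
        _id: 0,
        name: "$Info.Name",
        words: {
          $map: {
            input: {
              $regexFindAll: {
                input: { $toLower: "$Reviews.Title" },
                regex: "\\w+"
              }
            },
            as: "m",
            in: "$$m.match"
          }
        }
      }
    },
    // One document per word
    { $unwind: "$words" },
    // Count how often each word (or each name/word pair) occurs
    {
      $group: {
        _id: byName ? { name: "$name", word: "$words" } : "$words",
        value: { $sum: 1 }
      }
    },
    { $sort: { _id: 1 } }
  ];
}

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.reviews.aggregate(wordCountPipeline(false)).forEach(printjson);
}
```

Calling wordCountPipeline(true) groups on both Info.Name and the word, which is one way to approach the per-name breakdown requested in the question's edit.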