Question

I have a collection 'comments' that looks like this:

[
{
comment_id: 10001,
aspects: [
 {
   name: 'aspectA',
   positive: 2,
   negative: 3,
   neutral: 1
 },
 {
   name: 'aspectB',
   positive: 1,
   negative: 5,
   neutral: 3
 }
]
},
{
comment_id: 10002,
aspects: [
 {
   name: 'aspectA',
   positive: 2,
   negative: 1,
   neutral: 2
 },
 {
   name: 'aspectB',
   positive: 3,
   negative: 4,
   neutral: 1
 }
]
}
]

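The collection holds more than 100K documents. I need to find the positive, negative, and neutral counts for every aspect, i.e. the sum of positive, negative, and neutral for each aspect in the aspects field (a list of dictionaries, as shown above) across all documents. I found that mapreduce can be used for this, but I couldn't find enough documentation to build the query.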

Is there a way to do this with a single query?

Answer 1

To sum by aspects.name, you can use the following aggregation:

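db.comments.aggregate([{
    // Emit one document per element of the aspects array
    $unwind: "$aspects"
}, {
    // Group the unwound documents by aspect name and sum each counter
    $group: {
        _id: "$aspects.name",
        "positive": { $sum: "$aspects.positive" },
        "negative": { $sum: "$aspects.negative" },
        "neutral": { $sum: "$aspects.neutral" }
    }
}])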

Using pymongo:

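from pymongo import MongoClient
import pprint

# Connect to a local MongoDB server on the default port
client = MongoClient('localhost', 27017)

db = client.testDB

# Same two stages as the shell version, written as Python dicts
pipeline = [
    # Emit one document per element of the aspects array
    {"$unwind": "$aspects"},
    # Group by aspect name and sum each counter
    {"$group": {
        "_id": "$aspects.name",
        "positive": {"$sum": "$aspects.positive"},
        "negative": {"$sum": "$aspects.negative"},
        "neutral": {"$sum": "$aspects.neutral"}
    }}
]

# aggregate() returns a cursor; list() materializes the results
pprint.pprint(list(db.comments.aggregate(pipeline)))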
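
To check what the pipeline should return without a running server, here is a rough plain-Python sketch of the same two steps applied to the two sample documents from the question. The helper name sum_by_aspect is made up for illustration and is not part of pymongo or MongoDB. Results are sorted by aspect name only to make the output deterministic; as far as I know, $group itself does not guarantee any output order.

```python
from collections import defaultdict

# Sample documents, mirroring the two comments shown in the question.
comments = [
    {"comment_id": 10001, "aspects": [
        {"name": "aspectA", "positive": 2, "negative": 3, "neutral": 1},
        {"name": "aspectB", "positive": 1, "negative": 5, "neutral": 3},
    ]},
    {"comment_id": 10002, "aspects": [
        {"name": "aspectA", "positive": 2, "negative": 1, "neutral": 2},
        {"name": "aspectB", "positive": 3, "negative": 4, "neutral": 1},
    ]},
]


def sum_by_aspect(docs):
    """Hypothetical helper: emulate $unwind on aspects, then $group by aspects.name."""
    totals = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for doc in docs:
        # $unwind: visit each element of the aspects array on its own
        for aspect in doc.get("aspects", []):
            # $group: the aspect name plays the role of _id
            bucket = totals[aspect["name"]]
            for key in ("positive", "negative", "neutral"):
                # $sum: add up the counter, treating a missing field as 0
                bucket[key] += aspect.get(key, 0)
    # Sorted by name here only so the printed result is stable
    return [{"_id": name, **counts} for name, counts in sorted(totals.items())]


result = sum_by_aspect(comments)
print(result)
# [{'_id': 'aspectA', 'positive': 4, 'negative': 4, 'neutral': 3},
#  {'_id': 'aspectB', 'positive': 4, 'negative': 9, 'neutral': 4}]
```

Run against a collection that contains only these two documents, the aggregation above should give the same per-aspect totals, possibly in a different order.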

Mongo aggregate: sum of values in a list of dictionaries across all documents
