How can I group in an aggregation and still display other fields in MongoDB?

Time: 2014-01-04 00:21:24

Tags: mongodb

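I need to run $group twice to find the post whose comments have the highest average number of likes. Below is the initial stage of my query.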

db.posts.aggregate([
    {"$unwind": "$comments"},
    {"$match":
        {
            "comments.type": {
                "$ne" : "spam"
            }
        }
    }
])

This is what I see after running the query above.

    {
        "_id" : ObjectId("50b59cd75bed76f46522c465"),
        "comment_id" : 49,
        "post_id" : 29,
        "likes" : {
            "type" : "accepted",
            "like" : 3
        }
    },
    {
        "_id" : ObjectId("50b59cd75bed76f46522c465"),
        "comment_id" : 49,
        "post_id" : 29,
        "likes" : {
            "type" : "rejected",
            "like" : 7
        }
    }

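What I want to do now is, from these valid records, first find the average number of likes for each comment; then, for each post, sum those per-comment averages and divide by the total number of comments in that post.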

Note that comment_id is unique only within the same post_id. In other words, there are also records with post_id 28 and comment_id 49.
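
To make the intended arithmetic concrete, here is a minimal sketch in plain JavaScript rather than in the aggregation framework. The sample rows and the `averageBy` helper are hypothetical and exist only for illustration. Because comment_id repeats across posts, the first grouping key combines post_id and comment_id.

```javascript
// Hypothetical unwound, non-spam records: one row per like entry.
const rows = [
  { post_id: 29, comment_id: 49, like: 3 },
  { post_id: 29, comment_id: 49, like: 7 },
  { post_id: 29, comment_id: 50, like: 1 },
  { post_id: 28, comment_id: 49, like: 10 },
];

// Hypothetical helper: average valueOf(item) over items sharing keyOf(item).
function averageBy(items, keyOf, valueOf) {
  const totals = new Map();
  for (const item of items) {
    const key = keyOf(item);
    const t = totals.get(key) || { sum: 0, count: 0 };
    t.sum += valueOf(item);
    t.count += 1;
    totals.set(key, t);
  }
  const result = new Map();
  for (const [key, t] of totals) result.set(key, t.sum / t.count);
  return result;
}

// Step 1: average likes per comment, keyed by post_id AND comment_id.
const avgPerComment = averageBy(rows, r => `${r.post_id}:${r.comment_id}`, r => r.like);

// Step 2: average of the per-comment averages, keyed by post_id alone.
const avgPerPost = averageBy(
  [...avgPerComment],
  ([key]) => key.split(":")[0],
  ([, avg]) => avg
);

console.log(avgPerPost); // post "29" averages 5 and 1, post "28" has a single comment
```

Keying step 1 on comment_id alone would merge comment 49 of post 28 with comment 49 of post 29, which is why the compound key matters.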

I tried this query.

db.posts.aggregate([
    {"$unwind": "$comments"},
    {"$match":
        {
            "comments.type": {
                "$ne" : "spam"
            }
        }
    },
    {"$group" :
        {
            "_id": "$_id",
            "comment_avg":
            {
                "$avg":"$comments.like"
            }
        }
    }])

This is what I get back:

    {
        "_id" : ObjectId("50b59cd75bed76f46522c44d"),
        "comment_avg" : 61.074253191058865
    },
    {
        "_id" : ObjectId("50b59cd75bed76f46522c34e"),
        "comment_avg" : 46.82622896256565
    }

As you can see, I have lost the post_id information. I tried $project, but I think I must be doing it wrong.

1 Answer:

Answer 0: (score: 1)

You have not posted the initial document structure.

Document Structure:

{
    "_id" : ObjectId("50b59cd75bed76f46522c471"),
    "comment_id" : 61,
    "post_id" : 29,
    "comments" : [
                   {
                       "type" : "accepted",
                       "like" : 3
                   },
                   {
                      "type" : "rejected",
                      "like" : 3
                   },
                   {
                      "type" : "spam",
                      "like" : 3
                   }
                ]
}

Assuming your document structure is as shown above, I have written this query. You will have to adapt it to your own needs.

db.posts.aggregate([
        {$unwind:"$comments"},
        {$match:{"comments.type":{$ne:"spam"}}},
        {$group:{_id:{post_id:"$post_id",comment_id:"$comment_id"},LikeSum:{$sum:"$comments.like"}}},
        {$group:{_id:{post_id:"$_id.post_id"},AvgComments:{$avg:"$LikeSum"}}},
        {$sort:{AvgComments:-1}},
        {$limit:1}
              ])

The query above is constructed as follows:

1.) Unwind the comments array and form individual documents for each element in the comments array
2.) Select only the non-spam comments
3.) Calculate the sum of likes for each comment, grouping on post_id and comment_id together because comment_id is unique only within a post
4.) Calculate the average Comment likes for each post
5.) Sort documents in descending order of Average Comment Likes
6.) Select only the first document.
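
The six steps can be traced by hand with a small plain-JavaScript sketch that mimics each stage on in-memory data. The sample documents below are hypothetical and only follow the assumed document structure; this illustrates the logic and is not how MongoDB executes the pipeline.

```javascript
// Hypothetical documents shaped like the assumed structure above.
const posts = [
  { post_id: 29, comment_id: 61, comments: [
      { type: "accepted", like: 3 },
      { type: "rejected", like: 3 },
      { type: "spam", like: 3 } ] },
  { post_id: 29, comment_id: 62, comments: [ { type: "accepted", like: 10 } ] },
  { post_id: 30, comment_id: 61, comments: [ { type: "accepted", like: 5 } ] },
];

// 1.) $unwind: one document per element of the comments array.
const unwound = posts.flatMap(p =>
  p.comments.map(c => ({ post_id: p.post_id, comment_id: p.comment_id, comments: c })));

// 2.) $match: keep only the non-spam entries.
const nonSpam = unwound.filter(d => d.comments.type !== "spam");

// 3.) First $group: sum of likes per (post_id, comment_id).
const likeSums = new Map();
for (const d of nonSpam) {
  const key = `${d.post_id}:${d.comment_id}`;
  likeSums.set(key, (likeSums.get(key) || 0) + d.comments.like);
}

// 4.) Second $group: average LikeSum per post_id.
const postTotals = new Map();
for (const [key, sum] of likeSums) {
  const postId = Number(key.split(":")[0]);
  const t = postTotals.get(postId) || { total: 0, n: 0 };
  t.total += sum;
  t.n += 1;
  postTotals.set(postId, t);
}

// 5.) and 6.) $sort descending by AvgComments, then $limit 1.
const top = [...postTotals]
  .map(([post_id, t]) => ({ _id: { post_id }, AvgComments: t.total / t.n }))
  .sort((a, b) => b.AvgComments - a.AvgComments)
  .slice(0, 1);

console.log(top);
```

Because post_id is part of the first group key, it is still available to the second group as `_id.post_id`, which is how it survives to the final result.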

The output document will look similar to this:

{
    "result" : [
        {
            "_id" : {
                       "post_id" : xx
                    },
            "AvgComments" : xx.xx // Avg Comment likes for post xx
        }
               ],
    "ok" : 1
}
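
The question asks for the average of likes per comment rather than the sum. A possible variant, assuming the same document structure, swaps the first `$sum` for `$avg`. The field name `LikeAvg` is a hypothetical choice for this sketch, and the pipeline has not been run against the asker's data.

```javascript
// Sketch only: assumes the document structure shown in the answer.
const pipeline = [
  { $unwind: "$comments" },
  // Field paths used as keys in $match take no "$" prefix.
  { $match: { "comments.type": { $ne: "spam" } } },
  // Average likes per comment; the compound _id keeps post_id available.
  { $group: {
      _id: { post_id: "$post_id", comment_id: "$comment_id" },
      LikeAvg: { $avg: "$comments.like" } // LikeAvg is a hypothetical name
  } },
  // Average of the per-comment averages for each post.
  { $group: { _id: { post_id: "$_id.post_id" }, AvgComments: { $avg: "$LikeAvg" } } },
  { $sort: { AvgComments: -1 } },
  { $limit: 1 },
];

// In the mongo shell this would be passed as: db.posts.aggregate(pipeline)
```

Fields that are not part of the group key can also be carried through a `$group` stage with an accumulator such as `$first`.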