Question

I have a MongoDB collection like this:

{"uid": "01370mask4",
 "title": "hidden",
 "post": "hidden",
 "postTime": "01-23, 2016",
 "unixPostTime": "1453538601",
 "upvote": [2, 3]}

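I want to select the post records of users who have more than 5 posts. The structure should stay the same; I just don't need the documents of users who don't have many posts.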

db.collection.aggregate(
   [
     { $group : { _id : "$uid", count: { $sum: 1 } } }
   ]
)

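Now I'm stuck on how to use the count value for the lookup. I searched but found no way to add the count back to the same collection by uid. MongoDB doesn't seem to support saving the aggregation output and joining it back. Please advise, thanks!

Update

Sorry I wasn't clear earlier, and thanks for the quick answers! I want a subset of the original collection, including the post text, post timestamp, and so on. I don't want a subset of the aggregation output.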

Answer 1

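If you don't have millions of documents, you can take a shortcut and get what you're after with one aggregation followed by a find query.

Aggregation query: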

var users = db.collection.aggregate(
  [
    {$group:{_id:'$uid', count:{$sum:1}}},
    {$match:{count:{$gt:5}}},
    {$group:{_id:null,users:{$push:'$_id'}}}
  ]
).toArray()[0]['users']

Then it's a straightforward query to find the posts of those users:

db.collection.find({uid: {$in: users}})
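
To make the two steps concrete, here is roughly the same logic on a plain in-memory array, with no MongoDB involved. This is only a sketch: the helper name postsOfActiveUsers and the sample data are made up for illustration, and the real queries above are what you would actually run.

```javascript
// Hypothetical helper: mirrors the aggregate + find pair on a plain array.
function postsOfActiveUsers(docs, minPosts) {
  // Step 1 (like $group + $sum): count posts per uid.
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.uid, (counts.get(doc.uid) || 0) + 1);
  }
  // Like $match {count: {$gt: minPosts}}: keep uids with more than minPosts posts.
  const users = new Set();
  for (const [uid, count] of counts) {
    if (count > minPosts) users.add(uid);
  }
  // Step 2 (like find with $in): return the original documents unchanged.
  return docs.filter((doc) => users.has(doc.uid));
}

// Made-up sample: user "a" has 3 posts, user "b" has 1.
const sample = [
  { uid: "a", title: "t1" },
  { uid: "b", title: "t2" },
  { uid: "a", title: "t3" },
  { uid: "a", title: "t4" },
];
const result = postsOfActiveUsers(sample, 2);
console.log(result.map((d) => d.title)); // [ 't1', 't3', 't4' ]
```

The point is that the second step returns the original documents as they are, which is what the question's update asks for.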

Answer 2

Just add a $match after your $group with the right query:

db.collection.aggregate(
  [
    { $group : { _id : "$uid", count: { $sum: 1 } } },
    { $match : { count : { $gt : 5 } } }
  ]
)

Answer 3

Try this to select users with more than 5 posts. To keep the original fields, use $first; if $uid is unique, add the fields as follows.

db.collection.aggregate([
     {$group: {
          _id: '$uid', 
          title: {$first: '$title'}, 
          post: {$first:'$post'}, 
          postTime:{$first: '$postTime'}, 
          unixPostTime:{$first: '$unixPostTime'},
          upvote:{$first: '$upvote'}, 
          count: {$sum: 1}
     }}, 
     {$match: {count: {$gt: 5}}}
])

If the same $uid has multiple values, you should $push them into arrays in the $group stage.
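
One way to sketch the $push idea, assuming MongoDB 3.4 or later (for $replaceRoot): push each whole document with $$ROOT, filter on the count, then unwind the array and promote each post back to the top level, so the output has the same shape as the original collection. The helper name buildActiveUserPostsPipeline and the field name posts are made up for this example; treat it as an untuned sketch rather than the definitive approach.

```javascript
// Hypothetical helper: builds a pipeline that returns the original documents
// of every uid with more than minPosts posts.
function buildActiveUserPostsPipeline(minPosts) {
  return [
    // Collect every full document of a uid into an array and count them.
    { $group: { _id: "$uid", posts: { $push: "$$ROOT" }, count: { $sum: 1 } } },
    // Keep only users with more than minPosts posts.
    { $match: { count: { $gt: minPosts } } },
    // One output document per collected post.
    { $unwind: "$posts" },
    // Promote the collected post back to the top level (original structure).
    { $replaceRoot: { newRoot: "$posts" } },
  ];
}

const pipeline = buildActiveUserPostsPipeline(5);
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell, assuming the collection name from the question:
//   db.collection.aggregate(buildActiveUserPostsPipeline(5))
```

Keep in mind that each group document holds all of a user's posts, so it has to stay under the 16 MB BSON document limit, and a large $group may need the allowDiskUse option.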

If you want to save the result to the db, try it this way:

var cur = db.collection.aggregate(
   [
     {$group: {
          _id: '$uid', 
          title: {$first: '$title'}, 
          post: {$first:'$post'}, 
          postTime:{$first: '$postTime'}, 
          unixPostTime:{$first: '$unixPostTime'},
          upvote:{$first: '$upvote'}, 
          count: {$sum: 1}
     }}, 
     {$match: {count: {$gt: 5}}}
   ]
)
cur.forEach(function(doc) {
   // doc._id holds the uid here (the $group key), so match on uid
   db.collection.update({uid: doc._id}, {/* the fields to update */});
});
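
If the goal is just to store the aggregation output in its own collection, an alternative to looping over the cursor is the $out stage. It has to be the last stage of the pipeline, and it replaces the target collection if one already exists. A minimal sketch: the helper withOutput and the collection name active_users are made up for the example.

```javascript
// Hypothetical helper: returns a copy of the pipeline that ends with $out.
function withOutput(pipeline, targetCollection) {
  return [...pipeline, { $out: targetCollection }];
}

// Per-user counts for users with more than 5 posts.
const counts = [
  { $group: { _id: "$uid", count: { $sum: 1 } } },
  { $match: { count: { $gt: 5 } } },
];

// "active_users" is a made-up target collection name.
const saving = withOutput(counts, "active_users");
console.log(saving.length); // 3

// In the mongo shell, assuming the collection name from the question:
//   db.collection.aggregate(saving)
```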

MongoDB: find matching documents based on an aggregated count
