Need help with MongoDB aggregate output format.
My input data contains documents like this:
{'parent_id': '133', 'status_id': '209101162445115_1199071210114767', 'author_id': '10209422198664172', 'comment_published': '2016-08-15 08:57:09'}
I need to count the occurrences of each author_id for a given matching parent_id. I did it with aggregation:
m = collection.aggregate([
    {"$match": {"parent_id": "437325203079413_1543639"}},
    {"$group": {"_id": {"author_id": "$author_id"}, "count": {"$sum": 1}}},
    {"$project": {"_id": 1, "count": 1}},  # this stage does not make any difference in the output
])
page = []
for i in m:
    page.append(i)
print(page)
The output is as follows:
[{'_id': {'author_id': '10155430875324466'}, 'count': 1},
{'_id':{'author_id': '1249853341715138'}, 'count': 2},
{'_id': {'author_id': '10153804689530108'}, 'count': 1}]
I would like the output in the following format:
[{'author_id': '10155430875324466', 'count': 1},
{'author_id': '1249853341715138', 'count': 2},
{'author_id': '10153804689530108', 'count': 1}]
Or this:
[{'10155430875324466', 1},
{'1249853341715138', 2},
{'10153804689530108', 1}]
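I know that doing this in Python is slow, and I feel there should be a better solution. Is it possible to do this within the aggregation query itself? Can anyone suggest something?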
Answer 0 (score: 0)
You can try this. You can use author_id directly as the grouping _id, and then, in a final $project stage, project the value of _id as author_id.

db.collection.aggregate([
{ "$match" : { "parent_id" : "437325203079413_1543639" } },
{ "$group" : { "_id" : "$author_id", "count": { "$sum" : 1 } } },
{ "$project" : { "_id" : 0, "author_id" : "$_id", "count" : 1 } }
]);

Or you can keep your original $group and change the final $project stage as shown below.

db.collection.aggregate([
{ "$match" : { "parent_id" : "437325203079413_1543639" } },
{ "$group" : { "_id" : { "author_id": "$author_id"}, "count": { "$sum" : 1 } } },
{ "$project" : { "_id" : 0, "author_id" : "$_id.author_id", "count":1 } }
]);
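
For the pymongo side of the question, here is a rough sketch of how the same pipeline might be written in Python. The helper names build_pipeline and counts_by_author are made up for illustration, and the sample list simply reuses the desired output from the question so the snippet runs without a database. The second helper shows one way to get an author_id to count mapping, which may be what the second requested format was aiming for.

```python
def build_pipeline(parent_id):
    """Return the aggregation stages as plain Python dicts (hypothetical helper)."""
    return [
        {"$match": {"parent_id": parent_id}},
        # Group on the bare field so _id holds the author_id string itself.
        {"$group": {"_id": "$author_id", "count": {"$sum": 1}}},
        # Drop _id and expose its value under the name author_id.
        {"$project": {"_id": 0, "author_id": "$_id", "count": 1}},
    ]


def counts_by_author(docs):
    """Collapse [{'author_id': ..., 'count': ...}, ...] into {author_id: count}."""
    return {doc["author_id"]: doc["count"] for doc in docs}


# With a live pymongo collection (assumed to be bound to `collection`):
# page = list(collection.aggregate(build_pipeline("437325203079413_1543639")))

# Sample documents copied from the desired output in the question.
sample = [
    {"author_id": "10155430875324466", "count": 1},
    {"author_id": "1249853341715138", "count": 2},
    {"author_id": "10153804689530108", "count": 1},
]
print(counts_by_author(sample))
```

Passing the cursor to list() replaces the manual append loop from the question. The final dict reshaping is still done client side, since each aggregation result is returned as a document with named fields.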