I am counting how many times the same value occurs in MongoDB documents of the following type.
{
    "_id" : ObjectId("5b7fa2a745d12b10324d3a01"),
    "date_time" : "24, Aug 2018 11:44",
    "original_query" : "sanjeevnewar",
    "posts" : [
        {
            "full_story" : "https://m.facebook.com//story.php?story_fbid=10156903788391742&id=590746741&refid=17",
            "love" : {
                "total" : "1",
                "reactors" : [
                    {
                        "profile" : "https://m.facebook.com//amrit2100",
                        "name" : "Amrit Kumar Deb",
                        "fb_id" : "100002204201460"
                    }
                ]
            },
            "like" : {
                "total" : "65",
                "reactors" : [
                    {
                        "profile" : "https://m.facebook.com//profile.php?id=100026633915502",
                        "name" : "Kamlesh Yadav",
                        "fb_id" : "100026633915502"
                    },
                    }
                ]
            },
There are multiple entries like this, each for a different fb_id.
My implementation is:
from collections import Counter
import pprint

# db is assumed to be an existing pymongo database handle
result = db.post.find()
list_of_reactor_ids = []
# I use a Counter to accumulate the counts
dict_of_reactor_ids = Counter()
sorted_dict_of_reactor_ids = {}
okey = {}
for doc in result:
    for post in doc['posts']:
        if 'like' in post:
            value = 1
            # I add each reactor's fb_id with its value, then sort the totals
            for reactor in post['like']['reactors']:
                dict_of_reactor_ids.update({reactor['fb_id']: value})
    sorted_dict_of_reactor_ids = sorted(dict_of_reactor_ids.items(),
                                        key=lambda item: item[1], reverse=True)
    if 'fb_id' in doc:
        okey.update({doc['fb_id']: sorted_dict_of_reactor_ids})
pprint.pprint(okey)
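I do get output, but the key-value pairs do not match the real data: some pairs are correct, while others come from entries that belong to other keys. For example, the second value in the output below actually belongs to a different key, yet it appears under the first entry. An excerpt of the output I get: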
{'100000107208292': [('100013453489456', 16),
('100009112778707', 14),
('100011123838353', 14),
'100000990054613': [('100013633776676', 7),
('100010228383344', 7),
('100010068034206', 7),
('100013210631871', 6),
('100014734173448', 5),
('100011640376115', 4)
]
}
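One possible explanation, judging only from the code as posted: the Counter is created once, before the loop over documents, so the counts from earlier documents are still in it when a later document is sorted and stored. The sketch below is a guess at the intended behaviour, not a confirmed fix. It creates a fresh Counter for every document and uses Counter.most_common() in place of the manual sort. The function name count_like_reactors and the in-memory sample_docs are hypothetical stand-ins for the result of db.post.find(), and a top-level fb_id field on each document is assumed because the posted code checks for one.

```python
from collections import Counter


def count_like_reactors(docs):
    """Map each document's fb_id to its 'like' reactors, most frequent first."""
    counts_by_owner = {}
    for doc in docs:
        if 'fb_id' not in doc:
            continue  # assumption: documents without a top-level fb_id are skipped
        counts = Counter()  # fresh Counter per document, so nothing carries over
        for post in doc.get('posts', []):
            for reactor in post.get('like', {}).get('reactors', []):
                counts[reactor['fb_id']] += 1
        # most_common() returns (fb_id, count) pairs sorted by count, descending
        counts_by_owner[doc['fb_id']] = counts.most_common()
    return counts_by_owner


# Hypothetical in-memory documents standing in for db.post.find()
sample_docs = [
    {'fb_id': 'A', 'posts': [
        {'like': {'reactors': [{'fb_id': 'x'}, {'fb_id': 'y'}]}},
        {'like': {'reactors': [{'fb_id': 'x'}]}},
    ]},
    {'fb_id': 'B', 'posts': [
        {'like': {'reactors': [{'fb_id': 'z'}]}},
    ]},
]

counts_by_owner = count_like_reactors(sample_docs)
print(counts_by_owner)
```

With a single shared Counter, the entry for 'B' in this sample would also list 'x' and 'y', which looks like the mixing described above.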