This is my first time using MongoDB aggregation queries. My dataset looks like this:
{ // doc 1
"_id" : ObjectId("55f2481bc9b4cd1c0c198c9f"),
"channels" : [
"channel_3",
"channel_2",
"channel_1",
"channel_4"
],
"msd" : 25,
"uid" : "000012bb-2e5a-8bd3-d36a-fa037973e632"
}
{ // doc 2
"_id" : ObjectId("55f2481bc9b4cd123452345f"),
"channels" : [
"channel_3",
"channel_4"
],
"msd" : 50,
"uid" : "000012bb-2e5a-8bd3-d36a-fa037973e632"
}
{ // doc 3
"_id" : ObjectId("55f2481bc9b4cd1c0c198c9f"),
"channels" : [
"channel_2"
],
"msd" : 100,
"uid" : "000012bb-2e5a-8bd3-d36a-fa037973e632"
}
{ // doc 4
"_id" : ObjectId("55f2481bc9b4cd1c0c198c9f"),
"channels" : [
"channel_2"
],
"msd" : 80,
"uid" : "000012bb-2e5a-8bd3-d36a-fa037973e632"
}
I have built a compound index:
userlog.create_index([('uid', ASCENDING), ('channels', ASCENDING)])
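Now, given a user and an array of channels, I want to retrieve the average msd over the documents in which at least one channel appears in the queried channels. For example, the query is:
{"uid" : "000012bb-2e5a-8bd3-d36a-fa037973e632", "channels" : ["channel_1", "channel_2"]}
The channels of doc 1 contain both "channel_1" and "channel_2", and the channels of docs 3 and 4 contain "channel_2". So the expected return value is (25 + 100 + 80) / 3 = 68.33.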
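To make the expected value concrete, here is a minimal pure-Python sketch of the calculation, run against copies of the four sample documents (only the channels and msd fields are kept). The helper name expected_average is purely illustrative and is not part of any MongoDB API.

```python
# Copies of the four sample documents, reduced to the fields that matter here.
docs = [
    {"channels": ["channel_3", "channel_2", "channel_1", "channel_4"], "msd": 25},
    {"channels": ["channel_3", "channel_4"], "msd": 50},
    {"channels": ["channel_2"], "msd": 100},
    {"channels": ["channel_2"], "msd": 80},
]
query_channels = {"channel_1", "channel_2"}


def expected_average(documents, wanted):
    """Average msd over documents sharing at least one channel with `wanted`.

    Each matching document is counted exactly once, however many of its
    channels match. Returns None when nothing matches.
    """
    matched = [d["msd"] for d in documents if wanted.intersection(d["channels"])]
    return sum(matched) / len(matched) if matched else None


print(round(expected_average(docs, query_channels), 2))  # 68.33
```

Docs 1, 3 and 4 match, doc 2 does not, so the average is taken over three values.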
====================== Trial 1 ======================
CODE:
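channels = ["channel_1", "channel_2"]  # the queried channels from the example above
pipe = [
    {"$unwind": '$channels'},  # emits one document per array element
    {"$match": {'uid': "000012bb-2e5a-8bd3-d36a-fa037973e632", 'channels': {'$in': channels}}},
    {"$group": {'_id': '$channels', 'averageMSD': {'$avg': '$msd'}}}
]
# db is assumed to be the collection handle here (the same collection as userlog above)
for res in db.aggregate(pipeline=pipe):
    print(res)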
Result:
{'_id': 'channel_1', 'averageMSD': 25.0}
{'_id': 'channel_2', 'averageMSD': 68.33333333333333}
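It seems that "$unwind" causes doc 1 to be counted twice, once under "channel_1" and once under "channel_2", which is not what I want. Also, "$unwind" is very slow.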
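The double counting can be reproduced without a database. The following is only a rough pure-Python imitation of the three stages ($unwind, then $match with $in, then $group by channel), not MongoDB's actual implementation. The function name unwind_match_group is made up for illustration, and the uid filter is left out because all four sample documents share the same uid.

```python
from collections import defaultdict

# Copies of the four sample documents, reduced to the fields that matter here.
docs = [
    {"channels": ["channel_3", "channel_2", "channel_1", "channel_4"], "msd": 25},
    {"channels": ["channel_3", "channel_4"], "msd": 50},
    {"channels": ["channel_2"], "msd": 100},
    {"channels": ["channel_2"], "msd": 80},
]
channels = ["channel_1", "channel_2"]


def unwind_match_group(documents, wanted):
    # Imitates $unwind: one row per (channel, msd) pair, so doc 1 yields four rows.
    rows = [(ch, d["msd"]) for d in documents for ch in d["channels"]]
    # Imitates $match with $in on the now-scalar channel value.
    rows = [(ch, msd) for ch, msd in rows if ch in wanted]
    # Imitates $group keyed on the channel, with $avg over msd.
    groups = defaultdict(list)
    for ch, msd in rows:
        groups[ch].append(msd)
    return {ch: sum(values) / len(values) for ch, values in groups.items()}


for channel, average in unwind_match_group(docs, channels).items():
    print(channel, average)
```

Doc 1 survives the filter twice (as a "channel_1" row and as a "channel_2" row), so its msd of 25 feeds both groups. The "channel_2" average happens to equal the desired 68.33 only because doc 1 is the sole document containing "channel_1" and it also contains "channel_2".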
====================== Trial 2 ======================
CODE:
pipe = [
    {"$match": {'uid': "000012bb-2e5a-8bd3-d36a-fa037973e632", 'channels': {'$in': channels}}},
    {"$group": {'_id': '$channels', 'averageMSD': {'$avg': '$msd'}}}
]
for res in db.aggregate(pipeline=pipe):
    print(res)
Result:
{'averageMSD': 90.0, '_id': ['channel_2']}
{'averageMSD': 25.0, '_id': ['channel_3', 'channel_2', 'channel_1', 'channel_4']}
The result is still not what I want. It seems I should not group the results by "channels", but I don't know how to fix it.
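As far as I can tell, without "$unwind" the group key is the whole channels array, so documents are only merged when their arrays are identical. Here is a rough pure-Python imitation of that behaviour (not MongoDB's actual implementation). The function name match_then_group_by_array is made up, tuples stand in for the array keys, and the uid filter is omitted because all sample documents share one uid.

```python
from collections import defaultdict

# Copies of the four sample documents, reduced to the fields that matter here.
docs = [
    {"channels": ["channel_3", "channel_2", "channel_1", "channel_4"], "msd": 25},
    {"channels": ["channel_3", "channel_4"], "msd": 50},
    {"channels": ["channel_2"], "msd": 100},
    {"channels": ["channel_2"], "msd": 80},
]
channels = ["channel_1", "channel_2"]


def match_then_group_by_array(documents, wanted):
    # Imitates $match with $in on an array field: a document matches once
    # if any of its channels is in `wanted`.
    matched = [d for d in documents if set(wanted) & set(d["channels"])]
    # Imitates $group keyed on the entire array; a tuple is used as a
    # hashable stand-in for the array value.
    groups = defaultdict(list)
    for d in matched:
        groups[tuple(d["channels"])].append(d["msd"])
    return {key: sum(values) / len(values) for key, values in groups.items()}


for key, average in match_then_group_by_array(docs, channels).items():
    print(list(key), average)
```

Docs 3 and 4 share the array ["channel_2"] and average to 90.0, while doc 1 sits alone in its own group with 25.0, which matches the two rows printed by Trial 2.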
How can I use aggregation to query the database efficiently?
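Answer 0 (score: 0)
I figured it out. Grouping with '_id': None puts every matched document into a single group, so each document is counted once: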
pipe = [
    {"$match": {'uid': "000012bb-2e5a-8bd3-d36a-fa037973e632", 'channels': {'$in': channels}}},
    {"$group": {'_id': None, 'averageMSD': {'$avg': '$msd'}}}
]
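For reuse, the same pipeline can be wrapped in a small helper. This is only a sketch: the function name build_average_msd_pipeline is made up, and the commented-out connection code assumes a local MongoDB server and a placeholder database name "mydb", neither of which comes from the question.

```python
def build_average_msd_pipeline(uid, channels):
    """Build a $match + $group pipeline averaging msd over matching documents.

    $match with $in on the array field keeps a document once if any of its
    channels is in `channels`; grouping on _id None then folds all matched
    documents into a single group.
    """
    return [
        {"$match": {"uid": uid, "channels": {"$in": list(channels)}}},
        {"$group": {"_id": None, "averageMSD": {"$avg": "$msd"}}},
    ]


pipe = build_average_msd_pipeline(
    "000012bb-2e5a-8bd3-d36a-fa037973e632", ["channel_1", "channel_2"]
)
print(pipe)

# With a live connection, usage might look like this (names are assumptions):
# from pymongo import MongoClient
# userlog = MongoClient()["mydb"]["userlog"]  # "mydb" is a placeholder
# result = next(userlog.aggregate(pipe), None)
# average = result["averageMSD"] if result else None  # None when nothing matches
```

Because "$match" is the first stage and filters on uid and channels, it should be able to use the compound index from the question, and no "$unwind" is needed.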