My dataset looks like this:
{"BrandId":"a","SessionId":100,"UserName":"tom"}
{"BrandId":"a","SessionId":200,"UserName":"tom"}
{"BrandId":"b","SessionId":300,"UserName":"mike"}
I want to count the distinct sessions and user names, grouped by BrandId. The equivalent SQL would be something like:
select brandid,count_distinct(sessionid),count_distinct(username)
from data
group by brandid
I tried to write this as a MongoDB aggregation. My current code is below, and it does not work. Is there any way to make it work?
db.logs.aggregate([
{$group:{
_id:{brand:"$BrandId",user:"$UserName",session:"$SessionId"},
count:{$sum:1}}},
{$group:{
_id:"$_id.brand",
users:{$sum:"$_id.user"},
sessions:{$sum:"$_id.session"}
}}
])
For the sample data above, the expected counts are:
{"BrandId":"a","countSession":2,"countUser":1}
{"BrandId":"b","countSession":1,"countUser":1}
If you know SQL, the expected result is the same as that of the SQL query above.
Answer 0 (score: 3)
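You can do this by using $addToSet to accumulate the distinct SessionId and UserName values during the $group, and then adding a $project stage to the pipeline that uses the $size operator to get the size of each set: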
db.logs.aggregate([
{$group: {
_id: '$BrandId',
sessionIds: {$addToSet: '$SessionId'},
userNames: {$addToSet: '$UserName'}
}},
{$project: {
_id: 0,
BrandId: '$_id',
countSession: {$size: '$sessionIds'},
countUser: {$size: '$userNames'}
}}
])
Result:
{
"BrandId" : "b",
"countSession" : 1,
"countUser" : 1
},
{
"BrandId" : "a",
"countSession" : 2,
"countUser" : 1
}
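
If it helps to see what the two stages compute, here is a rough plain-JavaScript sketch of the same idea, run outside MongoDB. It is only an illustration under a few assumptions: the docs array is the three sample documents from the question, and countDistinctByBrand is a made-up helper name, not a MongoDB API. A Set per brand plays the role of $addToSet, and reading its size plays the role of $size.

```javascript
// Sample documents copied from the question.
const docs = [
  {BrandId: "a", SessionId: 100, UserName: "tom"},
  {BrandId: "a", SessionId: 200, UserName: "tom"},
  {BrandId: "b", SessionId: 300, UserName: "mike"}
];

// Hypothetical helper (not part of MongoDB): groups by BrandId and
// counts distinct SessionId and UserName values per group.
function countDistinctByBrand(documents) {
  const groups = new Map();
  for (const doc of documents) {
    // One pair of sets per brand, like the $group stage with $addToSet.
    if (!groups.has(doc.BrandId)) {
      groups.set(doc.BrandId, {sessionIds: new Set(), userNames: new Set()});
    }
    const group = groups.get(doc.BrandId);
    group.sessionIds.add(doc.SessionId); // duplicates are ignored by Set
    group.userNames.add(doc.UserName);
  }
  // Reading each set's size mirrors the $project stage with $size.
  return Array.from(groups, ([brandId, group]) => ({
    BrandId: brandId,
    countSession: group.sessionIds.size,
    countUser: group.userNames.size
  }));
}

const result = countDistinctByBrand(docs);
console.log(JSON.stringify(result));
```

For the sample data this gives countSession 2 and countUser 1 for brand "a", and 1 and 1 for brand "b", which matches the expected counts in the question. Note that the sketch returns brands in first-seen order, while the order of documents coming out of $group is not guaranteed unless you add a $sort stage.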