I have a collection like this. It holds a lot of data, ~8 GB.
> db.collector.find({},{'first':1,srcport: true,dstport:true,dOctets:true,_id:0}).skip(1682000)
{ "dstport" : 34760, "dOctets" : 104, "first" : NumberLong("1476941688344"), "srcport" : 443 }
{ "dstport" : 443, "dOctets" : 104, "first" : NumberLong("1476941689944"), "srcport" : 59326 }
{ "dstport" : 59326, "dOctets" : 104, "first" : NumberLong("1476941690034"), "srcport" : 443 }
{ "dstport" : 5222, "dOctets" : 164, "first" : NumberLong("1476941698934"), "srcport" : 58918 }
{ "dstport" : 443, "dOctets" : 92, "first" : NumberLong("1476941698974"), "srcport" : 42704 }
{ "dstport" : 443, "dOctets" : 116, "first" : NumberLong("1476941698974"), "srcport" : 34716 }
{ "dstport" : 34716, "dOctets" : 104, "first" : NumberLong("1476941698984"), "srcport" : 443 }
{ "dstport" : 42704, "dOctets" : 80, "first" : NumberLong("1476941698984"), "srcport" : 443 }
{ "dstport" : 58918, "dOctets" : 104, "first" : NumberLong("1476941699024"), "srcport" : 5222 }
{ "dstport" : 123, "dOctets" : 152, "first" : NumberLong("1476941699244"), "srcport" : 123 }
{ "dstport" : 123, "dOctets" : 152, "first" : NumberLong("1476941699294"), "srcport" : 123 }
{ "dstport" : 54526, "dOctets" : 394, "first" : NumberLong("1476941700394"), "srcport" : 3389 }
{ "dstport" : 3389, "dOctets" : 104, "first" : NumberLong("1476941700394"), "srcport" : 54526 }
{ "dstport" : 123, "dOctets" : 152, "first" : NumberLong("1476941701254"), "srcport" : 123 }
{ "dstport" : 5678, "dOctets" : 402, "first" : NumberLong("1476941703414"), "srcport" : 39926 }
{ "dstport" : 5678, "dOctets" : 268, "first" : NumberLong("1476941703414"), "srcport" : 39926 }
{ "dstport" : 5678, "dOctets" : 399, "first" : NumberLong("1476941703414"), "srcport" : 46336 }
{ "dstport" : 5678, "dOctets" : 266, "first" : NumberLong("1476941703414"), "srcport" : 46336 }
{ "dstport" : 5678, "dOctets" : 381, "first" : NumberLong("1476941703414"), "srcport" : 46575 }
{ "dstport" : 5678, "dOctets" : 387, "first" : NumberLong("1476941703414"), "srcport" : 46845 }
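I want to compute "top" statistics over it (the ports with the most traffic). The steps I have in mind:
0) Match on the time range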
{'$match': {
    first: {
        '$gte': startdate,
        '$lte': stopdate
    }
}}
1) Group by dstport and sum dOctets
'$group': {_id: { port:"$dstport"...
2) Group by srcport and sum dOctets
'$group': {_id: { port:"$srcport"...
3) Combine the results of groups 1 and 2
4) Group by _id.port and sum
5) Sort and limit
The result I want should look like this:
[{port:443, inOctets:123456, outOctets:321654, sum: 445110}...
I tried using the aggregation pipeline, but I could not find a way to fork it into two groups.
Can I do this without a temporary collection?
Answer 0 (score: 1)
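MongoDB 3.4 supports $facet, which lets you run several independent sub-pipelines (for example, your two kinds of group) over the same input documents within a single stage.
It provides the ability to process multiple pipelines on the input documents and outputs a document that contains the results of these pipelines.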
https://docs.mongodb.com/master/release-notes/3.4-reference/#pipe._S_facet
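A rough sketch of how the steps from the question could map onto $facet, not a tested solution for your data. The names buildPortStatsPipeline, byDst, bySrc and all are made up for illustration, and I am assuming that traffic grouped by dstport counts as inOctets and traffic grouped by srcport counts as outOctets; swap them if your convention is the other way round. The merge uses $concatArrays and $unwind, and relies on $sum ignoring a field that is missing from a document.

```javascript
// Hypothetical helper: builds the pipeline for steps 0-5 of the question.
// startdate/stopdate are millisecond timestamps compared against "first".
function buildPortStatsPipeline(startdate, stopdate, limit) {
  return [
    // 0) keep only flows inside the time window
    { $match: { first: { $gte: startdate, $lte: stopdate } } },
    // 1) and 2) two independent groups over the same matched documents
    { $facet: {
        byDst: [{ $group: { _id: "$dstport", inOctets: { $sum: "$dOctets" } } }],
        bySrc: [{ $group: { _id: "$srcport", outOctets: { $sum: "$dOctets" } } }]
    } },
    // 3) join both result arrays into one and flatten it again
    { $project: { all: { $concatArrays: ["$byDst", "$bySrc"] } } },
    { $unwind: "$all" },
    // 4) regroup per port; $sum skips the counter an entry does not have
    { $group: {
        _id: "$all._id",
        inOctets: { $sum: "$all.inOctets" },
        outOctets: { $sum: "$all.outOctets" }
    } },
    { $project: {
        _id: 0,
        port: "$_id",
        inOctets: 1,
        outOctets: 1,
        sum: { $add: ["$inOctets", "$outOctets"] }
    } },
    // 5) busiest ports first, keep the top N
    { $sort: { sum: -1 } },
    { $limit: limit }
  ];
}

// In the mongo shell (3.4+), something like:
// db.collector.aggregate(buildPortStatsPipeline(startdate, stopdate, 10), { allowDiskUse: true })
```

Keep in mind that $facet emits a single document, so its output is subject to the 16 MB BSON document limit. With at most 65536 distinct ports per facet that should normally fit, but it is worth checking on the real collection.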