Aggregate to get the total count of key/value pairs in an array, grouping on the values of named fields

Time: 2015-07-27 05:15:15

Tags: mongodb mongodb-query aggregation-framework

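My documents are structured as follows, where each array element contains "n" and "v" as the key and value for different types of data. I need to group by the values of the keys "facility", "ip" and "num", and count the distinct combinations in the collection.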

{ 
    "_id" : 1, 
    "logs" : [
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 },
        { "n" : "ip", "v" : "137.68.151.104" },
        { "n" : "protocol", "v" : "55902/udp" },
        { "n" : "port", "v" : "53" } 
    ]
},
{ 
    "_id" : 2, 
    "logs" : [ 
        { "n" : "facility", "v" : 26 }, 
        { "n" : "num", "v" : 6 },
        { "n" : "ip", "v" : "137.68.160.51" }, 
        { "n" : "protocol", "v" : "13438/tcp" }, 
        { "n" : "port", "v" : "13438" } 
    ]
},
{ 
    "_id" : 3,
    "logs" : [
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 }, 
        { "n" : "ip", "v" : "137.68.160.51" }, 
        { "n" : "protocol", "v" : "13434/tcp" },
        { "n" : "port", "v" : "53" } 
    ]
},
{ 
    "_id" : 4,
    "logs" : [
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 }, 
        { "n" : "ip", "v" : "137.68.160.184" },
        { "n" : "protocol", "v" : "61662/udp" },
        { "n" : "port", "v" : "53" } 
    ]
},
{ 
    "_id" : 5, 
    "logs" : [ 
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 }, 
        { "n" : "ip", "v" : "137.68.160.51" }, 
        { "n" : "protocol", "v" : "13435/tcp" }, 
        { "n" : "port", "v" : "13435" } 
    ]
},
{ 
    "_id" : 6,
    "logs" : [ 
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 },
        { "n" : "ip", "v" : "137.68.160.51" },
        { "n" : "protocol", "v" : "61662/udp" },
        { "n" : "port", "v" : "53" }
    ]
}

My query selection criteria are:

  1. port is 53
  2. protocol is 'udp' or 'tcp'
  3. group by [facility, num, ip]

That should select four of the six documents, and that part is working.

I want a result like this:

    {facility : 26, num : 6, ip : 137.68.151.104 , count : 1}
    {facility : 26, num : 6, ip : 137.68.160.51 , count : 2}
    {facility : 26, num : 6, ip : 137.68.160.184 , count : 1}
    

This is what I have so far:

    db.agg.aggregate ([
    {
    '$match' : { 'logs' : {'$all' : [{'$elemMatch' : {'n' : "port", "v" : "53"}}, {'$elemMatch' : {'n' : "protocol", "v" : {"$in" :[/udp/,/tcp/]}}}   ]}}     },
    { '$unwind' : '$logs' },
    { '$match' : {"logs.n" : "ip"}},
    { '$group' : { _id : { 'ip' : '$logs.v'}, count : {$sum : 1}}}
    ])
    

But I can't work out how to get all of the fields in there; at the moment I only get results for "ip".

2 Answers:

Answer 0 (score: 1)

Please check the following:

db.exp.aggregate([
    // Keep only documents with port "53" and a udp or tcp protocol
    { $match: { logs: { "$all": [
        { "$elemMatch": { "n": "port", "v": "53" } },
        { "$elemMatch": { "n": "protocol", "v": { "$in": [/udp/, /tcp/] } } }
    ] } } },
    // Emit one document per array element
    { $unwind: "$logs" },
    // Turn each key of interest into its own field, null otherwise
    { $project: {
        facility: { $cond: { if: { $eq: ["$logs.n", "facility"] }, then: "$logs.v", else: null } },
        num: { $cond: { if: { $eq: ["$logs.n", "num"] }, then: "$logs.v", else: null } },
        ip: { $cond: { if: { $eq: ["$logs.n", "ip"] }, then: "$logs.v", else: null } }
    } },
    // Collapse back to one row per original document; $max ignores the nulls
    { $group: {
        _id: "$_id",
        facility: { "$max": "$facility" },
        num: { "$max": "$num" },
        ip: { "$max": "$ip" }
    } },
    // Count documents per distinct (facility, num, ip) combination
    { $group: {
        _id: { facility: "$facility", num: "$num", ip: "$ip" },
        count: { "$sum": 1 }
    } }
]);

The query above produces the desired result:

{ "_id" : { "facility" :26, "num" : 6,
    "ip" : "137.68.151.104" }, "count" : 1 
}
{ "_id" : { "facility" : 26, "num" : 6,
    "ip" : "137.68.160.51" }, "count" : 2 
}
{ "_id" : { "facility" : 26, "num" : 6,
    "ip" : "137.68.160.184" }, "count" : 1 
}
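
As a sanity check on those counts, the same select, pivot, and count logic can be mimicked in plain JavaScript over the question's six sample documents. This is only an in-memory sketch and does not touch MongoDB; `mk` and `pivot` are hypothetical helpers introduced for the illustration.

```javascript
// Hypothetical compact constructor for the question's sample documents.
const mk = (_id, ip, protocol, port) => ({ _id, logs: [
  { n: "facility", v: 26 }, { n: "num", v: 6 }, { n: "ip", v: ip },
  { n: "protocol", v: protocol }, { n: "port", v: port }
] });

const docs = [
  mk(1, "137.68.151.104", "55902/udp", "53"),
  mk(2, "137.68.160.51", "13438/tcp", "13438"),
  mk(3, "137.68.160.51", "13434/tcp", "53"),
  mk(4, "137.68.160.184", "61662/udp", "53"),
  mk(5, "137.68.160.51", "13435/tcp", "13435"),
  mk(6, "137.68.160.51", "61662/udp", "53")
];

// Pivot the n/v pairs of one document into a plain object.
const pivot = doc => Object.fromEntries(doc.logs.map(e => [e.n, e.v]));

// Same selection as the $match stage: port "53" and a udp or tcp protocol.
const selected = docs.map(pivot)
  .filter(d => d.port === "53" && /udp|tcp/.test(d.protocol));

// Count rows per distinct (facility, num, ip) combination.
const counts = new Map();
for (const { facility, num, ip } of selected) {
  const key = JSON.stringify({ facility, num, ip });
  counts.set(key, (counts.get(key) || 0) + 1);
}

for (const [key, count] of counts) {
  console.log({ ...JSON.parse(key), count });
}
```

Four documents pass the selection (`_id` 1, 3, 4 and 6), and two of them share the address 137.68.160.51, which is where the count of 2 comes from.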

Answer 1 (score: -2)

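Your logic goes wrong where you try to match after $unwind. Since the items are no longer in an array at that point, you need to match all of the key values you want against the field, not just "ip".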
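
To see why the original pipeline returns only "ip", it can help to imitate the `$unwind` and the second `$match` in plain JavaScript. This is an in-memory sketch only: `unwindLogs` is a hypothetical helper and not a MongoDB API, and the two documents are abbreviated from the question's sample data.

```javascript
// Two sample documents, abbreviated from the question's data.
const docs = [
  { _id: 1, logs: [
    { n: "facility", v: 26 }, { n: "num", v: 6 },
    { n: "ip", v: "137.68.151.104" }, { n: "port", v: "53" }
  ] },
  { _id: 3, logs: [
    { n: "facility", v: 26 }, { n: "num", v: 6 },
    { n: "ip", v: "137.68.160.51" }, { n: "port", v: "53" }
  ] }
];

// Hypothetical helper imitating { $unwind: "$logs" }:
// one output row per array element, with `logs` now a single object.
function unwindLogs(documents) {
  return documents.flatMap(doc =>
    doc.logs.map(entry => ({ _id: doc._id, logs: entry })));
}

const unwound = unwindLogs(docs);

// Imitates { $match: { "logs.n": "ip" } }: the facility and num rows are
// dropped here, so a later $group has only ip values to work with.
const ipOnly = unwound.filter(row => row.logs.n === "ip");

// Imitates { $match: { "logs.n": { $in: ["ip", "facility", "num"] } } }.
const wanted = unwound.filter(row =>
  ["ip", "facility", "num"].includes(row.logs.n));

console.log(unwound.length, ipOnly.length, wanted.length); // 8 2 6
```

With the wider filter, each original document still contributes its facility, num and ip rows, which is what the later grouping needs.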

Then you can turn them into fields with the $cond operator and some creative grouping:

db.agg.aggregate([
    { "$match": {
       "logs" : {
           "$all": [
               { "$elemMatch": { "n": "port", "v": "53" } },
               { "$elemMatch": { "n": "protocol", "v": { "$in" :[/udp/,/tcp/] } } }
           ]
       }
    }},
    { "$unwind": "$logs" },
    { "$match": { "logs.n": { "$in": ["ip","facility","num"] } } },
    { "$group": {
        "_id": "$_id",
        "facility": {
            "$min": {
                "$cond": [
                    { "$eq": [ "$logs.n", "facility" ] },
                    "$logs.v",
                    false
                ]
            }
        },
        "ip": {
            "$min": {
                "$cond": [
                    { "$eq": [ "$logs.n", "ip" ] },
                    "$logs.v",
                    false
                ]
            }
        },
        "num": {
            "$min": {
                "$cond": [
                    { "$eq": [ "$logs.n", "num" ] },
                    "$logs.v",
                    false
                ]
            }
        }
    }},
    { "$group": {
       "_id": {
           "facility": "$facility",
           "ip": "$ip",
           "num": "$num"
       },
       "count": { "$sum": 1 }
    }}
 ])

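The $min accumulator is used to discard the false values, leaving only the wanted value for each "field". This works because booleans sort after numbers and strings in MongoDB's comparison order, so any real value is smaller than false.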
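
The effect of `$min` with a `false` fallback can be illustrated roughly in plain JavaScript. The sketch assumes only numbers, strings and booleans occur, as they do in this pipeline, and ranks them the way MongoDB's cross-type comparison does (numbers, then strings, then booleans). `typeRank` and `bsonLikeMin` are hypothetical names for this illustration, not the server's implementation.

```javascript
// Relative rank of the three value types that appear in this pipeline.
const typeRank = { number: 1, string: 2, boolean: 3 };

// Hypothetical stand-in for the $min accumulator, limited to those types.
function bsonLikeMin(values) {
  return values.reduce((best, v) => {
    const rankBest = typeRank[typeof best];
    const rankV = typeRank[typeof v];
    // Different types: the lower-ranked type wins regardless of value.
    if (rankV !== rankBest) return rankV < rankBest ? v : best;
    // Same type: ordinary comparison.
    return v < best ? v : best;
  });
}

// Per document, each $cond yields the real value once and false otherwise.
console.log(bsonLikeMin([26, false, false]));              // 26
console.log(bsonLikeMin([false, false, "137.68.160.51"])); // 137.68.160.51
```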

The result is as follows:

{ "_id" : { "facility" : 26, "ip" : "137.68.151.104", "num" : 6 }, "count" : 1 }
{ "_id" : { "facility" : 26, "ip" : "137.68.160.184", "num" : 6 }, "count" : 1 }
{ "_id" : { "facility" : 26, "ip" : "137.68.160.51", "num" : 6 }, "count" : 2 }