mongodb $unwind empty array

Time: 2016-08-09 09:33:22

Tags: mongodb aggregation-framework

Given this data:

{
    "_id" : ObjectId("576948b4999274493425c08a"),
    "virustotal" : {
        "scan_id" : "4a6c3dfc6677a87aee84f4b629303c40bb9e1dda283a67236e49979f96864078-1465973544",
        "sha1" : "fd177b8c50b457dbec7cba56aeb10e9e38ebf72f",
        "resource" : "4a6c3dfc6677a87aee84f4b629303c40bb9e1dda283a67236e49979f96864078",
        "response_code" : 1,
        "scan_date" : "2016-06-15 06:52:24",
        "results" : [ 
            {
                "sig" : "Gen:Variant.Mikey.29601",
                "vendor" : "MicroWorld-eScan"
            }, 
            {
                "sig" : null,
                "vendor" : "nProtect"
            }, 
            {
                "sig" : null,
                "vendor" : "CAT-QuickHeal"
            }, 
            {
                "sig" : "HEUR/QVM07.1.0000.Malware.Gen",
                "vendor" : "Qihoo-360"
            }
        ]
    }
},
{
    "_id" : ObjectId("5768f214999274362f714e8b"),
    "virustotal" : {
        "scan_id" : "3d283314da4f99f1a0b59af7dc1024df42c3139fd6d4d4fb4015524002b38391-1466529838",
        "sha1" : "fb865b8f0227e9097321182324c959106fcd8c27",
        "resource" : "3d283314da4f99f1a0b59af7dc1024df42c3139fd6d4d4fb4015524002b38391",
        "response_code" : 1,
        "scan_date" : "2016-06-21 17:23:58",
        "results" : [ 
            {
                "sig" : null,
                "vendor" : "Bkav"
            }, 
            {
                "sig" : null,
                "vendor" : "ahnlab"
            }, 
            {
                "sig" : null,
                "vendor" : "MicroWorld-eScan"
            }, 
            {
                "sig" : "Mal/DrodZp-A",
                "vendor" : "Qihoo-360"
            }
        ]
    }
}

I am trying to group by vendor and count only the results where sig is not null, to get something like this:

{
    "_id" : "Qihoo-360",
    "count" : 2
},
{
    "_id" : "MicroWorld-eScan",
    "count" : 1
},
{
    "_id" : "Bkav",
    "count" : 0
},
{
    "_id" : "CAT-QuickHeal",
    "count" : 0
}

Currently, with this code:

db.analysis.aggregate([ 
    { $unwind: "$virustotal.results"  },
    {
        $group : {
             _id : "$virustotal.results.vendor", 
             count : { $sum : 1 }
        }
    },
    { $sort : { count : -1 } }
])

I get every result counted, including those where sig is null:

{
    "_id" : "Qihoo-360",
    "count" : 2
},
{
    "_id" : "MicroWorld-eScan",
    "count" : 2
},
{
    "_id" : "Bkav",
    "count" : 1
},
{
    "_id" : "CAT-QuickHeal",
    "count" : 1
}

How can I count 0 when sig is null?

2 Answers:

Answer 0 (score: 1)

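You need to use a conditional expression inside the $sum operator that checks whether the "$virustotal.results.sig" key is null, using the comparison operator $gt (as described in the documentation's BSON comparison order). In that order null sorts below strings, so { "$gt": [ "$virustotal.results.sig", null ] } is true for a string sig and false for a null one.

You can restructure your pipeline by adding the following expression: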

db.analysis.aggregate([
    { "$unwind": "$virustotal.results" },
    {
        "$group" : {
            "_id": "$virustotal.results.vendor", 
            "count" : { 
                "$sum": {
                    "$cond": [
                        { "$gt": [ "$virustotal.results.sig", null ] },
                        1, 0
                    ]
                }
            }
        }
    },
    { "$sort" : { "count" : -1 } }
])

Sample output

/* 1 */
{
    "_id" : "Qihoo-360",
    "count" : 2
}

/* 2 */
{
    "_id" : "MicroWorld-eScan",
    "count" : 1
}

/* 3 */
{
    "_id" : "Bkav",
    "count" : 0
}

/* 4 */
{
    "_id" : "CAT-QuickHeal",
    "count" : 0
}

/* 5 */
{
    "_id" : "nProtect",
    "count" : 0
}

/* 6 */
{
    "_id" : "ahnlab",
    "count" : 0
}
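
To make the conditional count easier to check by hand, here is a minimal sketch in plain Python that mirrors the $unwind / $group / $cond logic on the two sample documents from the question. It does not talk to MongoDB; the function name count_detections is hypothetical, and Python's None stands in for BSON null.

```python
from collections import Counter

# The two sample documents, reduced to the fields the pipeline touches.
docs = [
    {"virustotal": {"results": [
        {"sig": "Gen:Variant.Mikey.29601", "vendor": "MicroWorld-eScan"},
        {"sig": None, "vendor": "nProtect"},
        {"sig": None, "vendor": "CAT-QuickHeal"},
        {"sig": "HEUR/QVM07.1.0000.Malware.Gen", "vendor": "Qihoo-360"},
    ]}},
    {"virustotal": {"results": [
        {"sig": None, "vendor": "Bkav"},
        {"sig": None, "vendor": "ahnlab"},
        {"sig": None, "vendor": "MicroWorld-eScan"},
        {"sig": "Mal/DrodZp-A", "vendor": "Qihoo-360"},
    ]}},
]


def count_detections(documents):
    """Count non-null signatures per vendor (hypothetical helper)."""
    counts = Counter()
    for doc in documents:
        # Equivalent of $unwind: visit every element of the results array.
        for result in doc["virustotal"]["results"]:
            # Equivalent of $cond: add 1 when sig is set, otherwise add 0.
            # Adding 0 still creates the key, so vendors with no hits are kept.
            counts[result["vendor"]] += 1 if result.get("sig") is not None else 0
    return counts


counts = count_detections(docs)
# Equivalent of { "$sort": { "count": -1 } }.
for vendor, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(vendor, n)
```

The key point is that every unwound element reaches the group stage and contributes either 1 or 0, so vendors whose sig is always null still appear with a count of 0 instead of being filtered out.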

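Answer 1 (score: 0)

I changed the null value and the count went up, but it still does not look right. Running the query in the mongo shell, I get something like:

{ "_id" : "Kaspersky", "count" : 176.0 }

and from Python:

Kaspersky 64

One of them is wrong :)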

So I tried to work out which part of the query is written incorrectly in Python compared to the mongo shell. I ran a simple query. In the mongo shell:

rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$ne": "null"} } }})

Result: 176

db.analysis.count( { "virustotal.results" : { $elemMatch : { "vendor": "Kaspersky", "sig": { $gt: null } } }})

Result: 0

Then I tried in Python:

rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$ne": "null"} } }})

Result: 568

rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$ne": "None"} } }})

Result: 568

rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$gt": "None"} } }})

Result: 64

rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$gt": "null"} } }})

Result: 6

It is hard to say which value is correct! I think it is 176, but I cannot reproduce it in Python...
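
One possible source of the differing numbers is that "None" and "null" in quotes are ordinary strings, whereas the unquoted Python value None is what gets encoded as BSON null. The sketch below is an illustration under assumptions, not a diagnosis of the actual collection: it uses json.dumps only to show how the two filters differ, and it simulates the comparisons on a small list of made-up signature values. The helper count_gt and the sample values are hypothetical. It assumes that query comparison operators such as $gt only compare values of the same type as the query value, which would also be consistent with { $gt: null } returning 0 in the shell.

```python
import json

# Filter that compares against a real null (Python None).
null_filter = {"virustotal.results": {"$elemMatch": {
    "vendor": "Kaspersky", "sig": {"$ne": None}}}}

# Filter that compares against the four-character string "None".
string_filter = {"virustotal.results": {"$elemMatch": {
    "vendor": "Kaspersky", "sig": {"$ne": "None"}}}}

print(json.dumps(null_filter))    # sig condition serializes as null
print(json.dumps(string_filter))  # sig condition serializes as the string "None"

# Hypothetical signature values: two nulls and four strings.
sigs = [None, None, "Backdoor.Win32.Example", "Trojan.Win32.Example",
        "not-a-virus:Example", "packed.Example"]


def count_gt(values, bound):
    """Count string values that sort after `bound` (hypothetical helper).

    Non-strings are skipped, mimicking a comparison restricted to strings.
    """
    return sum(1 for v in values if isinstance(v, str) and v > bound)


not_null = sum(1 for v in sigs if v is not None)       # what $ne: None asks for
not_string_none = sum(1 for v in sigs if v != "None")  # also counts the nulls
gt_none = count_gt(sigs, "None")   # strings sorting after "None"
gt_null = count_gt(sigs, "null")   # strings sorting after "null"

print(not_null, not_string_none, gt_none, gt_null)
```

In this toy data each spelling yields a different count, in the same way the queries above do. If the stored sig values are real nulls, passing None without quotes from Python, for example "sig": {"$ne": None}, is the form that corresponds to null in the mongo shell.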