MongoDB: aggregating results into a nested array

Time: 2017-04-09 18:09:25

Tags: mongodb mongodb-query

I am new to MongoDB and I am currently stuck on a problem. Below are 2 sample records from my database:

{
    "_id": 1,
    "Record": 1,
    "Link": [ "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html" ],
    "Location": [ "USA", "PAN", "USA", "USA", "PAN" ],
    "Organization": [ "GN", "SOUTHCOM", "UCMJ", "PRC" ],
    "Date": [ "2016" ],
    "People": [ "P.Walter" ]
}
{
    "_id": 2,
    "Record": 2,
    "Link": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html" ],
    "Location": [ "NIC", "GTM", "JAM", "GTM", "PAN" ],
    "Organization": [ "CENTAM", "Calibre Mining Corporation", "STRATFOR", "Alder Resources" ],
    "Date": [ "2013" ],
    "People": [ "Daniel Ortega", "Hugo Chavez", "Paulo Gregoire" ]
}

Basically, I am trying to get output like this:

{
    "Country": "US",
    "Years": [
        {
            "Year": "2016",
            "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
             "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html",
             "https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ]
        },
        {
            "Year": "2013",
            "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
             "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html",
             "https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ]
        }
    ],
    "Link_Count": 6
}
{
    "Country": "UK",
    "Years": [
        {
            "Year": "2009",
            "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
             "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html",
             "https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ]
        },
        {
            "Year": "2011",
            "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
             "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html" ]
        }
    ],
    "Link_Count": 5
}

I tried to aggregate the data, but I could not get the output shown above. Here is my query:

db.test.aggregate([
{
"$unwind": "$Location"
},
{
    "$group" : {
        "_id": {
            "Country": "$Location",
            "Year": "$Date",
            "Links": "$Link"
        },
        Loc: {
            $addToSet: "$Location"
        }
    }
},
{
    "$unwind": "$Loc"
},
{
    "$group": {
        "_id": "$Loc",
        "Years": { "$push": {
            "Year": "$_id.Year",
            "Links": "$_id.Links"
            }
        }
    }
}
]).toArray()

I used $unwind and $addToSet on $Location because Location contains duplicates. I am open to any suggestions or solutions, so please let me know! Thanks in advance!

1 answer:

Answer 0: (score: 0)

You can use:

db.test.aggregate([{
    "$unwind": "$Location"
}, {
    "$unwind": "$Date"
}, {
    "$unwind": "$Link"
}, {
    "$group": {
        "_id": {
            "Country": "$Location",
            "Year": "$Date"
        },
        Links: {
            $addToSet: "$Link"
        }
    }
}, {
    "$group": {
        "_id": "$_id.Country",
        Years: {
            $push: {
                "Year": "$_id.Year",
                "Links": "$Links"
            }
        },
        Link_Count: { $sum: { $size: "$Links" } }
    }
}])

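The idea is to $unwind all of the arrays, collect the distinct links for each country and year with $addToSet, $push each year and its links into a new array in the last $group stage, and use $sum with $size in that same stage to count the links per country.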
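
To see what each stage contributes, here is a minimal sketch that emulates the pipeline in plain JavaScript, with no MongoDB connection. It is an illustration only, not how the server executes the query. The helper names `unwind` and `linksByCountry` are made up for this example, and the sample documents are trimmed to the three fields the pipeline reads.

```javascript
// In-memory copies of the two sample records (only the fields the pipeline uses).
const docs = [
  {
    _id: 1,
    Link: ["https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html"],
    Location: ["USA", "PAN", "USA", "USA", "PAN"],
    Date: ["2016"],
  },
  {
    _id: 2,
    Link: ["https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html"],
    Location: ["NIC", "GTM", "JAM", "GTM", "PAN"],
    Date: ["2013"],
  },
];

// Hypothetical helper mimicking $unwind: one output row per array element.
function unwind(rows, field) {
  return rows.flatMap((row) =>
    row[field].map((value) => ({ ...row, [field]: value }))
  );
}

// Hypothetical helper mimicking the three $unwind stages and both $group stages.
function linksByCountry(input) {
  let rows = input;
  for (const field of ["Location", "Date", "Link"]) {
    rows = unwind(rows, field);
  }

  // First $group: key on (Country, Year); a Set plays the role of $addToSet,
  // so repeated locations such as "USA", "USA", "USA" do not duplicate links.
  const byCountryYear = new Map();
  for (const row of rows) {
    const key = JSON.stringify([row.Location, row.Date]);
    if (!byCountryYear.has(key)) byCountryYear.set(key, new Set());
    byCountryYear.get(key).add(row.Link);
  }

  // Second $group: key on Country; $push one entry per year and
  // add up the size of each Links array, like $sum over $size.
  const byCountry = new Map();
  for (const [key, links] of byCountryYear) {
    const [country, year] = JSON.parse(key);
    if (!byCountry.has(country)) {
      byCountry.set(country, { _id: country, Years: [], Link_Count: 0 });
    }
    const entry = byCountry.get(country);
    entry.Years.push({ Year: year, Links: [...links] });
    entry.Link_Count += links.size;
  }
  return [...byCountry.values()];
}

const result = linksByCountry(docs);
console.log(JSON.stringify(result, null, 2));
```

With these two records, "PAN" appears in both documents, so its entry has two years and a Link_Count of 2, while "USA" has a single year with one link despite being listed three times. Note that the country ends up in `_id` rather than in a `Country` field as in the desired output; appending a $project stage that maps `_id` to `Country` should take care of the rename if that matters.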