How to sum multidimensional arrays across documents with MongoDB

Time: 2018-08-09 15:49:19

Tags: arrays mongodb aggregation-framework

Say I have many documents like the following:

{
  _id: "abc",
  values: {
    0: { 0: 999999, 1: 999999, …, 59: 1000000 },
    1: { 0: 2000000, 1: 2000000, …, 59: 1000000 },
    …,
    58: { 0: 1600000, 1: 1200000, …, 59: 1100000 },
    59: { 0: 1300000, 1: 1400000, …, 59: 1500000 }
  }
}
{
  _id: "def",
  values: {
    0: { 0: 999999, 1: 999999, …, 59: 1000000 },
    1: { 0: 2000000, 1: 2000000, …, 59: 1000000 },
    …,
    58: { 0: 1600000, 1: 1200000, …, 59: 1100000 },
    59: { 0: 1300000, 1: 1400000, …, 59: 1500000 }
  }
}

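Each document is basically a multidimensional array of 60x60 items. Can aggregation (or any other MongoDB construct) be used to easily sum two (or more) of these matrices? That is, values[x][y] of abc and values[x][y] of def are added together, and likewise for every other element. Ideally the output would be a similar multidimensional array.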
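
To make the requested operation concrete, here is a minimal sketch in plain JavaScript (not a MongoDB query) of the element-wise sum. The function name sumMatrices and the 2x2 sample documents are made up for illustration; the real documents are 60x60.

```javascript
// Plain JavaScript, not MongoDB: spells out the element-wise sum being asked for.
// sumMatrices and the 2x2 sample documents are hypothetical stand-ins.
function sumMatrices(docs) {
  const total = {};
  for (const doc of docs) {
    for (const [x, row] of Object.entries(doc.values)) {
      total[x] = total[x] || {};
      for (const [y, value] of Object.entries(row)) {
        // Add values[x][y] of this document to the running total.
        total[x][y] = (total[x][y] || 0) + value;
      }
    }
  }
  return total;
}

const sampleDocs = [
  { _id: "abc", values: { 0: { 0: 1, 1: 2 }, 1: { 0: 3, 1: 4 } } },
  { _id: "def", values: { 0: { 0: 10, 1: 20 }, 1: { 0: 30, 1: 40 } } },
];

console.log(JSON.stringify(sumMatrices(sampleDocs)));
// {"0":{"0":11,"1":22},"1":{"0":33,"1":44}}
```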

This answer seems to suggest that it is possible with one-dimensional arrays, but I'm not sure about multidimensional ones.

Edit:

Here is an example of the actual data, which has a slightly different format:

db.col.find({}, { _id: 0, hit: 1 })
{ "hit" : [ [ 570, 0, 630, 630, 636, 735, 672, 615, 648, 648, 618, 0 ], 
[ 492, 0, 471, 471, 570, 564, 0, 590, 513, 432, 471, 477 ], 
[ 387, 0, 0, 0, 0, 0, 0, 456, 0, 480, 351, 415 ], 
[ 432, 528, 0, 0, 495, 509, 0, 579, 0, 552, 0, 594 ], 
[ 558, 603, 594, 624, 672, 0, 0, 705, 783, 0, 756, 816 ], 
[ 0, 858, 951, 1027, 0, 0, 1058, 1131, 0, 0, 1260, 1260 ], 
[ 1269, 0, 1287, 0, 1326, 0, 1386, 1386, 1470, 0, 0, 0 ], 
[ 1623, 0, 1695, 1764, 1671, 1671, 0, 1824, 1872, 0, 0, 0 ], 
[ 1950, 1894, 2034, 2034, 0, 0, 1941, 0, 2070, 1911, 2049, 2055 ], 
[ 2052, 2052, 0, 0, 0, 2085, 2007, 2073, 0, 0, 0, 1941 ], 
[ 1878, 1896, 0, 1875, 0, 0, 1677, 0, 1722, 0, 1545, 0 ], 
[ 0, 0, 1317, 1469, 1501, 1634, 1494, 0, 0, 1290, 0, 0 ], 
[ 0, 1485, 1375, 1491, 1530, 1407, 0, 0, 0, 1611, 0, 0 ], 
[ 1652, 1800, 1686, 1643, 1923, 0, 0, 0, 1737, 1604, 1797, 0 ], 
[ 1842, 1806, 0, 1830, 1896, 1947, 0, 1710, 1734, 1725, 0, 0 ], 
[ 0, 0, 1932, 0, 1908, 1878, 1941, 1931, 2007, 2013, 1995, 1995 ], 
[ 0, 2025, 2004, 1927, 0, 0, 1939, 1835, 1962, 1863, 0, 1815 ], 
[ 0, 0, 1839, 1755, 1821, 1821, 1751, 1656, 0, 0, 1467, 0 ], 
[ 0, 1632, 1546, 1449, 0, 1551, 1449, 0, 0, 1554, 0, 1491 ], 
[ 1463, 1411, 0, 1491, 0, 0, 1551, 1467, 0, 0, 0, 1464 ], 
[ 0, 0, 1311, 0, 0, 1471, 0, 0, 1581, 0, 1368, 1368 ], 
[ 1296, 0, 0, 0, 1176, 1381, 0, 1170, 1194, 1194, 1193, 1137 ], 
[ 0, 1244, 1221, 1039, 0, 1041, 930, 921, 1033, 813, 0, 0 ], 
[ 0, 0, 0, 1010, 0, 0, 918, 783, 0, 609, 693, 645 ] ] }

Here is the appropriate query (thanks to Veeram, who fixed my code in the comments):

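db.col.aggregate([
  // Keep only the matrix.
  { $project: { _id: 0, hit: 1 } },
  // Unwind rows, then cells, recording the row index x and the column index y.
  { $unwind: { path: "$hit", includeArrayIndex: "x" } },
  { $unwind: { path: "$hit", includeArrayIndex: "y" } },
  // Sum each cell (x, y) across all documents.
  { $group: { _id: { x: "$x", y: "$y" }, hit: { $sum: "$hit" } } },
  // Rebuild each row in column order.
  { $sort: { "_id.x": 1, "_id.y": 1 } },
  { $group: { _id: "$_id.x", hit: { $push: "$hit" } } },
  // Rebuild the matrix in row order.
  { $sort: { "_id": 1 } },
  { $group: { _id: null, hit: { $push: "$hit" } } }
])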
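
For readers who want to trace what the stages do, the following is a rough plain-JavaScript mirror of that pipeline. It is only a sketch: sumHitMatrices and the small 2x3 sample matrices are hypothetical, and it assumes every document has a matrix of the same shape.

```javascript
// Plain JavaScript mirror of the pipeline above; not MongoDB code.
// sumHitMatrices and the sample data are hypothetical.
function sumHitMatrices(docs) {
  // Two $unwind stages with includeArrayIndex: one record per (x, y, hit).
  const cells = [];
  for (const doc of docs) {
    doc.hit.forEach((row, x) => {
      row.forEach((hit, y) => cells.push({ x, y, hit }));
    });
  }

  // $group by { x, y } with $sum.
  const sums = new Map();
  for (const { x, y, hit } of cells) {
    const key = `${x},${y}`;
    const entry = sums.get(key) || { x, y, hit: 0 };
    entry.hit += hit;
    sums.set(key, entry);
  }

  // $sort by x then y, then $push cells into rows and rows into the matrix.
  const sorted = [...sums.values()].sort((a, b) => a.x - b.x || a.y - b.y);
  const result = [];
  for (const { x, hit } of sorted) {
    if (!result[x]) result[x] = [];
    result[x].push(hit);
  }
  return result;
}

const hitDocs = [
  { hit: [[1, 2, 3], [4, 5, 6]] },
  { hit: [[10, 0, 30], [0, 50, 60]] },
];

console.log(JSON.stringify(sumHitMatrices(hitDocs)));
// [[11,2,33],[4,55,66]]
```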

1 Answer:

Answer 0 (score: 1)

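You need two operators to work with dynamic property names: $objectToArray and $arrayToObject. To aggregate the values across all documents, you can represent each (x, y) pair as a separate document (using $unwind) and then use several $group stages to collapse everything back into a single document. To restore the original order of rows and columns, apply $sort twice: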
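
As a rough illustration of what the two operators do, here are plain-JavaScript analogues. The helpers objectToArray and arrayToObject below are local functions written for this sketch, not MongoDB APIs; they only imitate the { k, v } pair shape that the operators use.

```javascript
// Rough plain-JavaScript analogues, for illustration only.
// objectToArray/arrayToObject are local helpers, not MongoDB APIs.
const objectToArray = (obj) =>
  Object.entries(obj).map(([k, v]) => ({ k, v }));
const arrayToObject = (pairs) =>
  Object.fromEntries(pairs.map(({ k, v }) => [k, v]));

// A shortened row from the question's sample document.
const row = { 0: 999999, 1: 999999, 59: 1000000 };

const pairs = objectToArray(row);
console.log(JSON.stringify(pairs));
// [{"k":"0","v":999999},{"k":"1","v":999999},{"k":"59","v":1000000}]

console.log(JSON.stringify(arrayToObject(pairs)));
// {"0":999999,"1":999999,"59":1000000}
```

Note that the keys come back as strings, which is also the case for the k fields produced by $objectToArray.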

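db.col.aggregate([
    // Turn the nested object into an array of { k, v } rows,
    // where each v is itself an array of { k, v } cells.
    {
        $project: {
            values: {
                $map: {
                    input: { $objectToArray: "$values" },
                    as: "obj",
                    in: { k: "$$obj.k", v: { $objectToArray: "$$obj.v" } }
                }
            }
        }
    },
    // One document per row, then one document per cell.
    {
        $unwind: "$values"
    },
    {
        $unwind: "$values.v"
    },
    // Flatten each cell into x (row key), y (column key) and value.
    {
        $project: {
            x: "$values.k",
            y: "$values.v.k",
            value: "$values.v.v"
        }
    },
    // Sum each cell (x, y) across all documents.
    {
        $group: {
            _id: { x: "$x", y: "$y" },
            value: { $sum: "$value" }
        }
    },
    // Order cells by column key, then collect them into rows.
    {
        $sort: {
            "_id.y": 1
        }
    },
    {
        $group: {
            _id: "$_id.x",
            v: { $push: { k: "$_id.y", v: "$value" } }
        }
    },
    // Order rows by row key, then collect them into a single document.
    {
        $sort: {
            "_id": 1
        }
    },
    {
        $group: {
            _id: null,
            values: { $push: { k: "$_id", v: "$v" } }
        }
    },
    // Convert the { k, v } arrays back into nested objects.
    {
        $project: {
            values: {
                $arrayToObject: {
                    $map: {
                        input: "$values",
                        as: "obj",
                        in: {
                            k: "$$obj.k",
                            v: { $arrayToObject: "$$obj.v" }
                        }
                    }
                }
            }
        }
    }
])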

Output for your sample data:

{
    "_id" : null,
    "values" : {
            "0" : {
                    "0" : 1999998,
                    "1" : 1999998,
                    "59" : 2000000
            },
            "1" : {
                    "0" : 4000000,
                    "1" : 4000000,
                    "59" : 2000000
            },
            "58" : {
                    "0" : 3200000,
                    "1" : 2400000,
                    "59" : 2200000
            },
            "59" : {
                    "0" : 2600000,
                    "1" : 2800000,
                    "59" : 3000000
            }
    }

}
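
One caveat about ordering, offered as a hedged side note: the keys produced by $objectToArray are strings, so the $sort stages compare them as strings rather than as numbers. The plain-JavaScript sketch below shows the difference on a few sample keys; JavaScript's default sort is used here only as a stand-in for string comparison, and the sample keys are made up.

```javascript
// Plain JavaScript stand-in for string versus numeric key ordering.
const keys = ["0", "1", "2", "10", "59"];

// Default string comparison: "10" sorts before "2".
const asStrings = [...keys].sort();

// Numeric comparison after converting the keys to numbers.
const asNumbers = [...keys].sort((a, b) => Number(a) - Number(b));

console.log(JSON.stringify(asStrings)); // ["0","1","10","2","59"]
console.log(JSON.stringify(asNumbers)); // ["0","1","2","10","59"]
```

If numeric key order matters in the result, one possible option is to convert the keys to numbers before sorting, for example with $toInt, assuming a server version that supports it (MongoDB 4.0 or later).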