Mongo将多个文档汇总为一个

时间:2020-09-26 14:04:19

标签: javascript mongodb aggregate

我正在尝试通过mongodb.collection.aggregate()命令合并一些复杂的文档。

假设我要合并集合文档的x个(在下面的示例中:x = 2):

[
  {
    "_id": 1,
    "Data": {
      "children": {
        "1": {
          "name": "appear_only_in_first_doc",
          "cost": 1,
          "revenue": 4.5,
          "grandchildren": {
            "1t9dsqdqdvoj8pdppxjk": {
              "cost": 0,
              "revenue": 1.5
            }
          }
        },
        "2": {
          "name": "appear_in_both_docs",
          "cost": 2,
          "revenue": 7,
          "grandchildren": {
            "jesrdt5qwef2222dgt": {
              "cost": 1,
              "revenue": 3
            },
            "klh352hk5367kf": {
              "cost": 2,
              "revenue": 7
            }
          }
        }
      }
    }
  },
  {
    "_id": 2,
    "Data": {
      "children": {
        "2": {
          "name": "appear_in_both_docs___but_diff_name",
          "cost": 9,
          "revenue": 7,
          "grandchildren": {
            "aaaaaaaaa": {
              "cost": 3,
              "revenue": 2
            },
            "jesrdt5qwef2222dgt": {
              "cost": 6,
              "revenue": 5
            }
          }
        },
        "3": {
          "name": "appear_only_in_last_doc",
          "cost": 4,
          "revenue": 2,
          "grandchildren": {
            "cccccccccccc": {
              "cost": 4,
              "revenue": 2
            }
          }
        }
      }
    }
  }
]

挑战:

  1. 在编写查询时,“ children”和“ grandchildren”键下的键是动态且未知的。
  2. 如果孩子或孙子仅出现在一个文档中(例如,“ 1”,“ 3”,“ 1t9dsqdqdvoj8pdppxjk”,“ klh352hk5367kf”,“ aaaaaaaaa”和“ cccccccccccc”),则它也应出现在最终结果中。< / li>
  3. 如果一个孩子出现在多个文档中(例如“ 2”和“ jesrdt5qwef2222dgt”),则该孩子应在最终结果中显示为一个。应该将“费用”和“收入”字段加起来,并取最后一个“名称”字段。

我已经看到以下解决方案:

  1. unionWith-不相关,合并2个不同的集合。
  2. merge-不相关,不能对多次出现的字段的值求和(取最后一个)。
  3. mergeObjects-不相关,不能对多次出现的字段的值求和(取最后一个)。

最终结果应如下所示:

{
  "Data": {
    "children": {
      "1": {
        "name": "appear_only_in_first_doc",
        "cost": 1,
        "revenue": 4.5,
        "grandchildren": {
          "1t9dsqdqdvoj8pdppxjk": {
            "cost": 0,
            "revenue": 1.5
          }
        }
      },
      "2": {
        "name": "appear_in_both_docs___but_diff_name",
        "cost": 11,
        "revenue": 14,
        "grandchildren": {
          "aaaaaaaaa": {
            "cost": 3,
            "revenue": 2
          },
          "jesrdt5qwef2222dgt": {
            "cost": 7,
            "revenue": 8
          },
          "klh352hk5367kf": {
            "cost": 2,
            "revenue": 7
          }
        }
      },
      "3": {
        "name": "appear_only_in_last_doc",
        "cost": 4,
        "revenue": 2,
        "grandchildren": {
          "cccccccccccc": {
            "cost": 4,
            "revenue": 2
          }
        }
      }
    }
  }
}

1 个答案:

答案 0 :(得分:1)

这是一个漫长的过程,可能会有一些简单的过程,我只是在分享过程,

  • $projectchildren转换为数组格式(k,v)
  • $unwind解构children数组
  • $group通过子项键,将costrevenue相加,并使用name得到最后一个$last
  • $unwind解构grandchildren数组
  • $addFieldsgrandchildren转换为数组格式(k,v)
  • $unwind解构grandchildren数组
db.collection.aggregate([
  { $project: { "Data.children": { $objectToArray: "$Data.children" } } },
  { $unwind: "$Data.children" },
  {
    $group: {
      _id: "$Data.children.k",
      name: { $last: "$Data.children.v.name" },
      cost: { $sum: "$Data.children.v.cost" },
      revenue: { $sum: "$Data.children.v.revenue" },
      grandchildren: { $push: "$Data.children.v.grandchildren" }
    }
  },
  { $unwind: "$grandchildren" },
  { $addFields: { grandchildren: { $objectToArray: "$grandchildren" } } },
  { $unwind: "$grandchildren" },
  • 通过$group键和childrengrandchildren计算孙子成本和收入之和
  {
    $group: {
      _id: {
        ck: "$_id",
        gck: "$grandchildren.k"
      },
      cost: { $first: "$cost" },
      revenue: { $first: "$revenue" },
      name: { $first: "$name" },
      grandchildren_cost: { $sum: "$grandchildren.v.cost" },
      grandchildren_revenue: { $sum: "$grandchildren.v.revenue" }
    }
  },
  • 通过$groupchildren并重新构建grandchildren数组
  {
    $group: {
      _id: "$_id.ck",
      cost: { $first: "$cost" },
      revenue: { $first: "$revenue" },
      name: { $last: "$name" },
      grandchildren: {
        $push: {
          k: "$_id.gck",
          v: {
            cost: "$grandchildren_cost",
            revenue: "$grandchildren_revenue"
          }
        }
      }
    }
  },
  • $groupnull并重建子数组,并使用grandchildren$arrayToObject转换为(k,v)数组中的对象
  {
    $group: {
      _id: null,
      children: {
        $push: {
          k: "$_id",
          v: {
            name: "$name",
            cost: "$cost",
            revenue: "$revenue",
            grandchildren: { $arrayToObject: "$grandchildren" }
          }
        }
      }
    }
  },
  • $project使用children$arrayToObject转换为对象
  {
    $project: {
      _id: 0,
      "Data.children": { $arrayToObject: "$children" }
    }
  }
])

Playground