计算嵌套数组中字段的总和,计数和平均值-MongoDB

时间:2020-02-26 20:48:17

标签: mongodb mongodb-query aggregation-framework

我需要从有条件的集合中获取文档:

last_updated -gte ISODate("2020-02-26T22:1o:55.364Z")

输入集合名称:强度日志

示例文档:

[
  {
    junction_id:"J1",
    intensities: [
      {
        lane_id: "L1",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 0.9,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      },
      {
        lane_id: "L2",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 2.1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 1.1,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      }
    ]
  },
  {
    junction_id:"J2",
    intensities: [
      {
        lane_id: "L1",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 0.9,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      },
      {
        lane_id: "L2",
        data: [
          {
            intensity: 1,
            last_updated: ISODate("2020-02-26T22:15:55.364Z")
          },
          {
            intensity: 2.1,
            last_updated: ISODate("2020-02-26T22:10:55.364Z")
          },
          {
            intensity: 1.1,
            last_updated: ISODate("2020-02-26T22:05:55.364Z")
          }
        ]
      }
    ]
  }
]

预期输出:

[
    {
        junction_id: "J1",
        data: [
            {
                lane_id: "L1",
                sum: 2,
                count: 2,
                avg: 1
            },
            {
                lane_id: "L2",
                sum: 2,
                count: 2,
                avg: 1
            }
        ]
    },
    {
        junction_id: "J2",
        data: [
            {
                lane_id: "L1",
                sum: 2,
                count: 2,
                avg: 1
            },
            {
                lane_id: "L2",
                sum: 2,
                count: 2,
                avg: 1
            }
        ]
    }
]

1 个答案:

答案 0 :(得分:1)

您可以尝试以下查询:

db.intensity_log.aggregate([
    /** match only docs where there is last_updated > given time, which reduces data size */
    { $match: { 'intensities.data.last_updated': { $gte: ISODate("2020-02-26T22:10:55.364Z") } } },
    /** unwinding array to access objects in it */
    { $unwind: '$intensities' },
    /** filtering objects in data array which matches required criteria */
    { $addFields: { 'intensities.data': { $filter: { input: '$intensities.data', cond: { $gte: ['$$this.last_updated', ISODate("2020-02-26T22:10:55.364Z")] } } } } },
    /** adding required fields into an object named data */
    {
        $addFields: {
            'data.count': { $size: '$intensities.data' },
            'data.sum': {
                $reduce: {
                    input: '$intensities.data',
                    initialValue: 0,
                    in: {
                        $add: ["$$value", "$$this.intensity"]
                    }
                }
            }
        }
    },
    /** adding avg field & extracting lane_id from intensities to data */
    { $addFields: { 'data.avg': { $divide: ["$data.sum", '$data.count'] }, 'data.lane_id': '$intensities.lane_id' } },
    /** Grouping on junction_id & pushing data field created on above stages */
    { $group: { _id: '$junction_id', data: { $push: '$data' } } },
    /** converting _id field name to junction_id & removing _id field from output */
    { $project: { _id: 0, junction_id: '$_id', data: 1 } }
])

注意:您可以通过对数组字段进行两次展开来完成相同的操作,但是它可能会爆炸集合文档,并且可能会影响庞大的数据集,因此这样做会更好,因为此查询将在每个阶段之后收集的文档数量相同,甚至更少。

测试: MongoDB-Playground