Question

My documents look like this:

[
  {
    "id": "5c4a208bbddbb7c2ae231584",
    "sections": [
      {
        "id": "5c4a208bbddbb7c2ae231576",
        "subsections": [
          {
            "isReviewed": true,
            "flags": {
              "types": [
                "UNORIGINAL",
                "SIMILAR"
              ],
              "predicted": [
                "UNORIGINAL",
                "SIMILAR"
              ]
            }
          }
        ]
      }
    ]
  },
  {
    "id": "5c4a208bbddbb7c2ae231585",
    "sections": [
      {
        "id": "5c4a208bbddbb7c2ae231580",
        "subsections": [
          {
            "isReviewed": false,
            "flags": {
              "types": [
                "SIMILAR"
              ],
              "predicted": []
            }
          }
        ]
      }
    ]
  }
]

For these documents, I want to compute a status for each document from the combination of its subsections and their isReviewed fields.

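A document is:

STATUS_REVIEWED if all of its subsections that have items in predicted have isReviewed = true
STATUS_PARTIAL if only some of its subsections that have items in predicted have isReviewed = true
STATUS_UNREVIEWED if all of its subsections that have items in predicted have isReviewed = false
STATUS_IRRELEVANT if none of its subsections have items in predicted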
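
To make the four rules concrete, here is a minimal sketch of them in plain JavaScript. The helper name computeReviewStatus is hypothetical (it does not appear in my code), and it assumes every subsection has a flags.predicted array, as in the sample data.

```javascript
// Hypothetical helper: applies the four status rules to one document.
// Assumes doc.sections[].subsections[].flags.predicted is always an array.
function computeReviewStatus(doc) {
  // Flatten sections[].subsections[] and keep only subsections with predictions.
  const relevant = doc.sections
    .flatMap((section) => section.subsections)
    .filter((sub) => sub.flags.predicted.length > 0);

  // No subsection has anything in predicted.
  if (relevant.length === 0) return 'STATUS_IRRELEVANT';

  const reviewedCount = relevant.filter((sub) => sub.isReviewed).length;
  if (reviewedCount === relevant.length) return 'STATUS_REVIEWED';
  if (reviewedCount === 0) return 'STATUS_UNREVIEWED';
  return 'STATUS_PARTIAL';
}

// A document with one reviewed and one unreviewed subsection, both with predictions.
const mixedDoc = {
  sections: [
    {
      subsections: [
        { isReviewed: true, flags: { predicted: ['SIMILAR'] } },
        { isReviewed: false, flags: { predicted: ['UNORIGINAL'] } }
      ]
    }
  ]
};
console.log(computeReviewStatus(mixedDoc)); // STATUS_PARTIAL
```

This is only a reference for the expected behavior; the goal is still to compute the same thing inside the aggregation.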

The correct result for this data would look like this:

Desired result

[
  {
    "id": "5c4a208bbddbb7c2ae231584",
    // All of its subsections are reviewed and have items in predicted
    "reviewStatus": ["STATUS_REVIEWED"]
  },
  {
    "id": "5c4a208bbddbb7c2ae231585",
    // None of its subsections have any items in predicted
    "reviewStatus": ["STATUS_IRRELEVANT"]
  }
]

I currently have a working aggregation query, but it needs some post-processing in JavaScript to make up for the documents it does not return.

  Item.aggregate()
    .match({ _id: { $in: itemIds } })
    .project({ a: '$sections.subsections' })
    .unwind('$a')
    .unwind('$a')
    .project({
      isReviewed: '$a.isReviewed',
      hasPredictions: { $gt: [{ $size: '$a.flags.predicted' }, 0] }
    })
    .match({ hasPredictions: true })
    .group({ _id: '$_id', reviewStatuses: { $addToSet: '$isReviewed' } })
    .group({
      _id: '$_id',
      reviewStatus: {
        $push: {
          $switch: {
            branches: [
              {
                case: {
                  $and: [
                    { $in: [false, '$reviewStatuses'] },
                    { $in: [true, '$reviewStatuses'] },
                  ]
                },
                then: 'STATUS_PARTIAL'
              },
              {
                case: {
                  $eq: [[true], '$reviewStatuses']
                },
                then: 'STATUS_REVIEWED'
              },
              {
                case: {
                  $eq: [[false], '$reviewStatuses']
                },
                then: 'STATUS_UNREVIEWED'
              }
            ],
            default: 'STATUS_UNKNOWN'
          }
        }
      }
    })
    // Post processing
    .then((items) => {
      const itemsById = _.keyBy(items, '_id');
      return itemIds.map((itemId) => {
        const status = itemsById[itemId];
        // Fall back to STATUS_IRRELEVANT if the item is not in the result
        if (status === undefined) return 'STATUS_IRRELEVANT';
        return status.reviewStatus[0];
      });
    })

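This query works well, but if a document has no subsections with items in predicted, the match stage drops that document from the results. I then have to post-process the output to check whether each document is present (and add "STATUS_IRRELEVANT" if it is not).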

What I get

[
  {
    "id": "5c4a208bbddbb7c2ae231584",
    // All of its subsections are reviewed and have items in predicted
    "reviewStatus": ["STATUS_REVIEWED"]
  }
  // Doc "5c4a208bbddbb7c2ae231585" is missing here because of the match
]

How can I rewrite this query so that it produces the desired result?
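
One direction I am considering, as an untested sketch: instead of unwinding and then matching on hasPredictions (which removes whole documents), flatten and filter the subsections inside each document with $reduce, $filter and $map, so every matched document survives to the final stage. The function name buildStatusPipeline is hypothetical, the pipeline assumes sections, subsections and flags.predicted are always present arrays, and it returns reviewStatus as a plain string rather than a one-element array.

```javascript
// Hypothetical builder for a pipeline that keeps every matched document.
// Assumes sections[].subsections[].flags.predicted always exists as an array.
function buildStatusPipeline(itemIds) {
  // sections[].subsections[] flattened into a single array per document.
  const allSubsections = {
    $reduce: {
      input: '$sections',
      initialValue: [],
      in: { $concatArrays: ['$$value', '$$this.subsections'] }
    }
  };
  // Only the subsections that have at least one item in predicted.
  const withPredictions = {
    $filter: {
      input: allSubsections,
      as: 'sub',
      cond: { $gt: [{ $size: '$$sub.flags.predicted' }, 0] }
    }
  };
  return [
    { $match: { _id: { $in: itemIds } } },
    {
      $project: {
        // isReviewed values of the relevant subsections; may be an empty array.
        reviewStatuses: {
          $map: { input: withPredictions, as: 'sub', in: '$$sub.isReviewed' }
        }
      }
    },
    {
      $project: {
        reviewStatus: {
          $switch: {
            branches: [
              { case: { $eq: [{ $size: '$reviewStatuses' }, 0] }, then: 'STATUS_IRRELEVANT' },
              {
                case: {
                  $and: [
                    { $in: [true, '$reviewStatuses'] },
                    { $in: [false, '$reviewStatuses'] }
                  ]
                },
                then: 'STATUS_PARTIAL'
              },
              { case: { $in: [true, '$reviewStatuses'] }, then: 'STATUS_REVIEWED' }
            ],
            // Non-empty and contains no true value.
            default: 'STATUS_UNREVIEWED'
          }
        }
      }
    }
  ];
}
```

With Mongoose this would presumably be run as Item.aggregate(buildStatusPipeline(itemIds)), but I have not verified it against real data, so I would appreciate corrections or a more idiomatic approach.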

Mongo filtered aggregation with deeply nested arrays

0 answers: