My documents look like this:
[
  {
    "id": "5c4a208bbddbb7c2ae231584",
    "sections": [
      {
        "id": "5c4a208bbddbb7c2ae231576",
        "subsections": [
          {
            "isReviewed": true,
            "flags": {
              "types": [
                "UNORIGINAL",
                "SIMILAR"
              ],
              "predicted": [
                "UNORIGINAL",
                "SIMILAR"
              ]
            }
          }
        ]
      }
    ]
  },
  {
    "id": "5c4a208bbddbb7c2ae231585",
    "sections": [
      {
        "id": "5c4a208bbddbb7c2ae231580",
        "subsections": [
          {
            "isReviewed": false,
            "flags": {
              "types": [
                "SIMILAR"
              ],
              "predicted": []
            }
          }
        ]
      }
    ]
  }
]
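For these documents, I would like to compute a status for each document from the isReviewed field of its subsections, combined with whether each subsection has items in flags.predicted.
A document is:
STATUS_REVIEWED, if all subsections that have items in predicted have isReviewed = true
STATUS_PARTIAL, if some, but not all, subsections that have items in predicted have isReviewed = true
STATUS_UNREVIEWED, if all subsections that have items in predicted have isReviewed = false
STATUS_IRRELEVANT, if none of its subsections have items in predicted
The correct result for this data looks like this: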
Desired result
[
  {
    "id": "5c4a208bbddbb7c2ae231584",
    // All of its subsections with items in predicted are reviewed
    "reviewStatus": ["STATUS_REVIEWED"]
  },
  {
    "id": "5c4a208bbddbb7c2ae231585",
    // None of its subsections have any items in predicted
    "reviewStatus": ["STATUS_IRRELEVANT"]
  }
]
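The status rules above can be written down as a plain JavaScript function, which is handy as a reference when checking what an aggregation returns. This is only a sketch: computeReviewStatus is a hypothetical helper name, not part of my code base, and it assumes every subsection carries a flags.predicted array (a missing one is treated as empty).

```javascript
// Hypothetical helper: derives the review status of one document in plain JavaScript.
function computeReviewStatus(doc) {
  // Flatten sections[].subsections[] into a single list.
  const subsections = (doc.sections || []).flatMap((s) => s.subsections || []);
  // Only subsections with at least one predicted flag take part in the decision.
  const relevant = subsections.filter(
    (sub) => ((sub.flags && sub.flags.predicted) || []).length > 0
  );
  if (relevant.length === 0) return 'STATUS_IRRELEVANT';
  const reviewedCount = relevant.filter((sub) => sub.isReviewed === true).length;
  if (reviewedCount === relevant.length) return 'STATUS_REVIEWED';
  if (reviewedCount === 0) return 'STATUS_UNREVIEWED';
  return 'STATUS_PARTIAL';
}

// The two sample documents from above, reduced to the fields the rules use.
const sampleDocs = [
  {
    id: '5c4a208bbddbb7c2ae231584',
    sections: [{ subsections: [
      { isReviewed: true, flags: { predicted: ['UNORIGINAL', 'SIMILAR'] } }
    ] }]
  },
  {
    id: '5c4a208bbddbb7c2ae231585',
    sections: [{ subsections: [
      { isReviewed: false, flags: { predicted: [] } }
    ] }]
  }
];

const expected = sampleDocs.map((doc) => ({
  id: doc.id,
  reviewStatus: [computeReviewStatus(doc)]
}));
console.log(JSON.stringify(expected, null, 2));
```

For the sample data this yields STATUS_REVIEWED for the first document and STATUS_IRRELEVANT for the second, matching the desired result.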
I currently have a working aggregation query, but it needs some post-processing in JavaScript to make up for the documents that are not returned.
Item.aggregate()
  .match({ _id: { $in: itemIds } })
  // sections[].subsections[] is an array of arrays, so unwind twice
  .project({ a: '$sections.subsections' })
  .unwind('$a')
  .unwind('$a')
  .project({
    isReviewed: '$a.isReviewed',
    hasPredictions: { $gt: [{ $size: '$a.flags.predicted' }, 0] }
  })
  // Keep only subsections that have predictions.
  // Documents without any such subsection disappear here.
  .match({ hasPredictions: true })
  // Collect the distinct isReviewed values per document
  .group({ _id: '$_id', reviewStatuses: { $addToSet: '$isReviewed' } })
  .group({
    _id: '$_id',
    reviewStatus: {
      $push: {
        $switch: {
          branches: [
            {
              case: {
                $and: [
                  { $in: [false, '$reviewStatuses'] },
                  { $in: [true, '$reviewStatuses'] }
                ]
              },
              then: 'STATUS_PARTIAL'
            },
            {
              case: { $eq: [[true], '$reviewStatuses'] },
              then: 'STATUS_REVIEWED'
            },
            {
              case: { $eq: [[false], '$reviewStatuses'] },
              then: 'STATUS_UNREVIEWED'
            }
          ],
          default: 'STATUS_UNKNOWN'
        }
      }
    }
  })
  // Post-processing
  .then((items) => {
    const itemsById = _.keyBy(items, '_id');
    return itemIds.map((itemId) => {
      const status = itemsById[itemId];
      // Fall back to STATUS_IRRELEVANT if the item is not in the result
      if (status === undefined) return 'STATUS_IRRELEVANT';
      return status.reviewStatus[0];
    });
  })
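This query works well, but if a document has no subsections with items in predicted, the second match stage drops that document from the results, and I have to post-process afterwards to check whether each document is present in the result (and add "STATUS_IRRELEVANT" if it is not).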
What I get
[
  {
    "id": "5c4a208bbddbb7c2ae231584",
    // All of its subsections with items in predicted are reviewed
    "reviewStatus": ["STATUS_REVIEWED"]
  }
  // Doc "5c4a208bbddbb7c2ae231585" is missing here because of the match
]
How can I rewrite this query so that it produces the desired result?
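One possible direction, offered as an untested sketch rather than a confirmed answer: avoid the unwind and the match on hasPredictions altogether, and compute the status per document with array expressions, so a document with no relevant subsections falls through to the default branch instead of being dropped. The sketch assumes MongoDB 3.4 or later (for $reduce, $in and $switch) and that every subsection has a flags.predicted array, as the existing $size call already requires. buildReviewStatusPipeline is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the pipeline stages as a plain array.
function buildReviewStatusPipeline(itemIds) {
  return [
    { $match: { _id: { $in: itemIds } } },
    {
      // Flatten sections[].subsections[] into one array per document.
      $project: {
        subsections: {
          $reduce: {
            input: '$sections',
            initialValue: [],
            in: { $concatArrays: ['$$value', '$$this.subsections'] }
          }
        }
      }
    },
    {
      // Keep the isReviewed values of subsections that have predictions.
      // The document itself is kept even when this array ends up empty.
      $project: {
        reviewStatuses: {
          $map: {
            input: {
              $filter: {
                input: '$subsections',
                as: 'sub',
                cond: { $gt: [{ $size: '$$sub.flags.predicted' }, 0] }
              }
            },
            as: 'sub',
            in: '$$sub.isReviewed'
          }
        }
      }
    },
    {
      // Wrap the status in a one-element array to match the desired shape.
      $project: {
        reviewStatus: [
          {
            $switch: {
              branches: [
                {
                  case: {
                    $and: [
                      { $in: [true, '$reviewStatuses'] },
                      { $in: [false, '$reviewStatuses'] }
                    ]
                  },
                  then: 'STATUS_PARTIAL'
                },
                { case: { $in: [true, '$reviewStatuses'] }, then: 'STATUS_REVIEWED' },
                { case: { $in: [false, '$reviewStatuses'] }, then: 'STATUS_UNREVIEWED' }
              ],
              // An empty array means no subsection had predictions.
              default: 'STATUS_IRRELEVANT'
            }
          }
        ]
      }
    }
  ];
}

const pipeline = buildReviewStatusPipeline(['5c4a208bbddbb7c2ae231584']);
console.log(pipeline.length);
```

With Mongoose, the array would be passed as Item.aggregate(buildReviewStatusPipeline(itemIds)). Since nothing is filtered out after the initial match, each matched document should come back with exactly one status, and the JavaScript post-processing step would no longer be needed.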