Say I have a small dataset:
[
{"A": 0, "B": 0, "X": 100, "Y": 100},
{"A": 1, "B": 0, "X": 50, "Y": 55},
{"A": 0, "B": 1, "X": 25, "Y": 30},
{"A": 1, "B": 1, "X": 1, "Y": 6}
]
I also have a pipeline whose last stage is a $group:
[
{
"$group": {
"_id": {
"classification1": {
"$eq": ["$A", 1]
},
"classification2": {
"$eq": ["$B", 1]
}
},
"countX": {"$sum": "$X"},
"countY": {"$sum": "$Y"}
}
}
]
The output of this pipeline:
[
{"_id": {"classification1": false, "classification2": false}, "countX": 100, "countY": 100},
{"_id": {"classification1": true, "classification2": false}, "countX": 50, "countY": 55},
{"_id": {"classification1": false, "classification2": true}, "countX": 25, "countY": 30},
{"_id": {"classification1": true, "classification2": true}, "countX": 1, "countY": 6}
]
Which pipeline stages do I need to add to get to this melted format?
[
{"name": "classification1", "countX": 51, "countY": 61},
{"name": "classification2", "countX": 26, "countY": 36}
]
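Note that this transformation counts document 1 from the previous stage zero times and document 4 twice (because both conditions are false for the former and both are true for the latter).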
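To make the counting rule concrete, here is a minimal plain-Python sketch of the melt applied to the $group output shown above. It runs outside MongoDB and is only meant to pin down the expected numbers; the helper name `melt` and its parameters are hypothetical and not part of any MongoDB API.

```python
# Output of the $group stage, copied from the question.
grouped = [
    {"_id": {"classification1": False, "classification2": False}, "countX": 100, "countY": 100},
    {"_id": {"classification1": True, "classification2": False}, "countX": 50, "countY": 55},
    {"_id": {"classification1": False, "classification2": True}, "countX": 25, "countY": 30},
    {"_id": {"classification1": True, "classification2": True}, "countX": 1, "countY": 6},
]


def melt(docs, names, fields=("countX", "countY")):
    """Hypothetical helper: build one output row per classification name.

    A grouped document contributes to a row only when its flag for that
    classification is true, so a document with both flags true is counted
    in both rows and a document with no flag set is counted in none.
    """
    rows = []
    for name in names:
        row = {"name": name}
        for field in fields:
            row[field] = sum(d[field] for d in docs if d["_id"][name])
        rows.append(row)
    return rows


melted = melt(grouped, ["classification1", "classification2"])
print(melted)
```

For classification1 the sums come from documents 2 and 4 (50 + 1 and 55 + 6), and for classification2 from documents 3 and 4 (25 + 1 and 30 + 6), which matches the target output.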
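I wrote a JavaScript function for this, but I can't call a JavaScript function from within the pipeline (the aggregation pipeline must be serializable). Unfortunately, this means I have to pull the data out of the database, run the script on it, and then load the transformed data back in as a temporary collection so that the rest of the pipeline can run after this stage.
Any help would be much appreciated.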
Answer 0: (score: 0)
I did some reading on $facet. It is a bit verbose, but this query produces the melted data in the correct format:
[
{
"$group": {
"_id": {
"classification1": {
"$eq": ["$A", 1]
},
"classification2": {
"$eq": ["$B", 1]
}
},
"countX": {"$sum": "$X"},
"countY": {"$sum": "$Y"}
}
},
{
"$facet": {
"classification1": [
{"$match": {"_id.classification1": true}},
{"$group": {"_id": null, "countX": {"$sum": "$countX"}, "countY": {"$sum": "$countY"}}},
{"$addFields": {"name": "classification1"}}
],
"classification2": [
{"$match": {"_id.classification2": true}},
{"$group": {"_id": null, "countX": {"$sum": "$countX"}, "countY": {"$sum": "$countY"}}},
{"$addFields": {"name": "classification2"}}
]
}
},
{
"$project": {"combine": {"$concatArrays": ["$classification1", "$classification2"]}}
},
{
"$unwind": "$combine"
},
{
"$replaceRoot": {"newRoot": "$combine"}
},
{
"$project": {"_id": 0}
}
]
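Since the verbosity grows with every extra classification, the stages after the first $group could be generated rather than written by hand. The following is a minimal sketch in Python, assuming the pipeline is built client-side as plain dicts and lists (the shape a driver such as PyMongo accepts). The function name `build_melt_stages` and its parameters are hypothetical. The sketch names the summed fields countX and countY so the output matches the format asked for in the question, and it uses $concatArrays rather than $setUnion so that the rows come back in the order of the names; both choices are assumptions of this sketch.

```python
import json


def build_melt_stages(names, fields=("countX", "countY")):
    """Hypothetical helper: return the stages that follow the first $group.

    One $facet sub-pipeline is generated per classification name; each
    keeps the grouped documents whose flag is true and sums the fields.
    """
    facets = {}
    for name in names:
        sums = {field: {"$sum": f"${field}"} for field in fields}
        facets[name] = [
            {"$match": {f"_id.{name}": True}},
            {"$group": {"_id": None, **sums}},
            {"$addFields": {"name": name}},
        ]
    return [
        {"$facet": facets},
        # Concatenate the per-classification result arrays in a fixed order.
        {"$project": {"combine": {"$concatArrays": [f"${n}" for n in names]}}},
        {"$unwind": "$combine"},
        {"$replaceRoot": {"newRoot": "$combine"}},
        {"$project": {"_id": 0}},
    ]


stages = build_melt_stages(["classification1", "classification2"])
# The result is plain data, so it serializes to JSON like any other pipeline.
print(json.dumps(stages, indent=2))
```

The generated list contains only dicts, lists, strings, booleans and None, so it stays serializable. One caveat: a classification whose $match finds no documents yields an empty array in its facet, so no row is emitted for that name.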