Suppose I have a set of objects that share the same description but have slightly different amounts:
[
  {
    "_id": "101",
    "description": "DD from my employer1",
    "amount": 1000.33
  },
  {
    "_id": "102",
    "description": "DD from my employer1",
    "amount": 1000.34
  },
  {
    "_id": "103",
    "description": "DD from my employer1",
    "amount": 999.35
  },
  {
    "_id": "104",
    "description": "DD from my employer1",
    "amount": 5000.00
  },
  {
    "_id": "105",
    "description": "DD from my employer2",
    "amount": 2000.01
  },
  {
    "_id": "106",
    "description": "DD from my employer2",
    "amount": 1999.33
  },
  {
    "_id": "107",
    "description": "DD from my employer2",
    "amount": 1999.33
  }
]
I can group them using the following pipeline:
[
  {
    "$group": {
      "_id": {
        "$subtract": [
          { "$trunc": "$amount" },
          { "$mod": [ { "$trunc": "$amount" }, 10 ] }
        ]
      },
      "results": { "$push": "$_id" }
    }
  },
  {
    "$redact": {
      "$cond": {
        "if": { "$gt": [ { "$size": "$results" }, 1 ] },
        "then": "$$KEEP",
        "else": "$$PRUNE"
      }
    }
  },
  { "$unwind": "$results" },
  {
    "$group": {
      "_id": null,
      "results": { "$push": "$results" }
    }
  }
]
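For reference, the grouping key above rounds each truncated amount down to the nearest multiple of ten. The following is only a plain JavaScript sketch of that arithmetic on the sample amounts, not something MongoDB runs; `bucketKey` and `sampleAmounts` are names made up for illustration. It shows which _ids survive the "more than one member" check.

```javascript
// Mirrors the $group key: trunc(amount) - (trunc(amount) % 10).
// bucketKey is a hypothetical helper, not a MongoDB function.
function bucketKey(amount) {
  const whole = Math.trunc(amount);
  return whole - (whole % 10);
}

// [_id, amount] pairs copied from the sample documents.
const sampleAmounts = [
  ["101", 1000.33], ["102", 1000.34], ["103", 999.35], ["104", 5000.00],
  ["105", 2000.01], ["106", 1999.33], ["107", 1999.33],
];

// Collect the _ids that fall under each key, as the first $group does.
const buckets = new Map();
for (const [id, amount] of sampleAmounts) {
  const key = bucketKey(amount);
  if (!buckets.has(key)) buckets.set(key, []);
  buckets.get(key).push(id);
}

// Keep only buckets with more than one member, as the $redact stage does.
const kept = [...buckets.values()].filter((ids) => ids.length > 1).flat();

console.log(kept); // [ '101', '102', '106', '107' ]
```

So 103 (key 990) and 105 (key 2000) each land alone in a bucket and are pruned together with 104.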
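Is there a way to include all of the amounts in each group (_ids 101, 102, and 103, plus 105, 106, and 107) even though they differ slightly, while excluding the bonus amount, which is _id 104 in the sample above?
I am looking for a simple array output containing only the _ids.
The result I am after is: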
{ "result": [ "101", "102", "103", "105", "106", "107" ] }
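Answer 0 (score: 0)
I think this is somewhat subjective when it comes to real data, but if what you want to catch is simply a significant "positive" difference from the "average" payment, then this is the most applicable algorithm: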
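db.collection.aggregate([
  // Group by description, keeping the average amount plus each document's _id and amount
  { "$group": {
    "_id": "$description",
    "avg": { "$avg": "$amount" },
    "docs": { "$push": { "_id": "$_id", "amount": "$amount" } }
  }},
  // Keep only the documents where (amount - avg) is less than avg
  { "$addFields": {
    "docs": {
      "$filter": {
        "input": "$docs",
        "as": "doc",
        "cond": {
          "$gt": [ "$avg", { "$subtract": [ "$$doc.amount", "$avg" ] } ]
        }
      }
    }
  }},
  // Flatten the surviving documents and collect their _id values into one array
  { "$unwind": "$docs" },
  { "$group": {
    "_id": null,
    "results": { "$push": "$docs._id" }
  }}
])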
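With the data you provided, this excludes "104" because the difference between its amount and the average for "employer1" is greater than the average itself. That is the case of a large "upward" variation.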
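To make the arithmetic concrete, here is a rough plain JavaScript sketch that applies the same condition (`avg > amount - avg`) to the sample documents outside MongoDB. It is only an illustration of the comparison, and the variable names are mine rather than anything from the pipeline.

```javascript
// Sample documents copied from the question.
const docs = [
  { _id: "101", description: "DD from my employer1", amount: 1000.33 },
  { _id: "102", description: "DD from my employer1", amount: 1000.34 },
  { _id: "103", description: "DD from my employer1", amount: 999.35 },
  { _id: "104", description: "DD from my employer1", amount: 5000.0 },
  { _id: "105", description: "DD from my employer2", amount: 2000.01 },
  { _id: "106", description: "DD from my employer2", amount: 1999.33 },
  { _id: "107", description: "DD from my employer2", amount: 1999.33 },
];

// Equivalent of the first $group: collect documents per description.
const groups = new Map();
for (const doc of docs) {
  if (!groups.has(doc.description)) groups.set(doc.description, []);
  groups.get(doc.description).push(doc);
}

const results = [];
for (const members of groups.values()) {
  // Equivalent of $avg over the group's amounts.
  const avg = members.reduce((sum, d) => sum + d.amount, 0) / members.length;
  // Equivalent of the $filter condition: keep when avg > (amount - avg).
  for (const d of members) {
    if (avg > d.amount - avg) results.push(d._id);
  }
}

console.log(results); // [ '101', '102', '103', '105', '106', '107' ]
```

For "employer1" the average is about 2000.005, so 104 gives 5000 - 2000.005 = 2999.995, which exceeds the average and fails the test; every other document yields a difference that is negative or close to zero.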
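As with all "grouping" approaches that rely on building arrays inside the grouped document, you need to be careful in real-world scenarios not to exceed the 16 MB BSON document size limit.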
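One way to sidestep that concern, sketched here as an assumption rather than as a tested MongoDB recipe, is to avoid accumulating documents at all: a first pass computes only the per-description average, and a second pass checks each document against it. The plain JavaScript below shows that two-pass shape over an in-memory array; `idsWithinThreshold` is a hypothetical name.

```javascript
// Two-pass variant: only a running sum and count are kept per description,
// so no per-group array of documents is ever built.
// idsWithinThreshold is a hypothetical helper for illustration only.
function idsWithinThreshold(docs) {
  // Pass 1: accumulate sum and count per description.
  const stats = new Map();
  for (const { description, amount } of docs) {
    const s = stats.get(description) || { sum: 0, count: 0 };
    s.sum += amount;
    s.count += 1;
    stats.set(description, s);
  }

  // Pass 2: apply the same condition as the $filter stage to each document.
  const ids = [];
  for (const { _id, description, amount } of docs) {
    const { sum, count } = stats.get(description);
    const avg = sum / count;
    if (avg > amount - avg) ids.push(_id);
  }
  return ids;
}

// The "employer1" documents from the question.
const employer1 = [
  { _id: "101", description: "DD from my employer1", amount: 1000.33 },
  { _id: "102", description: "DD from my employer1", amount: 1000.34 },
  { _id: "103", description: "DD from my employer1", amount: 999.35 },
  { _id: "104", description: "DD from my employer1", amount: 5000.0 },
];

console.log(idsWithinThreshold(employer1)); // [ '101', '102', '103' ]
```

In MongoDB terms this might correspond to a `$group` that computes only `$avg`, followed by a separate query per description, though whether the extra round trips are acceptable depends on the workload.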