Question

Suppose I have a set of objects that share the same description but have different amounts:

[
    {
        "_id": "101",
        "description": "DD from my employer1",
        "amount": 1000.33
    },
    {
        "_id": "102",
        "description": "DD from my employer1",
        "amount": 1000.34
    },
    {
        "_id": "103",
        "description": "DD from my employer1",
        "amount": 999.35
    },
    {
        "_id": "104",
        "description": "DD from my employer1",
        "amount": 5000.00
    },
    {
        "_id": "105",
        "description": "DD from my employer2",
        "amount": 2000.01
    },
    {
        "_id": "106",
        "description": "DD from my employer2",
        "amount": 1999.33
    },
    {
        "_id": "107",
        "description": "DD from my employer2",
        "amount": 1999.33
    }
]

I can group them with the following pipeline:

[
    {
        "$group": {
            "_id": {
                "$subtract": [
                    { "$trunc": "$amount" },
                    { "$mod": [ { "$trunc": "$amount" }, 10 ] }
                ]
            },
            "results": {
                "$push": "$_id"
            }
        }
    },
    {
        "$redact": {
            "$cond": {
                "if": { "$gt": [ { "$size": "$results" }, 1 ] },
                "then": "$$KEEP",
                "else": "$$PRUNE"
            }
        }
    },
    {
        "$unwind": "$results"
    },
    {
        "$group": {
            "_id": null,
            "results": {
                "$push": "$results"
            }
        }
    }
]
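
As a rough illustration of what the first $group stage keys on, the same bucket arithmetic can be reproduced in plain JavaScript against the sample documents. This is only a sketch and needs no MongoDB: `sample`, `bucketKey`, `buckets` and `kept` are names made up here, and `Math.trunc` and `%` stand in for `$trunc` and `$mod`.

```javascript
// Sample documents from the question (description omitted, it is not used by the grouping key).
const sample = [
  { _id: "101", amount: 1000.33 },
  { _id: "102", amount: 1000.34 },
  { _id: "103", amount: 999.35 },
  { _id: "104", amount: 5000.0 },
  { _id: "105", amount: 2000.01 },
  { _id: "106", amount: 1999.33 },
  { _id: "107", amount: 1999.33 },
];

// Same arithmetic as the $group key: trunc(amount) - (trunc(amount) mod 10).
const bucketKey = (amount) => Math.trunc(amount) - (Math.trunc(amount) % 10);

// Collect the _ids per bucket, like "$push": "$_id".
const buckets = new Map();
for (const doc of sample) {
  const key = bucketKey(doc.amount);
  if (!buckets.has(key)) buckets.set(key, []);
  buckets.get(key).push(doc._id);
}

// Keep only buckets with more than one member, like the $redact stage, then flatten.
const kept = [...buckets.values()].filter((ids) => ids.length > 1).flat();

console.log(Object.fromEntries(buckets));
console.log(kept);
```

With this data, 999.35 lands in the 990 bucket and 2000.01 in the 2000 bucket, so "103" and "105" each end up alone and are pruned together with "104".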

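Is there a way to include all the amounts in a group (_ids 101, 102 and 103, plus 105, 106 and 107) even when they differ slightly, while excluding the bonus amount, which in the sample above is _id 104?

I am looking for a simple array output containing only the _ids.

The result I am looking for: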

{ "result": [ "101", "102", "103", "105", "106", "107" ] }

Answer 1

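I think this is somewhat subjective for real data, but if all you need is to drop a significant "positive" difference from the "average" payment, then this is the most applicable algorithm: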

db.collection.aggregate([
  { "$group": {
    "_id": "$description",
    "avg": { "$avg": "$amount" },
    "docs": { "$push": { "_id": "$_id", "amount": "$amount" } }
  }},
  { "$addFields": {
    "docs": {
      "$filter": {
        "input": "$docs",
        "as": "doc",
        "cond": {
          "$gt": [ "$avg", { "$subtract": [ "$$doc.amount", "$avg" ] } ]
        }
      }
    }
  }},
  { "$unwind": "$docs" },
  { "$group": {
    "_id": null,
    "results": { "$push": "$docs._id" }
  }}
])

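With the data you supplied, this excludes the "104" amount, because the difference between that amount and the average for "employer1" is greater than the average itself. That is the case of a large "upward" variance.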
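
To make the arithmetic behind that condition easier to check, here is a minimal sketch that replays the $group and $filter steps in plain JavaScript on the sample documents. It does not talk to MongoDB; `docs`, `groups` and `results` are illustrative names, and the loop is only an approximation of what the pipeline computes.

```javascript
// Sample documents from the question.
const docs = [
  { _id: "101", description: "DD from my employer1", amount: 1000.33 },
  { _id: "102", description: "DD from my employer1", amount: 1000.34 },
  { _id: "103", description: "DD from my employer1", amount: 999.35 },
  { _id: "104", description: "DD from my employer1", amount: 5000.0 },
  { _id: "105", description: "DD from my employer2", amount: 2000.01 },
  { _id: "106", description: "DD from my employer2", amount: 1999.33 },
  { _id: "107", description: "DD from my employer2", amount: 1999.33 },
];

// Group by description, mirroring the first $group stage.
const groups = new Map();
for (const doc of docs) {
  if (!groups.has(doc.description)) groups.set(doc.description, []);
  groups.get(doc.description).push(doc);
}

// Keep a document when avg > (amount - avg), the same test as the $filter "cond".
const results = [];
for (const members of groups.values()) {
  const avg = members.reduce((sum, d) => sum + d.amount, 0) / members.length;
  for (const d of members) {
    if (avg > d.amount - avg) results.push(d._id);
  }
}

console.log(results);
```

For "employer1" the average is about 2000.005, so "104" differs from it by about 2999.995 and fails the test, while every other document sits below its group average or only slightly above it. Put differently, the condition only drops amounts that are more than twice the group average, so a smaller bonus would pass through.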

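As with every "grouping" approach that relies on building an array inside the grouped document, you need to be careful in real-world scenarios not to exceed the 16 MB BSON document size limit.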

Remove the largest difference from a group
