Question

I have documents like the following:

{
   "from":"abc@sss.ddd",
   "to" :"ssd@dff.dff",
   "email": "Hi hello"
}

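How can we count the combined total of "from and to" and "to and from", i.e. the number of messages exchanged between two people regardless of direction?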

I am able to count the total in one direction. I want both directions added together.

db.test.aggregate([
      { $group: {
         "_id":{ "from": "$from", "to":"$to"},
           "count":{$sum:1} 
         }
      },
      { 
        "$sort" :{"count":-1}
      }
])

Answer 1

Since you need to count the number of emails exchanged between 2 addresses, it is reasonable to project a unified between field as follows:

db.a.aggregate([
    { $match: {
        to: { $exists: true },
        from: { $exists: true },
        email: { $exists: true }
    }}, 
    { $project: {
        between: { $cond: { 
            if: { $lte: [ { $strcasecmp: [ "$to", "$from" ] }, 0 ] }, 
            then: [ { $toLower: "$to" }, { $toLower: "$from" } ], 
            else: [ { $toLower: "$from" }, { $toLower: "$to" } ] }
        } 
    }},
    { $group: {
         "_id": "$between",
         "count": { $sum: 1 } 
    }},
    { $sort :{ count: -1 } }
])

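The unification logic should be clear from the example: between is an array of the two email addresses sorted alphabetically, so A to B and B to A produce the same value. The $match and $toLower parts are optional if you trust your data.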
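
To make the unification rule concrete outside of the aggregation pipeline, here is a minimal plain JavaScript sketch of the same idea. It does not talk to MongoDB. The helper names pairKey and countPairs are made up for illustration, and the sketch assumes from and to are always strings.

```javascript
// Hypothetical helper: build an order-independent key for two addresses.
// Lowercasing mirrors the $toLower step; sorting mirrors the $cond/$strcasecmp step.
function pairKey(from, to) {
  return [from.toLowerCase(), to.toLowerCase()].sort();
}

// Hypothetical helper: count documents per unordered pair, most frequent first.
function countPairs(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(pairKey(doc.from, doc.to));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, count]) => ({ _id: JSON.parse(key), count }))
    .sort((a, b) => b.count - a.count);
}

const sample = [
  { from: "abc@sss.ddd", to: "ssd@dff.dff", email: "Hi hello" },
  { from: "ssd@dff.dff", to: "ABC@sss.ddd", email: "Hello back" },
  { from: "abc@sss.ddd", to: "xyz@aaa.bbb", email: "Other thread" },
];

console.log(countPairs(sample));
```

The first two sample documents go in opposite directions and differ in letter case, yet they land in the same bucket, which is the behaviour the pipeline above aims for.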

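Operators used in the example (see the MongoDB documentation for each): $match, $exists, $project, $cond, $lte, $strcasecmp, $toLower, $group, $sum, $sort.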

Answer 2

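You basically need to treat the grouping _id as an "array" of the possible "to" and "from" values, and then of course "sort" that array so that the combination is always in the same order in every document.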

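As a side note, in messaging systems like this the "to" and "from" sender/recipient fields are "typically" arrays to begin with, which is where the different variations of this statement below come from.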

First, the best statement for MongoDB 3.2, for single addresses:

db.collection.aggregate([
    // Join in array
    { "$project": {
        "people": [ "$to", "$from" ],
    }},

    // Unwind array
    { "$unwind": "$people" },

    // Sort array
    { "$sort": { "_id": 1, "people": 1 } },

    // Group document
    { "$group": {
        "_id": "$_id",
        "people": { "$push": "$people" }
    }},

    // Group people and count
    { "$group": {
        "_id": "$people",
        "count": { "$sum": 1 }
    }}
]);
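
As a rough illustration of what each stage does to the data, the following plain JavaScript sketch walks three made-up documents through the same steps. It does not talk to MongoDB; the variable names and the sample addresses are assumptions for illustration only.

```javascript
// Made-up sample documents (assumption: single string addresses).
const stageDocs = [
  { _id: 1, from: "b@x.com", to: "a@x.com" },
  { _id: 2, from: "a@x.com", to: "b@x.com" },
  { _id: 3, from: "a@x.com", to: "c@x.com" },
];

// $project + $unwind: one row per (document, person).
const unwound = stageDocs.flatMap((d) =>
  [d.to, d.from].map((person) => ({ _id: d._id, people: person }))
);

// $sort: by _id first, then by people.
unwound.sort((a, b) =>
  a._id - b._id || (a.people < b.people ? -1 : a.people > b.people ? 1 : 0)
);

// First $group: rebuild one array per document; pushing in sorted order
// leaves every array in the same canonical order.
const perDoc = new Map();
for (const row of unwound) {
  if (!perDoc.has(row._id)) perDoc.set(row._id, []);
  perDoc.get(row._id).push(row.people);
}

// Second $group: count identical arrays.
const stageCounts = new Map();
for (const people of perDoc.values()) {
  const key = JSON.stringify(people);
  stageCounts.set(key, (stageCounts.get(key) || 0) + 1);
}

console.log([...stageCounts]);
```

Documents 1 and 2 run in opposite directions but end up with the same sorted array, so they are counted together.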

That is the basis. The only thing that changes from here is how the "people" array is constructed (stage 1 above only).

MongoDB 3.x and 2.6.x - "to" and "from" are already arrays

{ "$project": {
    "people": { "$setUnion": [ "$to", "$from" ] }
}}

MongoDB 3.x and 2.6.x - Fields to array

{ "$project": {
    "people": { 
        "$map": {
            "input": ["A","B"],
            "as": "el",
            "in": {
               "$cond": [
                   { "$eq": [ "A", "$$el" ] },
                   "$to",
                   "$from"
               ]
            }
        }
    }
}}

MongoDB 2.4.x and 2.2.x - From fields

{ "$project": {
    "to": 1,
    "from": 1,
    "type": { "$const": [ "A", "B" ] }
}},
{ "$unwind": "$type" },
{ "$group": {
    "_id": "$_id",
    "people": {
        "$addToSet": {
            "$cond": [
                { "$eq": [ "$type", "A" ] },
                "$to",
                "$from"
            ]
        }
    }
}}

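But in all cases:

1. Get all recipients into a distinct array.
2. Sort the array into a consistent order.
3. Group on the "always in the same order" list of recipients.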

Follow that and you cannot go far wrong.
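
The three steps can also be sketched in plain JavaScript for the case where "to" and "from" may hold several addresses. This is only an illustration under assumptions, not MongoDB code: the helper names participants and countConversations are hypothetical, and either field is assumed to be a string or an array of strings.

```javascript
// Steps 1 and 2 (hypothetical helper): a distinct, consistently ordered list.
function participants(doc) {
  // concat accepts either a single string or an array for each field
  const all = [].concat(doc.to, doc.from);
  return [...new Set(all)].sort();
}

// Step 3 (hypothetical helper): group on the normalized list and count.
function countConversations(messages) {
  const totals = {};
  for (const msg of messages) {
    const key = JSON.stringify(participants(msg));
    totals[key] = (totals[key] || 0) + 1;
  }
  return totals;
}

const messages = [
  { from: ["a@x.com"], to: ["b@x.com", "c@x.com"] },
  { from: ["c@x.com"], to: ["a@x.com", "b@x.com"] },
  { from: "b@x.com", to: "a@x.com" },
];

console.log(countConversations(messages));
```

The first two messages involve the same three people with different senders, so they share one key; the third message forms its own two-person group.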

Mongo group and sum across two fields
