MongoDB: counting pairs of documents

Time: 2018-01-21 13:00:25

Tags: javascript mongodb aggregation-framework

Let's assume my MongoDB collection contains elements of this kind:

{ 
    "_id" : "id1", 
    "from" : "Tom", 
    "to" : "Bill"
},
{ 
    "_id" : "id2", 
    "from" : "Jack", 
    "to" : "Tom"
},
{ 
    "_id" : "id3", 
    "from" : "Jack", 
    "to" : "Tom"
},
{ 
    "_id" : "id4", 
    "user" : "Tom", 
    "to" : "Jack"
},
{ 
    "_id" : "id4", 
    "user" : "Tom", 
    "to" : "Bill"
},
{ 
    "_id" : "id5", 
    "user" : "Bill", 
    "to" : "Jack"
}

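Think of these as emails. How can I aggregate such a collection to find out which pair communicates the most? The catch is that we need to count not only the mails from A to B, but also the mails from B to A.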

Thanks a lot!

2 Answers:

Answer 0: (score: 0)

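You can generate a unique key for each pair by combining the two user names in a fixed order, and then use a hash table to count the occurrences of each key: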

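// `input` is assumed to be an array holding the collection's documents.
const hash = {};

for (let { from, to, user } of input) {
  // Some documents store the sender under "user" instead of "from".
  from = from || user;
  // Order the two names so that A->B and B->A produce the same key.
  if (from < to) [from, to] = [to, from];

  const key = from + "©" + to;
  hash[key] = (hash[key] || 0) + 1;
}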

Now we have the counts. All that is left is to iterate over the hash and find the largest count:

let result = null, count = -Infinity;

for (const pair in hash) {
  if (hash[pair] > count) {
    count = hash[pair];
    result = pair;
  }
}
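As a rough end-to-end sketch of this answer's idea, the two steps can be run against the sample documents from the question. The helper name `busiestPair` and the use of a `Map` with a JSON-encoded key are my own choices for illustration, not part of the answer above; the fallback from `from` to `user` mirrors the answer's code.

```javascript
// Sample documents copied from the question (note the mixed "from"/"user" fields).
const input = [
  { _id: "id1", from: "Tom", to: "Bill" },
  { _id: "id2", from: "Jack", to: "Tom" },
  { _id: "id3", from: "Jack", to: "Tom" },
  { _id: "id4", user: "Tom", to: "Jack" },
  { _id: "id4", user: "Tom", to: "Bill" },
  { _id: "id5", user: "Bill", to: "Jack" },
];

// Hypothetical helper: returns the busiest unordered pair and its mail count,
// or null when there are no documents.
function busiestPair(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // Fall back to "user" when "from" is missing.
    const sender = doc.from || doc.user;
    // Sorting the two names makes A->B and B->A map to the same key.
    const key = JSON.stringify([sender, doc.to].sort());
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  let best = null;
  for (const [key, count] of counts) {
    if (best === null || count > best.count) {
      best = { pair: JSON.parse(key), count };
    }
  }
  return best;
}

const result = busiestPair(input);
console.log(result.pair, result.count); // pair Jack/Tom with 3 mails
```

Encoding the sorted pair as JSON avoids relying on a separator character that might also occur in a user name. Note that this approach loads every document into the application, so it is only reasonable for collections that fit comfortably in memory.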

Answer 1: (score: 0)

I assume you always have the from and to fields. Then you can project your data into an ordered array, participants, and $group by that array:

db.mails.aggregate([
    {
        $project: {
            _id: 1,
            participants: {
                $cond: {
                    if: { $gte: [ "$from", "$to" ] },
                    then: [ "$to", "$from" ],
                    else: [ "$from", "$to" ]
                }
            }
        }
    },
    {
        $group: {
            _id: "$participants",
            count: { $sum: 1 }
        }
    },
    {
        $sort: { "count" : -1 }
    }
])
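The pipeline above relies on every document having a from field, while some of the sample documents in the question use user instead. A possible variation, offered only as a sketch, adds a first stage that falls back to user through `$ifNull` and a final `$limit` that keeps just the top pair. The intermediate field name `sender` is made up for this example, and the collection name `mails` is taken from the answer.

```javascript
// Sketch of an adjusted pipeline; "sender" is a hypothetical intermediate field.
const pipeline = [
  // Use "from" when present, otherwise fall back to "user".
  { $project: { sender: { $ifNull: ["$from", "$user"] }, to: 1 } },
  // Put the two names in a fixed order so A->B and B->A group together.
  {
    $project: {
      participants: {
        $cond: {
          if: { $gte: ["$sender", "$to"] },
          then: ["$to", "$sender"],
          else: ["$sender", "$to"],
        },
      },
    },
  },
  // Count the mails per unordered pair.
  { $group: { _id: "$participants", count: { $sum: 1 } } },
  // Most active pair first, then keep only that one.
  { $sort: { count: -1 } },
  { $limit: 1 },
];

// In the mongo shell this would be run as:
// db.mails.aggregate(pipeline)
```

If several pairs tie for the highest count, `$limit: 1` returns only one of them, so drop that stage when ties matter.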