MongoDB group by multiple key values, vice versa

Date: 2015-10-28 06:44:09

Tags: php mongodb laravel mongodb-query aggregation-framework

I have a users collection containing the following data:

[
  {
    "user_id": "5625c95ac2d34f27148b64fa",
    "friend_id": "561f40bac2d34f17148b462c"
  },
  {
    "user_id": "562744ccc2d34f27148b6eb7",
    "friend_id": "561f40bac2d34f17148b462c"
  },
  {
    "user_id": "56248eb9c2d34f2f148b5a18",
    "friend_id": "561f40bac2d34f17148b462c"
  },
  {
    "user_id": "561f40bac2d34f17148b462c",
    "friend_id": "561f3e06c2d34f27148b45f6"
  },
  {
    "user_id": "561f40bac2d34f17148b462c",
    "friend_id": "5620de97c2d34f2f148b578f"
  },
  {
    "user_id": "56276b52c2d34f27148b7128",
    "friend_id": "561f40bac2d34f17148b462c"
  },
  {
    "user_id": "561f40bac2d34f17148b462c",
    "friend_id": "56276b52c2d34f27148b7128"
  }
]

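I need to get the documents whose user_id / friend_id combination is not duplicated in either order. In the example above, the last two documents duplicate each other: the friend_id of one is the user_id of the other.

I tried using the MongoDB aggregate with a group by, but could not reduce it down.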

1 Answer:

Answer 0: (score: 3)

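To do this, you basically need to combine the user_id and friend_id values into a unique, sorted combination. This means creating an array for each document that contains those two members, and sorting that array so the order is always the same.

You can then $group on the sorted array content to see which documents contain the same combination, and return only those that do not share their combination with another document.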
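The idea can be illustrated outside MongoDB first. The following is a minimal in-memory sketch in plain JavaScript, assuming the documents from the question are available as an array; the helper name `uniquePairs` is hypothetical and is not part of any driver or framework API. It only demonstrates the "sort the pair, group, keep groups of one" logic that the aggregation pipeline performs on the server.

```javascript
// In-memory sketch of the "sorted pair" idea; no MongoDB involved.
// `uniquePairs` is a hypothetical helper name used for illustration only.
function uniquePairs(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Sort the two ids so that (a, b) and (b, a) produce the same key.
    const key = [doc.user_id, doc.friend_id].sort().join("|");
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  // Keep only the combinations that occur exactly once.
  return [...groups.values()]
    .filter((group) => group.length === 1)
    .map((group) => group[0]);
}

// Sample data copied from the question.
const sample = [
  { user_id: "5625c95ac2d34f27148b64fa", friend_id: "561f40bac2d34f17148b462c" },
  { user_id: "562744ccc2d34f27148b6eb7", friend_id: "561f40bac2d34f17148b462c" },
  { user_id: "56248eb9c2d34f2f148b5a18", friend_id: "561f40bac2d34f17148b462c" },
  { user_id: "561f40bac2d34f17148b462c", friend_id: "561f3e06c2d34f27148b45f6" },
  { user_id: "561f40bac2d34f17148b462c", friend_id: "5620de97c2d34f2f148b578f" },
  { user_id: "56276b52c2d34f27148b7128", friend_id: "561f40bac2d34f17148b462c" },
  { user_id: "561f40bac2d34f17148b462c", friend_id: "56276b52c2d34f27148b7128" }
];

const result = uniquePairs(sample);
console.log(result.length); // 5: the last two documents cancel each other out
```

For large collections the grouping should still be done in the aggregation pipeline, so that the work happens on the server rather than in application memory.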

This leads to the following aggregation statement:

db.collection.aggregate([
    // Build a two-element array holding user_id and friend_id
    { "$project": {
        "user_id": 1,
        "friend_id": 1,
        "combined": {
            "$map": {
                "input": ["A","B"],
                "as": "el",
                "in": {
                    "$cond": [ 
                        { "$eq": [ "$$el", "A" ] },
                        "$user_id",
                        "$friend_id"
                    ]
                }
            }
        }            
    }},
    // Unwind, sort and regroup so the array is always in the same order
    { "$unwind": "$combined" },
    { "$sort": { "combined": 1 } },
    { "$group": {
        "_id": "$_id",
        "combined": { "$push": "$combined" },
        "user_id": { "$first": "$user_id" },
        "friend_id": { "$first": "$friend_id"  }
    }},
    // Group on the sorted combination, collecting the original documents
    { "$group": {
        "_id": "$combined",
        "docs": { "$push": {
            "_id": "$_id",
            "user_id": "$user_id",
            "friend_id": "$friend_id"
        }}
    }},
    // Discard any combination that occurs more than once
    { "$redact": {
        "$cond": {
            "if": { "$ne": [{ "$size": "$docs" }, 1] },
            "then": "$$PRUNE",
            "else": "$$KEEP"
        }
    }}
])

The PHP translation for Laravel means you need to access the raw collection object from the manager, where "collection" is the actual name of the collection in MongoDB:

$result = DB::collection("collection")->raw(function($collection) {
    return $collection->aggregate(
        array(
            array(
                '$project' => array(
                    'user_id' => 1,
                    'friend_id' => 1,
                    'combined' => array(
                        '$map' => array(
                            'input' => array("A","B"),
                            'as' => 'el',
                            'in' => array(
                                '$cond' => array(
                                    array( '$eq' => array( '$$el', 'A' ) ),
                                    '$user_id',
                                    '$friend_id'
                                )
                            )
                        )
                    )
                )
            ),
            array( '$unwind' => '$combined' ),
            array( '$sort' => array( 'combined' => 1 ) ),
            array(
                '$group' => array(
                    '_id' => '$_id',
                    'combined' => array( '$push' => '$combined' ),
                    'user_id' => array( '$first' => '$user_id' ),
                    'friend_id' => array( '$first' => '$friend_id' )
                )
            ),
            array(
                '$group' => array(
                    '_id' => '$combined',
                    'docs' => array(
                        '$push' => array(
                            '_id' => '$_id',
                            'user_id' => '$user_id',
                            'friend_id' => '$friend_id'
                        )
                    )
                )
            ),
            array(
                '$redact' => array(
                    '$cond' => array(
                        'if' => array( '$ne' => array( array( '$size' => '$docs' ), 1 ) ),
                        'then' => '$$PRUNE',
                        'else' => '$$KEEP'
                    )
                )
            )
        )
    );
});

Or, if your MongoDB version is below 2.6 and you lack operators such as $map and $redact, you can still do this, though not as efficiently:

$result = DB::collection("collection")->raw(function($collection) {
    return $collection->aggregate(
        array(
            // Stages 1 to 3: emulate $map by unwinding a constant array
            array(
                '$project' => array(
                    'user_id' => 1,
                    'friend_id' => 1,
                    'type' => array( '$const' => array( 'A', 'B' ) )
                )
            ),
            array( '$unwind' => '$type' ),
            array(
                '$group' => array(
                    '_id' => '$_id',
                    'user_id' => array( '$first' => '$user_id' ),
                    'friend_id' => array( '$first' => '$friend_id' ),
                    'combined' => array( 
                        '$push' => array(
                            '$cond' => array(
                                array( '$eq' => array( '$type', 'A' ) ),
                                '$user_id',
                                '$friend_id'
                            )
                        )
                    )
                )
            ),
            // Sort the members of each combined array
            array( '$unwind' => '$combined' ),
            array( '$sort' => array( 'combined' => 1 ) ),
            array(
                '$group' => array(
                    '_id' => '$_id',
                    'combined' => array( '$push' => '$combined' ),
                    'user_id' => array( '$first' => '$user_id' ),
                    'friend_id' => array( '$first' => '$friend_id' )
                )
            ),
            // Group on the sorted combination and count its occurrences
            array(
                '$group' => array(
                    '_id' => '$combined',
                    'docs' => array(
                        '$push' => array(
                            '_id' => '$_id',
                            'user_id' => '$user_id',
                            'friend_id' => '$friend_id'
                        )
                    ),
                    'count' => array( '$sum' => 1 )
                )
            ),
            // Keep only the combinations that occur exactly once
            array( '$match' => array( 'count' => 1 ) )
        )
    );
});

Here the first three stages mimic what the first stage of the first listing does, by getting both values into a single array. The last two stages then "count" each grouping and filter out anything that does not have a count of 1.

In either case, this leaves you with output that lists only the documents whose combination does not occur in either order.

You could alter the output, but the purpose here is to show the sorted combination alongside the original document data.
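As one possible adjustment, the grouped result could be flattened back into the original document shape. The sketch below is an assumption rather than part of the original answer: the names `flattenStages` and `withFlattenedOutput` are hypothetical, and the stages have not been run against a live server. It relies on the fact that every surviving group holds exactly one entry in "docs".

```javascript
// Hypothetical extra stages, meant to be appended after the final
// $redact (or $match) stage of the pipelines shown in the answer.
const flattenStages = [
  // Each surviving group holds exactly one document, so unwinding
  // produces one output row per group.
  { "$unwind": "$docs" },
  // Restore the original document shape and drop the sorted combination.
  { "$project": {
    "_id": "$docs._id",
    "user_id": "$docs.user_id",
    "friend_id": "$docs.friend_id"
  }}
];

// Hypothetical helper: returns a new pipeline with the flattening
// stages appended, leaving the input array untouched.
function withFlattenedOutput(pipeline) {
  return pipeline.concat(flattenStages);
}
```

In the mongo shell this would be used as `db.collection.aggregate(withFlattenedOutput(pipeline))`, where `pipeline` is the array of stages from the first listing.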