MongoDB Distinct Aggregation

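Date: 2014-08-05 21:50:40

Tags: php mongodb distinct aggregation-framework

I am trying to do a grouped count in MongoDB using the aggregation framework, but the results are not quite what I expected.

Consider the following collection: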

 $people->insert(array("user_id" => "1", "day" => "Monday", 'age' => 18));
 $people->insert(array("user_id" => "3", "day" => "Monday", 'age' => 24));
 $people->insert(array("user_id" => "1", "day" => "Monday", 'age' => 18));
 $people->insert(array("user_id" => "1", "day" => "Monday", 'age' => 18));
 $people->insert(array("user_id" => "2", "day" => "Monday", 'age' => 25));
 $people->insert(array("user_id" => "4", "day" => "Monday", 'age' => 33));
 $people->insert(array("user_id" => "1", "day" => "Tuesday", 'age' => 18));
 $people->insert(array("user_id" => "2", "day" => "Tuesday", 'age' => 25));
 $people->insert(array("user_id" => "1", "day" => "Wednesday", 'age' => 18));
 $people->insert(array("user_id" => "2", "day" => "Thursday", 'age' => 25));
 $people->insert(array("user_id" => "1", "day" => "Friday", 'age' => 18));

I use the query below to try to count the number of distinct entries (user_id) for each day of the week.

$query = array(
    array(
        '$project' => array(
            'user_id' => 1,
            'day' => 1,
        ),
    ),
    array(
        '$group' => array(
            '_id' => array(
                'user_id' => '$user_id',
                'day' => '$day',
            ),
            'count' => array('$sum' => 1),
        ),
    ),
);

So for the collection above, the result should be

Monday = 4, Tuesday = 2, Wednesday = 1, Thursday = 1, and Friday = 1

But instead of grouping the total of all DISTINCT user_ids under a day, it gives a separate total for each existing user_id on each day.

Result (truncated)

     [result] => Array
    (
        [0] => Array
            (
                [_id] => Array
                    (
                        [user_id] => 1
                        [day] => Friday
                    )

                [count] => 1
            )

        [1] => Array
            (
                [_id] => Array
                    (
                        [user_id] => 1
                        [day] => Wednesday
                    )

                [count] => 1
            )

        [2] => Array
            (
                [_id] => Array
                    (
                        [user_id] => 2
                        [day] => Tuesday
                    )

                [count] => 1
            )
... ... ...

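Can someone help me filter the daily totals so that each day only counts distinct users?

I have looked at $unwind, but could not really make sense of it.

1 Answer:

Answer 0 (score: 2)

If I understand the question correctly, what you want is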

 totals of all DISTINCT users_id under a day

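Or, as I understand it: a count of unique user_ids for each day.

To do that, you can take the $group you already have and drop the count, so that you are left with one document per unique _id.user_id / _id.day combination: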

'$group' => array(
            '_id'  => array(
                'user_id' => '$user_id',
                'day' => '$day'
            )
        )

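Then pipe that into another $group stage that counts the number of documents for each day, since there is now exactly one document per unique user_id / day combination: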

'$group' => array(
            '_id'  => '$_id.day',
            'count' => array('$sum' => 1)
        )
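
Put together, the two stages above could look roughly like the sketch below. This is a minimal illustration, not a tested production query: `$pipeline` is simply the two `$group` arrays from this answer combined, and `distinctUsersPerDay()` is a hypothetical helper that emulates the same two steps in plain PHP on the question's sample documents, so the expected counts can be checked without a running server. Note that the sample data contains four distinct user_ids on Monday (1, 2, 3 and 4).

```php
<?php
// Sample documents from the question as plain PHP arrays (age omitted,
// since neither stage uses it).
$docs = array(
    array('user_id' => '1', 'day' => 'Monday'),
    array('user_id' => '3', 'day' => 'Monday'),
    array('user_id' => '1', 'day' => 'Monday'),
    array('user_id' => '1', 'day' => 'Monday'),
    array('user_id' => '2', 'day' => 'Monday'),
    array('user_id' => '4', 'day' => 'Monday'),
    array('user_id' => '1', 'day' => 'Tuesday'),
    array('user_id' => '2', 'day' => 'Tuesday'),
    array('user_id' => '1', 'day' => 'Wednesday'),
    array('user_id' => '2', 'day' => 'Thursday'),
    array('user_id' => '1', 'day' => 'Friday'),
);

// The two $group stages from the answer, combined into one pipeline.
// Assumption: with a live connection, this array would be passed to
// $people->aggregate() in place of the question's $query.
$pipeline = array(
    // Stage 1: collapse duplicates into one document per (user_id, day).
    array('$group' => array(
        '_id' => array('user_id' => '$user_id', 'day' => '$day'),
    )),
    // Stage 2: count the remaining documents per day.
    array('$group' => array(
        '_id'   => '$_id.day',
        'count' => array('$sum' => 1),
    )),
);

// Hypothetical helper (not part of any driver): emulates the two stages
// in plain PHP so the logic can be checked locally.
function distinctUsersPerDay(array $docs)
{
    // Stage 1 equivalent: keying by "day|user_id" keeps one entry per pair.
    // Assumes day names never contain the '|' separator.
    $pairs = array();
    foreach ($docs as $doc) {
        $pairs[$doc['day'] . '|' . $doc['user_id']] = $doc['day'];
    }
    // Stage 2 equivalent: count how many unique pairs fall on each day.
    return array_count_values($pairs);
}

$counts = distinctUsersPerDay($docs);
print_r($counts);
```

The `$project` stage from the question is left out here because the first `$group` only reads `user_id` and `day` anyway. The order in which `$group` returns its groups is not guaranteed, so a `$sort` stage may be worth appending if the days need to come back in a particular order.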