How to group MongoDB documents in PHP

Time: 2018-03-01 15:50:53

Tags: php mongodb grouping

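I am using MongoDB server 3.6 with PHP as the backend language. I am also using the latest php-mongo library to talk to the new, updated driver.

I have a collection with 10 million records that looks like this: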

[
  {
    "did": "123456",
    "did_usage": "1",
    "did_timestamp": "15012"
  },
  {
    "did": "4567811",
    "did_usage": "1",
    "did_timestamp": "15013"
  },
  {
    "did": "46465464",
    "did_usage": "2",
    "did_timestamp": "15014"
  },
  {
    "did": "7894446",
    "did_usage": "2",
    "did_timestamp": "15015"
  },
  {
    "did": "65646131",
    "did_usage": "3",
    "did_timestamp": "15016"
  },
  {
    "did": "7989464",
    "did_usage": "2",
    "did_timestamp": "15017"
  },
  {
    "did": "651651664",
    "did_usage": "1",
    "did_timestamp": "15018"
  }.......
]

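Now I want to find exactly one unique document that has the lowest usage and the lowest timestamp.

So far, I have been finding a single unique document using the following: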

$sample = array('$sample' => array('size' => 1));
$pipeline = array($match, $group, $project, $sample);
$cursor = $collection->aggregate($pipeline);

I would like some help with $group. I tried this:

$group = array('$group' => array('_id' => '$did_usage', 'did_usage_timestamp' => array('$min' => '$did_usage_timestamp')));

But this does not work as expected.

1 Answer:

Answer 0: (score: 2)

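Even without knowing what $match and $project are doing in your original code, we can assume that $group is operating on a subset of the collection data, since it is preceded only by $match in the pipeline. Based on just the example collection data and the $group stage, it is clear that the $did_usage_timestamp field path used with the $min operator refers to a field that does not exist in the documents entering the $group stage.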

Testing this locally with the following script, the value computed by $min in the randomly selected output document is null:

<?php

require 'vendor/autoload.php';

$client = new MongoDB\Client;
$collection = $client->test->foo;

// Start from a clean collection and load the sample documents.
$collection->drop();
$collection->insertMany([
    ["did" => "123456", "did_usage" => "1", "did_timestamp" => "15012"],
    ["did" => "4567811", "did_usage" => "1", "did_timestamp" => "15013"],
    ["did" => "46465464", "did_usage" => "2", "did_timestamp" => "15014"],
    ["did" => "7894446", "did_usage" => "2", "did_timestamp" => "15015"],
    ["did" => "65646131", "did_usage" => "3", "did_timestamp" => "15016"],
    ["did" => "7989464", "did_usage" => "2", "did_timestamp" => "15017"],
]);

// Reproduces the problem: '$did_usage_timestamp' is not a field in these documents.
$cursor = $collection->aggregate([
    ['$group' => ['_id' => '$did_usage', 'did_timestamp' => ['$min' => '$did_usage_timestamp']]],
    ['$sample' => ['size' => 1]],
]);

var_dump($cursor->toArray());

This outputs something like:

array(1) {
  [0]=>
  object(MongoDB\Model\BSONDocument)#14 (1) {
    ["storage":"ArrayObject":private]=>
    array(2) {
      ["_id"]=>
      string(1) "1"
      ["did_timestamp"]=>
      NULL
    }
  }
}

Changing the field path for the $min operator to $did_timestamp should resolve the issue.
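As a minimal sketch of that change, the pipeline from the script above could be built with the corrected field path. The helper name buildMinTimestampPipeline is hypothetical and is not part of the original code or of the MongoDB PHP library; it only returns the pipeline array and does not contact a server.

```php
<?php

// Hypothetical helper (an assumption for illustration): returns the same
// two-stage pipeline as the script above, but with $min pointing at
// '$did_timestamp', a field that exists in the sample documents.
function buildMinTimestampPipeline(): array
{
    return [
        // Group by usage and keep the smallest did_timestamp in each group.
        ['$group' => [
            '_id' => '$did_usage',
            'did_timestamp' => ['$min' => '$did_timestamp'],
        ]],
        // Pick one of the resulting groups at random, as in the original pipeline.
        ['$sample' => ['size' => 1]],
    ];
}

// Usage, assuming $collection is the MongoDB\Collection from the script above:
// $cursor = $collection->aggregate(buildMinTimestampPipeline());
// var_dump($cursor->toArray());
```

With this field path, did_timestamp in the output should hold a value from the grouped documents rather than null.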