Question

I am saving tweets to MongoDB:

    twit.stream('statuses/filter', {'track': ['animal']}, function(stream) {
        stream.on('data', function(data) {
            console.log(util.inspect(data));

            data.created_at = new Date(data.created_at);
            collectionAnimal.insert(data, function(err, docs) {});
        });
    });

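This works fine.

The tweet time in MongoDB is stored in this format: 2014-04-25 11:45:14 GMT (the created_at column). Now I need to group the tweets by hour on created_at. I would like to get this result:

hour | tweets in that hour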

1 | 28

2 | 26

3 | 32

4 | 42

5 | 36

...

My unsuccessful attempt:

    $keys = array('created_at' => true);
    $initial = array('count' => 0);
    $reduce = "function(doc, prev) { prev.count += 1 }";

    $tweetsGroup = $this->collectionAnimal->group( $keys, $initial, $reduce );

But this does not group by hour.

How can I do that?

Answer 1

I can show you how to group using the aggregation framework directly in the mongo console:

    db.tweets.aggregate(
     { "$project": {
          "y":{"$year":"$created_at"},
          "m":{"$month":"$created_at"},
          "d":{"$dayOfMonth":"$created_at"},
          "h":{"$hour":"$created_at"},
          "tweet":1 }
     },
     { "$group":{ 
           "_id": { "year":"$y","month":"$m","day":"$d","hour":"$h"},
           "total":{ "$sum": "$tweet"}
       }
     })

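For more options, see here: http://docs.mongodb.org/manual/reference/operator/aggregation-date/

You will also need to find the appropriate way to use the aggregation framework from whichever programming language you are using.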

Answer 2

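You should not need the $project stage here, because the date operators can be used directly in the $group stage when defining the grouping _id. That avoids an extra pass over the whole collection just to reshape the documents.

Also, you are only counting, so { "$sum" : 1 } is all you need. Summing a field that does not exist is what produces the 0: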

    $this->collection->aggregate(array(
        array(
            '$group' => array(
                "_id" => array( 
                    "y" => array( '$year' => '$created_at' ),
                    "m" => array( '$month' => '$created_at' ),
                    "d" => array( '$dayOfMonth' => '$created_at' ),
                    "h" => array( '$hour' => '$created_at' ),
                ),
                "total" => array( '$sum' => 1 ),
            ),
        )
    ));

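If anything, add a $match stage at the start of the pipeline to filter on the date. If a single day of output is acceptable, you only need $hour in the grouping key, and the filter reduces the working set size, which makes the query faster. That is probably what you want to do anyway.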
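A rough sketch of that $match suggestion, written as plain JavaScript so it can be pasted into the mongo shell or a Node.js script. The helper name `buildHourlyPipeline`, the example date, and the trailing $sort stage are my own assumptions for illustration; they are not part of the question or the answers.

```javascript
// Sketch: build a pipeline that counts tweets per hour for a single UTC day.
// `buildHourlyPipeline` is a hypothetical helper name, not a MongoDB API.
function buildHourlyPipeline(dayStart) {
  // Exclusive upper bound: exactly 24 hours after the start of the day.
  const dayEnd = new Date(dayStart.getTime() + 24 * 60 * 60 * 1000);
  return [
    // Filter first, so the later stages only see one day's documents.
    { $match: { created_at: { $gte: dayStart, $lt: dayEnd } } },
    // Within a single day the hour alone identifies each bucket.
    { $group: { _id: { $hour: "$created_at" }, total: { $sum: 1 } } },
    // Assumption: sort by hour so the output reads in chronological order.
    { $sort: { _id: 1 } }
  ];
}

const pipeline = buildHourlyPipeline(new Date("2014-04-25T00:00:00Z"));
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell this could then be run as: db.tweets.aggregate(pipeline)
```

Note that $hour works in UTC and returns values from 0 to 23, so the buckets may need shifting if you want local time.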

Answer 3

Lalit's answer did not work for me; it always gave me zero. Instead I did:

    db.tweets.aggregate(
     { "$project": {
          "y":{"$year":"$created_at"},
          "m":{"$month":"$created_at"},
          "d":{"$dayOfMonth":"$created_at"},
          "h":{"$hour":"$created_at"},
          "tweet":1 }
     },
     { "$group":{ 
           "_id": { "year":"$y","month":"$m","day":"$d","hour":"$h"},
           'count':{$sum:1} 
       }
     })

'count':{$sum:1} is the only difference.

It might help someone new to mongo like me.
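For anyone wondering why the field version comes out as zero: $sum ignores non-numeric values, and a field that is missing from every document contributes nothing. Here is a small plain-JavaScript imitation of the two variants. It is not MongoDB code; the sample documents and the helper name `sumPerHour` are made up for illustration.

```javascript
// Plain-JavaScript imitation of the two $sum variants; no database involved.
// Sample documents are invented and deliberately have no `tweet` field.
const docs = [
  { created_at: new Date("2014-04-25T11:45:14Z") },
  { created_at: new Date("2014-04-25T11:50:00Z") },
  { created_at: new Date("2014-04-25T12:01:30Z") }
];

// Hypothetical helper: groups by UTC hour and sums whatever `valueOf` returns.
function sumPerHour(documents, valueOf) {
  const totals = {};
  for (const doc of documents) {
    const hour = doc.created_at.getUTCHours(); // $hour also works in UTC
    const value = valueOf(doc);
    // Like $sum, only numeric values contribute; anything else adds nothing.
    totals[hour] = (totals[hour] || 0) + (typeof value === "number" ? value : 0);
  }
  return totals;
}

// Analogue of { $sum: "$tweet" }: the field is missing, so nothing is added.
const byField = sumPerHour(docs, (doc) => doc.tweet);
// Analogue of { $sum: 1 }: every document contributes exactly 1.
const byCount = sumPerHour(docs, () => 1);

console.log(byField); // totals are 0 for hours 11 and 12
console.log(byCount); // 2 for hour 11, 1 for hour 12
```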
