$group with an array as _id, but $out won't handle it

Time: 2015-02-12 17:04:33

Tags: mongodb aggregation-framework

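I need to count how many times each array occurs. I need to write the result to a collection, but when I try to use the $out stage it fails with "can't use an array for _id".

Is there a way to project the value of the group stage's _id field to a new key and create a new _id?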

db.djnNews_filtered.aggregate([
{$unwind:"$processed_text.headline_trigrams"},
{$group:{_id:"$processed_text.headline_trigrams","num":{$sum:1}}},
{$sort:{"num":-1}}
])

{ "_id" : [ "Reports", "First", "Quarter" ], "num" : 279 }
{ "_id" : [ "ST", "upside", "prevails" ], "num" : 167 }
{ "_id" : [ "First", "Quarter", "Results" ], "num" : 160 }
{ "_id" : [ "Announces", "First", "Quarter" ], "num" : 155 }


db.djnNews_filtered.aggregate([
{$unwind:"$processed_text.headline_trigrams"},
{$group:{_id:"$processed_text.headline_trigrams","num":{$sum:1}}},
{$sort:{"num":-1}},
{$out:"new_collection"}
])

assert: command failed: {
    "errmsg" : "exception: insert for $out failed: { connectionId: 3, err: \"can't use an array for _id\", code: 2, n: 0, ok: 1.0 }",
    "code" : 16996,
    "ok" : 0
} : aggregate failed

1 Answer:

Answer 0: (score: 1)

In MongoDB, a document's _id cannot be an array.

Can you simply $project the array into another field?

db.djnNews_filtered.aggregate([
  {$unwind:"$processed_text.headline_trigrams"},
  {$group:{_id:"$processed_text.headline_trigrams","num":{$sum:1}}},
  {$sort:{"num":-1}},
  // _id: 0 excludes the array _id; $project would otherwise carry it through to $out
  {$project: {_id: 0, trigram: "$_id", count: "$num"}},
  {$out:"new_collection"}
])
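The reshaping only helps $out if the array no longer ends up in _id; a $project stage carries _id through unless it is excluded with `_id: 0`. As a rough illustration of the document shape to aim for, here is a plain JavaScript sketch that mimics the unwind, group, and project steps in memory. It does not talk to MongoDB, and `sampleDocs` and `countTrigrams` are made-up names for this example only.

```javascript
// In-memory illustration only; nothing here calls MongoDB.
// sampleDocs is invented data shaped like the documents in the question.
const sampleDocs = [
  { processed_text: { headline_trigrams: [
    ["Reports", "First", "Quarter"],
    ["First", "Quarter", "Results"],
  ] } },
  { processed_text: { headline_trigrams: [
    ["Reports", "First", "Quarter"],
  ] } },
];

// Hypothetical helper that imitates $unwind + $group + $project.
function countTrigrams(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // like $unwind: handle each trigram of each document separately
    for (const trigram of doc.processed_text.headline_trigrams) {
      // like $group: key on the array contents; JSON.stringify is a
      // stand-in for comparing arrays by value
      const key = JSON.stringify(trigram);
      const entry = counts.get(key) || { trigram: trigram, num: 0 };
      entry.num += 1;
      counts.set(key, entry);
    }
  }
  // like $project with _id: 0: the array lives in `trigram`, the count in
  // `count`, and no _id field is emitted
  return Array.from(counts.values()).map(function (entry) {
    return { trigram: entry.trigram, count: entry.num };
  });
}

const reshaped = countTrigrams(sampleDocs);
console.log(JSON.stringify(reshaped));
```

Documents shaped like this carry no _id of their own, so MongoDB can assign one when they are inserted.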

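Also, I'm not sure what your intent is in sorting before the documents are inserted into the collection. If the sort is only there so you can look at the data before deciding to add it to a collection, you may want to consider removing that step.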
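One way to keep the sort available for inspection without baking it into the stored result is to make it optional when the pipeline is assembled. The sketch below is only that: `buildTrigramPipeline` and its `includeSort` flag are hypothetical names, and the function just returns a plain array of stages that could be passed to `aggregate`.

```javascript
// Hypothetical helper: assembles the stages from this thread,
// with $sort included only on request.
function buildTrigramPipeline(outCollection, includeSort) {
  const pipeline = [
    { $unwind: "$processed_text.headline_trigrams" },
    { $group: { _id: "$processed_text.headline_trigrams", num: { $sum: 1 } } },
  ];
  if (includeSort) {
    pipeline.push({ $sort: { num: -1 } });
  }
  // Move the array out of _id so $out receives documents without an array _id
  pipeline.push({ $project: { _id: 0, trigram: "$_id", count: "$num" } });
  pipeline.push({ $out: outCollection });
  return pipeline;
}

const withoutSort = buildTrigramPipeline("new_collection", false);
const withSort = buildTrigramPipeline("new_collection", true);
console.log(JSON.stringify(withoutSort));

// In the mongo shell this might be used as:
//   db.djnNews_filtered.aggregate(buildTrigramPipeline("new_collection", false))
//   db.new_collection.find().sort({count: -1})
```

Sorting when the collection is read, as in the last comment, keeps the ordering decision with the query rather than with the stored data.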