Question

I have a set of documents in MongoDB.

Using NodeJS with mongoose, I want to count how many times each word occurs. The result should look like this:

[
    "latest": 2,
    "sprint": 2,
    "lair": 1,
    "laugh": 1,
    "fault": 1,
    "lemma": 2,
    "on": 1,
]

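Does anyone know how to do this using the MongoDB aggregation framework?

I have read that the aggregation framework performs better because aggregation runs natively in the server (C++), whereas mapReduce spawns a separate JavaScript thread to run JavaScript code. However, I am just getting started with MongoDB and have not found a way to do this.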

Answer 1

It has been a while since I used Mongo, but hopefully this helps:

db.TestDocuments.aggregate([

  // Unwind each element of the array into its own document
  { $unwind: "$words" },

  // Group and count the total of each occurrence for each word
  { $group: { 
    _id: "$words" , 
    count: { "$sum": 1 }
  }},

  // Remove the id field from the response, rename it to the word
  { $project: { "_id": 0, "word": "$_id", "count": 1 } },

  // Sort the results with highest occurrences first
  { $sort: { "count": -1 } }
]);

The result looks like this:

{ "count" : 2, "word" : "latest" }
{ "count" : 2, "word" : "sprint" }
{ "count" : 2, "word" : "lemma" }
{ "count" : 1, "word" : "lair" }
{ "count" : 1, "word" : "laugh" }
{ "count" : 1, "word" : "fault" }
{ "count" : 1, "word" : "on" }
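
Since the question asks about mongoose and a single word-to-count result, here is a minimal sketch of how the same pipeline might be run from Node.js and folded into one object. It assumes a Mongoose model whose documents have a `words` array field, as the pipeline above does. The names `wordCountPipeline` and `countWords` are made up for illustration and are not part of mongoose.

```javascript
// Same stages as the shell pipeline above, kept as a plain array.
const wordCountPipeline = [
  { $unwind: "$words" },
  { $group: { _id: "$words", count: { $sum: 1 } } },
  { $project: { _id: 0, word: "$_id", count: 1 } },
  { $sort: { count: -1 } },
];

// Hypothetical helper: `Model` is assumed to be a Mongoose model
// (anything whose aggregate(pipeline) can be awaited and yields
// an array of { word, count } rows will work).
async function countWords(Model) {
  const rows = await Model.aggregate(wordCountPipeline);

  // Fold the rows into a single { word: count } object.
  const counts = {};
  for (const { word, count } of rows) {
    counts[word] = count;
  }
  return counts;
}
```

With a model named, say, `TestDocuments`, this could be called as `countWords(TestDocuments).then(console.log)`. The counting itself still happens on the server; only the final reshaping into one object is done in Node.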
