Question

I am trying to make my MongoDB collection searchable. After creating a text index on the collection, I can run text searches against it:

db.products.createIndex({title: 'text'})

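I would like to know whether it is possible to retrieve a list of all the indexed terms for this collection. That would be very useful for autocomplete and for spell checking/correction while people are typing their search queries.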

Answer 1

MongoDB has no built-in function for this. However, you can easily get this information with an aggregation query.

Suppose your collection contains the following documents:

{ "_id" : ObjectId("5874dbb1a1b342232b822827"), "title" : "title" }
{ "_id" : ObjectId("5874dbb8a1b342232b822828"), "title" : "new title" }
{ "_id" : ObjectId("5874dbbea1b342232b822829"), "title" : "hello world" }
{ "_id" : ObjectId("5874dbc6a1b342232b82282a"), "title" : "world title" }
{ "_id" : ObjectId("5874dbcaa1b342232b82282b"), "title" : "world meta" }
{ "_id" : ObjectId("5874dbcea1b342232b82282c"), "title" : "world meta title" }
{ "_id" : ObjectId("5874de7fa1b342232b82282e"), "title" : "something else" }

This query gives us the information about the words:

db.products.aggregate([
   {
      $project:{
         words:{
            $split:["$title"," "]
         }
      }
   },
   {
      $unwind:"$words"
   },
   {
      $group:{
         _id:"$words",
         count:{
            $sum:1
         }
      }
   },
   {
      $sort:{
         count:-1
      }
   }
])

It outputs the number of occurrences of each word:

{ "_id" : "title", "count" : 4 }
{ "_id" : "world", "count" : 4 }
{ "_id" : "meta", "count" : 2 }
{ "_id" : "else", "count" : 1 }
{ "_id" : "something", "count" : 1 }
{ "_id" : "new", "count" : 1 }
{ "_id" : "hello", "count" : 1 }
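
To make the stages of the pipeline easier to follow, here is a rough plain-JavaScript equivalent that runs on an in-memory array instead of a collection. It is only a sketch: countWords is a hypothetical helper name, and like the aggregation it counts raw space-separated words, which is not necessarily the same as the stemmed, stop-word-filtered terms a text index stores.

```javascript
// Hypothetical helper mirroring the $project/$split, $unwind, $group and $sort stages.
function countWords(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // $project + $split: break the title on single spaces
    for (const word of doc.title.split(" ")) {
      // $unwind + $group: add 1 for every occurrence of the word
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // $sort: highest count first (order among equal counts is not significant)
  return [...counts.entries()]
    .map(([word, count]) => ({ _id: word, count }))
    .sort((a, b) => b.count - a.count);
}

// Same titles as the sample documents above
const docs = [
  { title: "title" },
  { title: "new title" },
  { title: "hello world" },
  { title: "world title" },
  { title: "world meta" },
  { title: "world meta title" },
  { title: "something else" },
];

const result = countWords(docs);
console.log(result);
```

The counts should agree with the aggregation output above; only the ordering of words with equal counts may differ.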

If you are using MongoDB 3.4 or later, you can use the collation option (added in 3.4) to count words in a case-insensitive and diacritic-insensitive way.

For example, suppose our collection now contains the following documents:

{ "_id" : ObjectId("5874e057a1b342232b82282f"), "title" : "title" }
{ "_id" : ObjectId("5874e05ea1b342232b822830"), "title" : "new Title" }
{ "_id" : ObjectId("5874e067a1b342232b822831"), "title" : "hello world" }
{ "_id" : ObjectId("5874e076a1b342232b822832"), "title" : "World Title" }
{ "_id" : ObjectId("5874e085a1b342232b822833"), "title" : "World méta" }
{ "_id" : ObjectId("5874e08ea1b342232b822834"), "title" : "World meta title" }
{ "_id" : ObjectId("5874e0aea1b342232b822835"), "title" : "something else" }

Add the collation option to the aggregation query:

db.products.aggregate([
   {
      $project:{
         words:{
            $split:["$title"," "]
         }
      }
   },
   {
      $unwind:"$words"
   },
   {
      $group:{
         _id:"$words",
         count:{
            $sum:1
         }
      }
   },
   {
      $sort:{
         count:-1
      }
   }
],
{
   collation:{
      locale:"en_US",
      strength:1
   }
})

This will output:

{ "_id" : "title", "count" : 4 }
{ "_id" : "world", "count" : 4 }
{ "_id" : "méta", "count" : 2 }
{ "_id" : "else", "count" : 1 }
{ "_id" : "something", "count" : 1 }
{ "_id" : "new", "count" : 1 }
{ "_id" : "hello", "count" : 1 }

The strength is the level at which the comparison is performed:

 collation.strength: 1 // case insensitive + diacritic insensitive
 collation.strength: 2 // case insensitive only
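
If you want to get a feel for the two levels without touching the database, JavaScript's built-in Intl.Collator behaves in a similar way. This is only an analogy: Intl.Collator is not MongoDB's collation implementation, and mapping sensitivity "base" to strength 1 and "accent" to strength 2 is my assumption about the closest equivalents.

```javascript
// Analogy only: Intl.Collator is JavaScript's collation API, not MongoDB's.
// Assumed mapping: sensitivity "base" ~ strength 1, sensitivity "accent" ~ strength 2.
const strength1 = new Intl.Collator("en", { sensitivity: "base" });
const strength2 = new Intl.Collator("en", { sensitivity: "accent" });

// compare() returns 0 when the collator treats both strings as equal
const accentIgnoredAt1 = strength1.compare("méta", "meta") === 0;
const caseIgnoredAt1 = strength1.compare("Title", "title") === 0;
const accentIgnoredAt2 = strength2.compare("méta", "meta") === 0;
const caseIgnoredAt2 = strength2.compare("Title", "title") === 0;

console.log({ accentIgnoredAt1, caseIgnoredAt1, accentIgnoredAt2, caseIgnoredAt2 });
```

At the first level both accents and case are ignored, which is why "méta" and "meta" end up in one group above. At the second level case is still ignored but the accent makes the two words different.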

Answer 2

Assuming autoCompleteTerm is your input value, you can get a list of matching titles with this query:

db.products.distinct('title', { $text: { $search: autoCompleteTerm } } )
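
Keep in mind that, as far as I know, $text matches whole words rather than prefixes, so a half-typed word may return nothing. One possible workaround, sketched below, is to filter a word/count list like the one Answer 1 produces on the client side. The suggest helper and the terms array are hypothetical and only there for illustration.

```javascript
// Hypothetical helper: return the most frequent terms starting with the typed prefix.
function suggest(terms, prefix, limit = 5) {
  const needle = prefix.toLowerCase();
  return terms
    .filter((t) => t._id.toLowerCase().startsWith(needle)) // prefix match, case-insensitive
    .sort((a, b) => b.count - a.count)                      // most frequent first
    .slice(0, limit)                                        // cap the number of suggestions
    .map((t) => t._id);
}

// Illustrative data in the shape of the aggregation output from Answer 1
const terms = [
  { _id: "title", count: 4 },
  { _id: "world", count: 4 },
  { _id: "meta", count: 2 },
  { _id: "else", count: 1 },
  { _id: "something", count: 1 },
  { _id: "new", count: 1 },
  { _id: "hello", count: 1 },
];

const forMe = suggest(terms, "me");
const forW = suggest(terms, "W");
console.log(forMe, forW);
```

For a large vocabulary you would probably want to do the prefix match on the server or in a dedicated structure instead of scanning an array on every keystroke.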

Answer 3

db.products.distinct("title"), is that what you are looking for?

Is there a way to see all the indexed terms in a MongoDB text index?

3 answers: