MongoDB [4.2] $text search does not return expected results

Time: 2020-10-07 16:44:13

Tags: mongodb

We have a contributors collection that holds the author information for all of our authors. We created a text index on it as follows:

db.getCollection('contributors').createIndex(
  {
    display_name:"text",
    first_name: "text",
    last_name: "text"      
  },
  {
     weights: {
       display_name: 10,
       first_name: 5,
       last_name:5
     },      
    name: "Contributor_FTS_Index"
  }
)

This is our sample data:

{
    "_id" : ObjectId("5eac8232eb5aca201f104bfb"),
    "firebrand_id" : 54529588,
    "agents" : null,
    "created" : ISODate("2020-05-01T20:10:26.762Z"),
    "display_name" : "Grace Octavia",
    "email" : null,
    "estates" : null,
    "first_name" : "Grace",
    "item_type" : "Contributor",
    "last_name" : "Octavia",
    "phone" : null,
    "role" : 1,
    "short_bio" : "GRACE OCTAVIA is the author of unforgettable novels that deal with the trials and tribulations of love, friendship, and what it means to be true to yourself. Her second novel, His First Wife, graced the Essence® bestseller list and also won the Best African-American Fiction Award from RT Book Reviews. A native of Westbury, NY, she now resides in Atlanta, GA, where there is never any shortage of material on heartache and scandal. Grace earned a doctorate in English, Creative Writing at Georgia State University in Atlanta and currently teaches at Spelman College. Visit her online at GraceOctavia.net or follow her on Twitter @GraceOctavia2.",
    "slug" : "grace-octavia",
    "updated" : ISODate("2020-08-05T10:10:27.691Z"),
    "deleted" : false
}

{
    "_id" : ObjectId("5ada44aa2ad4b3e3d0ae3daf"),
    "item_type" : "Contributor",
    "role" : 1,
    "short_bio" : "",
    "firebrand_id" : 41529135,
    "display_name" : "Grace  Octavia",
    "first_name" : "Grace",
    "last_name" : "Octavia",
    "slug" : "grace-octavia",
    "updated" : ISODate("2020-09-22T16:19:57.319Z"),
    "agents" : null,
    "estates" : null,
    "deleted" : false,
    "email" : null,
    "phone" : null
}


{
    "_id" : ObjectId("58e6ee27afbe421347a11834"),
    "item_type" : "Contributor",
    "role" : 1,
    "short_bio" : "Octavia E. Butler (1947–2006) was a bestselling and award-winning author, considered one of the best science fiction writers of her generation. She received both the Hugo and Nebula awards, and in 1995 became the first author of science fiction to receive a MacArthur Fellowship. She was also awarded the prestigious PEN Lifetime Achievement Award in 2000. Her first novel, <i>Patternmaster</i> (1976), was praised both for its imaginative vision and for Butler’s powerful prose, and spawned four prequels, beginning with <i>Mind of My Mind</i> (1977) and finishing with <i>Clay’s Ark</i> (1984).<br /><br /> Although the Patternist series established Butler among the science fiction elite, it was <i>Kindred</i> (1979), a story of a black woman who travels back in time to the antebellum South, that brought her mainstream success. In 1985, Butler won Nebula and Hugo awards for the novella “Bloodchild,” and in 1987 she published <i>Dawn</i>, the first novel of the Xenogenesis trilogy, about a race of aliens who visit earth to save humanity from itself. <i>Fledgling</i> (2005) was Butler’s final novel. She died at her home in 2006.",
    "firebrand_id" : 11532005,
    "display_name" : "Octavia E. Butler",
    "first_name" : "Octavia",
    "last_name" : "Butler",
    "slug" : "octavia-e-butler",
    "updated" : ISODate("2020-09-23T04:06:18.857Z"),
    "image" : "https://s3.amazonaws.com/orim-book-contributors/11532005-book-contributor.jpg",
    "agents" : [ 
        {
            "name" : "Heifetz, Merrilee",
            "primaryemail" : "mheifetz@writershouse.com",
            "primaryphone" : "212-685-2605"
        }
    ],
    "estates" : [ 
        {
            "name" : "Estate of Octavia E. Butler",
            "primaryemail" : "",
            "primaryphone" : ""
        }
    ],
    "deleted" : false,
    "email" : null,
    "phone" : null
}

When we run the following query:

db.getCollection('contributors').find({ $text: { $search: "oct" }})

it returns no documents. But if we search for

db.getCollection('contributors').find({ $text: { $search: "octavia" }})

it returns all of the documents.

Our requirement is to return results for whatever search term the user has typed so far, so the term could be oc, oct, or octav.

2 answers:

Answer 0 (score: 0)

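You picked the wrong tool. Text search in MongoDB matches whole words (after stemming), not prefixes or substrings, which is why "oct" finds nothing while "octavia" matches. Read more about the mongo tokenizer at https://docs.mongodb.com/manual/core/index-text/#tokenization-delimiters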
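
To make the whole-word behaviour concrete, here is a rough sketch in plain JavaScript. It is a simplified model only, not MongoDB's actual tokenizer (which also applies language-specific stemming and stop-word removal); the helper names `tokenize` and `wholeWordMatch` are made up for illustration.

```javascript
// Simplified model of whole-word matching. NOT MongoDB's real tokenizer:
// the real one also stems words and drops stop words.
function tokenize(text) {
  // Lower-case, split on anything that is not a letter or digit, drop empties.
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Hypothetical helper: true if any search term equals a whole token of the field.
function wholeWordMatch(fieldValue, searchTerm) {
  const tokens = new Set(tokenize(fieldValue));
  return tokenize(searchTerm).some((term) => tokens.has(term));
}

console.log(wholeWordMatch("Octavia E. Butler", "octavia")); // true
console.log(wholeWordMatch("Octavia E. Butler", "oct"));     // false
```

Under this model "oct" never equals the token "octavia", which mirrors what the question observes.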

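Indexing partial words requires an ngram tokenizer. That is available in full-featured text engines, for example the ones based on Apache Lucene: ElasticSearch, Solr, MongoDB Atlas Search, etc.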
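
As an illustration of the idea only (not of how any particular engine implements it), an edge n-gram tokenizer emits every leading prefix of each word, so a partial term such as "oct" becomes an indexed token in its own right. The function name `edgeNgrams` and the `minLen` parameter are assumptions made for this sketch.

```javascript
// Hypothetical helper: produce the leading n-grams ("edge n-grams") of each word.
function edgeNgrams(text, minLen = 2) {
  const grams = new Set();
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  for (const word of words) {
    // Emit every prefix of the word from minLen characters up to the full word.
    for (let len = minLen; len <= word.length; len++) {
      grams.add(word.slice(0, len));
    }
  }
  return [...grams];
}

console.log(edgeNgrams("Grace Octavia"));
// [ 'gr', 'gra', 'grac', 'grace', 'oc', 'oct', 'octa', 'octav', 'octavi', 'octavia' ]
```

With tokens like these in the index, the terms oc, oct and octav from the question would each have an exact token to match.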

If the database is relatively small and weights are not essential, you can use a regexp:

db.contributors.find({
  "$or": [
    {
      display_name: {
        $regex: "oct",
        $options: "i"
      }
    },
    {
      first_name: {
        $regex: "oct",
        $options: "i"
      }
    },
    {
      last_name: {
        $regex: "oct",
        $options: "i"
      }
    }
  ]
})
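
Because the search term comes from user input, it is probably worth escaping regex metacharacters before building the filter. The following is a minimal sketch in plain JavaScript; `escapeRegex` and `buildNameFilter` are hypothetical helper names, and the default field list is taken from the text index in the question.

```javascript
// Escape regex metacharacters so the user's input is matched literally.
function escapeRegex(input) {
  return input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a case-insensitive $or filter over the name fields.
function buildNameFilter(term, fields = ["display_name", "first_name", "last_name"]) {
  const pattern = escapeRegex(term);
  return {
    $or: fields.map((field) => ({
      [field]: { $regex: pattern, $options: "i" },
    })),
  };
}

console.log(JSON.stringify(buildNameFilter("oct"), null, 2));
// In the mongo shell this would be used as:
//   db.contributors.find(buildNameFilter("oct"))
```

Note that an unanchored, case-insensitive regex generally cannot use an index efficiently, which is why this approach suits relatively small collections.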

Answer 1 (score: 0)

The popular way to do this kind of search is with $regex rather than $text, so try it like this:

db.contributors.find({
  "$or": [
    {
      display_name: {
        $regex: "oct",
        $options: "i"
      }
    }
    // add more field objects like the one above
  ]
});