Node.js + Elasticsearch: analyzer has no effect when running search queries

Asked: 2015-08-10 20:56:23

Tags: node.js elasticsearch mongoosastic

I'm trying to use Elasticsearch to build text search. This is my first time using it, so I may be misunderstanding a number of concepts.

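Search works fine when I type a complete word that exists in one of the indexed fields. What I want, though, is that typing sam, for example, returns samsung products. I'm using a tokenizer and analyzer that should break the term into s, sa, sam, sams and so on. Note: I'm using mongoosastic to work with the Elasticsearch server. Here is the product model, which I call Item: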

var ItemSchema = new mongoose.Schema({
    title: {type: String, es_indexed:true, es_analyzer: 'edge_nGram_analyzer'},
    price: Number,
    description: {type: String, es_indexed:true},
    picture: String,
    vendor: {type: String, es_indexed:true},
    vendorId: {type:String, es_indexed:true}
});

Here is the rest of the model code, where I try to set up the analyzer and tokenizer:

    ItemSchema.plugin(mongoosastic, {
        hosts: [
        'localhost:9200'
        ]
    });

    var Item = mongoose.model('Item', ItemSchema);

    Item.createMapping({
        "analysis": {
            "filter": {
                "edgeNGram_filter": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 20,
                    "side": "front"
                }
            },
            "analyzer": {
                "edge_nGram_analyzer": {
                    "type": "custom",
                    "tokenizer": "edge_ngram_tokenizer",
                    "filter": [
                        "lowercase",
                        "asciifolding",
                        "edgeNGram_filter"
                    ]
                },
                "whitespace_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "asciifolding"
                    ]
                }
            },
            "tokenizer": {
                "edge_ngram_tokenizer": {
                    "type": "edgeNGram",
                    "min_gram": "2",
                    "max_gram": "5",
                    "token_chars": [ "letter", "digit" ]
                }
            }
        }
    }, function(err, mapping) {
        // do neat things here
        if (err) {
            console.log(err);
        }
        console.log(mapping);
    });

    module.exports = Item;

I tested this with an Item (product) whose title is cupcake: if I type cup in the search box I get nothing, but if I type the whole keyword I get the object.
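To make the expected behavior concrete, here is a rough sketch of what an edge n-gram tokenizer with min_gram 2, max_gram 5 and token_chars of letter and digit would emit. It is a plain JavaScript approximation, not Elasticsearch's implementation: edgeNGrams is a hypothetical helper, it only treats ASCII letters and digits as token characters, and it folds in the lowercase step that the real analyzer applies as a separate filter.

```javascript
// Approximates an edge n-gram tokenizer: split the text into runs of
// letters/digits, then emit every prefix of each run whose length lies
// between minGram and maxGram (inclusive).
function edgeNGrams(text, minGram, maxGram) {
  // Assumption: ASCII only; Elasticsearch's "letter" class is Unicode-aware.
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const tokens = [];
  for (const word of words) {
    const limit = Math.min(maxGram, word.length);
    for (let len = minGram; len <= limit; len++) {
      tokens.push(word.slice(0, len));
    }
  }
  return tokens;
}

console.log(edgeNGrams('cupcake', 2, 5));
// prints [ 'cu', 'cup', 'cupc', 'cupca' ]
```

If the analyzer were applied to title at index time, the token cup would be stored for cupcake, so a search term that is analyzed to cup would be expected to match. Note that with min_gram set to 2, a single-letter prefix such as s is never produced.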

Also, I don't want the vendor ID and description to be analyzed. I tried vendorId: {type:String, index: 'not_analyzed'}, but then the field stopped being indexed for search.

Code for the search endpoint:

 app.post('/api/search', function(req, res, next) {
    Item.search({
      query_string: {
        query: req.body.keyword
      }
    },{hydrate:true}, function(err, results) {
      // results here
      res.send(results);
    });
 })
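One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: a query_string query with no field specified runs against the index's default field (_all in Elasticsearch releases of that period), so an analyzer configured only on title may never come into play. Below is a minimal sketch of a query body aimed at title directly, with the search term analyzed by the whitespace_analyzer from the mapping so that it is not itself split into n-grams. buildTitleQuery is a hypothetical helper name, not part of mongoosastic.

```javascript
// Builds a match query restricted to the title field.
// The search-time analyzer is set explicitly so the typed keyword is only
// lowercased and ASCII-folded, then compared against the indexed n-grams.
function buildTitleQuery(keyword) {
  if (typeof keyword !== 'string' || keyword.trim() === '') {
    throw new TypeError('keyword must be a non-empty string');
  }
  return {
    match: {
      title: {
        query: keyword,
        analyzer: 'whitespace_analyzer' // name taken from the mapping above
      }
    }
  };
}

console.log(JSON.stringify(buildTitleQuery('cup')));
```

In the endpoint, the returned object would take the place of the query_string object passed as the first argument to Item.search, for example Item.search(buildTitleQuery(req.body.keyword), {hydrate: true}, callback).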

1 answer:

Answer 0 (score: 0)

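You need to specify the analyzer to use for the title field. Right now you are only indexing each field for search, but you are not applying edge_nGram_analyzer to the title field. You can do that with mongoosastic's es_analyzer property, like this: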

var ItemSchema = new mongoose.Schema({
    title: {type: String, es_indexed:true, es_analyzer: 'edge_nGram_analyzer'},
    price: Number,
    description: {type: String, es_indexed:true},
    picture: String,
    vendor: {type: String, es_indexed:true},
    vendorId: {type:String, es_indexed:true}
});

There is another problem in your code: edge_nGram_analyzer is not specified correctly. You should remove the content part and make it:

"analyzer":{
    "edge_nGram_analyzer": {
        "type":"custom",
        "tokenizer":"edge_ngram_tokenizer",
        "filter": [
           "lowercase",
           "asciifolding",
           "edgeNGram_filter"
        ]
     },
     ...