Autocomplete in Elasticsearch

Time: 2017-05-11 10:04:15

Tags: elasticsearch autocomplete elasticsearch-5 kibana-5

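I am planning to build an Elasticsearch-based autocomplete module for an e-commerce website. I am using edge_ngram for suggestions, and this is the configuration I am trying.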

**My index creation:**

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": [
            "letter","digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }
}

**Inserting Data**

PUT my_index/doc/1
{
  "title": "iphone s" 
}

PUT my_index/doc/9
{
  "title": "iphone ka" 
}

PUT my_index/doc/11
{
  "title": "iphone ka t" 
}

PUT my_index/doc/15
{
  "title": "iphone 6" 
}

PUT my_index/doc/14
{
  "title": "iphone 6 16GB" 
}

PUT my_index/doc/3
{
  "title": "iphone k" 
}

POST my_index/_refresh

POST my_index/_analyze
{
  "tokenizer": "autocomplete",
  "text": "iphone 6"
}

POST my_index/_analyze
{
  "analyzer": "pattern",
  "text": "iphone 6"
}

**Autocomplete suggestions**
When I try to find the closest match for iphone 6, it does not return the correct result.

GET my_index/_search
{
  "query": {
    "match": {
      "title": {
        "query": "iphone 6", 
        "operator": "and"
      }
    }
  }
}


**The above query yields:**
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0.28582606,
    "hits": [
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.28582606,
        "_source": {
          "title": "iphone s"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "9",
        "_score": 0.25811607,
        "_source": {
          "title": "iphone ka"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "14",
        "_score": 0.24257512,
        "_source": {
          "title": "iphone 6 16GB"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "3",
        "_score": 0.19100356,
        "_source": {
          "title": "iphone k"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "15",
        "_score": 0.1862728,
        "_source": {
          "title": "iphone 6"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "11",
        "_score": 0.16358379,
        "_source": {
          "title": "iphone ka t"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "2",
        "_score": 0.15861572,
        "_source": {
          "title": "iphone 5 s"
        }
      }
    ]
  }
}

But the result should be:

     {
        "_index": "my_index",
        "_type": "doc",
        "_id": "15",
        "_score": 1,
        "_source": {
          "title": "iphone 6"
        }
      }

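Please let me know if I am missing something here. I am new to this, so I am not aware of any other approach that might produce better results.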

1 answer:

Answer 0: (score: 1)

You are using autocomplete_search as the search_analyzer. Look at how your text is analyzed by the search analyzer you specified:

POST my_index/_analyze
{
 "analyzer": "autocomplete_search",
 "text": "iphone 6"
}

You will get:

 {
 "tokens": [
  {
     "token": "iphone",           ===> Only one token
     "start_offset": 0,
     "end_offset": 6,
     "type": "word",
     "position": 0
     }
   ]
 }

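The lowercase tokenizer splits on every non-letter character, so the digit 6 is dropped and only iphone is left. Since every document has this token (iphone) in the inverted index, all documents are returned.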
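As a rough illustration, here is a minimal Python sketch of what the `lowercase` tokenizer does to the query. It is an ASCII-only approximation (the real tokenizer is Unicode-aware), and the function name `lowercase_tokenizer` is made up for this example; it is not an Elasticsearch API.

```python
import re


def lowercase_tokenizer(text):
    """Approximate the `lowercase` tokenizer: keep runs of letters, lowercased.

    Assumption: ASCII letters only; anything else (digits, spaces) is a separator.
    """
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


# The digit never reaches the query, so only "iphone" is searched for.
print(lowercase_tokenizer("iphone 6"))       # ['iphone']
print(lowercase_tokenizer("iphone 6 16GB"))  # ['iphone', 'gb']
```

Running the same text through `_analyze` on the real index remains the authoritative check.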

If you want to match the desired result, you can search with the same analyzer that was used at index time.

GET my_index/_search
{
  "query": {
    "match": {
      "title": {
        "query": "iphone 6",
        "operator": "and",
        "analyzer": "autocomplete"
      }
    }
  }
}
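To compare the two search analyzers side by side, here is a small Python simulation. It is only a sketch under simplifying assumptions: ASCII text, no scoring, and the `and` operator modelled as "every query token must be among the document's indexed tokens". The helper names (`edge_ngrams`, `lowercase_tokens`, `match_and`) are hypothetical, and the documents are the six titles inserted in the question.

```python
import re


def edge_ngrams(text, min_gram=1, max_gram=10):
    """Rough model of the question's `autocomplete` analyzer:
    split on anything that is not a letter or digit, lowercase,
    then emit the prefixes of each word from min_gram to max_gram chars."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        word = word.lower()
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens


def lowercase_tokens(text):
    """Rough model of `autocomplete_search` (the `lowercase` tokenizer)."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


# The six documents indexed in the question.
DOCS = {
    "1": "iphone s", "9": "iphone ka", "11": "iphone ka t",
    "15": "iphone 6", "14": "iphone 6 16GB", "3": "iphone k",
}


def match_and(query_tokens):
    """Ids of documents whose indexed tokens contain every query token."""
    return sorted(
        (doc_id for doc_id, title in DOCS.items()
         if set(query_tokens) <= set(edge_ngrams(title))),
        key=int,
    )


print(match_and(lowercase_tokens("iphone 6")))  # ['1', '3', '9', '11', '14', '15']
print(match_and(edge_ngrams("iphone 6")))       # ['14', '15']
```

In this simplified model, searching with the `autocomplete` analyzer narrows the hits to the titles that contain the token 6, which includes "iphone 6 16GB" as well as "iphone 6".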