Elasticsearch synonyms not working as expected

Date: 2016-08-16 07:02:58

Tags: elasticsearch

The text I want to search for is 2 marina blvd, and elasticsearch returns these results (top 3):

2 MARINA GREEN, SINGAPORE 019800
MARINA BAYFRONT 2 RAFFLES LINK, SINGAPORE 039392
THE SAIL @ MARINA BAY 2 MARINA BOULEVARD, SINGAPORE 018987

In my synonym list, blvd is the same as boulevard.

When I search for 2 marina blvd, I expect THE SAIL @ MARINA BAY 2 MARINA BOULEVARD, SINGAPORE 018987 to score highest, since 2 marina boulevard is equivalent to 2 marina blvd. But right now 2 MARINA GREEN, SINGAPORE 019800 is at the top.

What went wrong, and how can I improve the search results?

The complete settings are:

{
  "geolocation": {
    "settings": {
      "index": {
        "creation_date": "1471322099847",
        "analysis": {
          "filter": {
            "my_synonym_filter": {
              "type": "synonym",
              "synonyms": [
                "rd,road",
                "ave,avenue",
                "blvd,boulevard",
                "st,street",
                "lor,lorong",
                "ter,terminal",
                "blk,block",
                "apt,apartment",
                "condo,condominium"
              ]
            }
          },
          "analyzer": {
            "my_synonyms": {
              "filter": [
                "lowercase",
                "my_synonym_filter"
              ],
              "tokenizer": "standard"
            },
            "stopwords_analyzer": {
              "type": "standard",
              "stopwords": [
                "the"
              ]
            },
            "my_ngram_analyzer": {
              "tokenizer": "my_ngram_tokenizer"
            }
          },
          "tokenizer": {
            "my_ngram_tokenizer": {
              "token_chars": [
                "letter",
                "digit"
              ],
              "min_gram": "2",
              "type": "nGram",
              "max_gram": "5"
            }
          }
        },
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "mPfZmWHFQZOHqfAi471nGQ",
        "version": {
          "created": "2030599"
        }
      }
    }
  }
}

This is the query:

body: {
      from : 0, size : 10,
      query: {
        bool: {
          should: [
            {
              match: {
                text: q
              }
            },
            {
              match: {
                text: {
                  query: q,
                  fuzziness: 1,
                  prefix_length: 0,
                  max_expansions: 100
                }
              }
            },
            {
              match: {
                text: {
                  query: q,
                  max_expansions: 300,
                  type: "phrase_prefix"
                }
              }
            }
          ]
        }
      }
    }

The mapping is:

{
  "geolocation": {
    "mappings": {
      "location": {
        "properties": {
          "address": {
            "type": "string"
          },
          "blk": {
            "type": "string"
          },
          "building": {
            "type": "string"
          },
          "location": {
            "type": "geo_point"
          },
          "postalCode": {
            "type": "string"
          },
          "road": {
            "type": "string"
          },
          "searchText": {
            "type": "string"
          },
          "x": {
            "type": "string"
          },
          "y": {
            "type": "string"
          }
        }
      }
    }
  }
}


1 answer:

Answer 0 (score: 1)

You have defined the analyzers, but you have not set an analyzer on any of your fields. The most basic setup would be:

"searchText": {
  "type": "string",
  "analyzer": "my_synonyms"
}
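For context, here is a rough sketch of how the analysis settings and the field mapping might fit together in a single index-creation body, written as a JavaScript object like the query in the question. The name `indexBody` is hypothetical, the synonym list is shortened, and how the body is sent depends on your client. In my experience the analyzer of an existing field cannot be changed in place, so you would most likely need to recreate the index and reindex the documents.

```javascript
// Sketch only: analysis settings plus a mapping that actually uses them.
// `indexBody` is a hypothetical name; the synonym list is shortened.
const indexBody = {
  settings: {
    analysis: {
      filter: {
        my_synonym_filter: {
          type: "synonym",
          synonyms: ["rd,road", "ave,avenue", "blvd,boulevard", "st,street"],
        },
      },
      analyzer: {
        // Same definition as in the question's settings.
        my_synonyms: {
          tokenizer: "standard",
          filter: ["lowercase", "my_synonym_filter"],
        },
      },
    },
  },
  mappings: {
    location: {
      properties: {
        // The field now references the custom analyzer by name.
        searchText: { type: "string", analyzer: "my_synonyms" },
      },
    },
  },
};
```

Note that the analyzer only applies to the field it is set on, so the query has to target that same field for the synonyms to take effect.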

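A field can have one analyzer for index time and another for search time. Most use cases use the same analyzer for both indexing and searching. By default (when you set "analyzer": "whatever_analyzer"), the same analyzer is used at both search time and index time.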
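If you want to spell out both sides explicitly, the mapping could look roughly like the sketch below. `searchTextMapping` is a hypothetical variable name, and `search_analyzer` is the mapping parameter for the search-time analyzer; leaving it out should give the same behaviour here, since it falls back to `analyzer`.

```javascript
// Sketch: the same field with the index-time and search-time analyzers written out.
// `searchTextMapping` is a hypothetical name used only for this example.
const searchTextMapping = {
  searchText: {
    type: "string",
    analyzer: "my_synonyms",        // applied when documents are indexed
    search_analyzer: "my_synonyms", // applied to the query text; optional here
  },
};
```

You would only set `search_analyzer` to something different when the query text really needs to be analyzed differently from the indexed text.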

For a deeper understanding of analysis and what you can do with it, see https://www.elastic.co/guide/en/elasticsearch/guide/2.x/analysis-intro.html
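As a rough illustration of why the analyzer matters for this question, here is a toy model in JavaScript. It is not Lucene and it ignores scoring entirely; `tokenize`, `expandSynonyms` and `matchedTerms` are made-up helpers that only count how many query terms find a matching token in a document, with and without synonym expansion of the indexed text.

```javascript
// Toy model only: a crude stand-in for analysis, not how Elasticsearch scores.
const SYNONYM_GROUPS = [["blvd", "boulevard"], ["rd", "road"], ["st", "street"]];

// Rough stand-in for the standard tokenizer plus the lowercase filter.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Expand each token to every member of its synonym group, if it has one.
function expandSynonyms(tokens) {
  const expanded = new Set();
  for (const token of tokens) {
    const group = SYNONYM_GROUPS.find((g) => g.includes(token));
    for (const t of group || [token]) expanded.add(t);
  }
  return expanded;
}

// Count the distinct query tokens that appear among the document's tokens.
function matchedTerms(query, doc, useSynonyms) {
  const queryTokens = [...new Set(tokenize(query))];
  const docTokens = useSynonyms
    ? expandSynonyms(tokenize(doc))
    : new Set(tokenize(doc));
  return queryTokens.filter((t) => docTokens.has(t)).length;
}

const sail = "THE SAIL @ MARINA BAY 2 MARINA BOULEVARD, SINGAPORE 018987";
const green = "2 MARINA GREEN, SINGAPORE 019800";
console.log(matchedTerms("2 marina blvd", sail, false));  // 2
console.log(matchedTerms("2 marina blvd", green, false)); // 2
console.log(matchedTerms("2 marina blvd", sail, true));   // 3
```

Without synonyms both addresses match only "2" and "marina", so nothing favours the boulevard address. Once the indexed text is expanded, it is the only one that matches all three query terms.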