ElasticSearch _suggest queries are case-sensitive; I want them to be case-insensitive

Asked: 2014-12-17 20:06:21

Tags: elasticsearch

I am currently searching against this endpoint with the following request:

curl -XPOST 'elasticserver.com/citysuggest/_suggest' -d '{
  "result": {
    "text": "Chicago",
    "completion": {
      "field": "autoCompleteName"
    }
  }
}'

Here is my index mapping:

{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "index": {
            "mapper": {
                "dynamic": false
            }
        },
        "analysis": {
            "analyzer": {
                "str_search_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["standard", "str_delimiter", "asciifolding", "porter_stem"]
                },
                "str_index_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["standard", "str_delimiter", "asciifolding", "porter_stem"],
                    "char_filter": "html_strip"
                }
            },
            "filter": {
                "str_delimiter": {
                    "type": "word_delimiter",
                    "generate_word_parts": true,
                    "catenate_words": true,
                    "catenate_numbers": true,
                    "catenate_all": true,
                    "split_on_case_change": true,
                    "preserve_original": true,
                    "split_on_numerics": true,
                    "stem_english_possessive": true
                }
            }
        }
    },
    "mappings": {
        "city": {
            "_source": {
                "enabled": false
            },
            "dynamic": false,
            "properties": {
                "_all": {
                    "enabled": false
                },
                "autoCompleteName": {
                    "type": "completion",
                    "index_analyzer": "str_index_analyzer",
                    "search_analyzer": "str_search_analyzer"
                }
            }
        }
    }
}

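When I search for "Chicago", it returns the expected result because it finds a match for Chicago. However, when I search for "chicago", it returns nothing. I can't for the life of me figure out what I need to change to make the search case-insensitive. If a user types "ChiCAgO", it should return my Chicago result rather than nothing.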

To test my analyzer, I ran this:

elasticserver.com/citysuggest/_analyze?text=ChicaGo&pretty

It returns what looks to me like a correctly tokenized value:

{
  "tokens": [
    {
      "token": "chicago",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

1 Answer:

Answer 0: (score: 2)

You just need to add the lowercase token filter to both of your analyzers.

 "analysis": {
     "analyzer": {
         "str_search_analyzer": {
             "tokenizer": "standard",
             "filter": ["standard", "str_delimiter", "asciifolding", "porter_stem", "lowercase"]
         },
         "str_index_analyzer": {
             "tokenizer": "standard",
             "filter": ["standard", "str_delimiter", "asciifolding", "porter_stem", "lowercase"],
             "char_filter": "html_strip"
         }
     },
     "filter": {
         "str_delimiter": {
             "type": "word_delimiter",
             "generate_word_parts": true,
             "catenate_words": true,
             "catenate_numbers": true,
             "catenate_all": true,
             "split_on_case_change": true,
             "preserve_original": true,
             "split_on_numerics": true,
             "stem_english_possessive": true
         }
     }
 }
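
To see why the lowercase step has to be present on both the index side and the search side, here is a minimal Python sketch. It is a toy model, not Elasticsearch's actual implementation: the functions `analyze` and `suggest` and the sample `cities` list are hypothetical, and the "analyzer" only splits on whitespace and optionally lowercases. It assumes the suggester behaves like a prefix match on the analyzed text.

```python
def analyze(text, lowercase):
    """Toy stand-in for an analyzer: split on whitespace, optionally lowercase."""
    tokens = text.split()
    if lowercase:
        tokens = [t.lower() for t in tokens]
    return " ".join(tokens)


def suggest(indexed_names, query, lowercase):
    """Return the names whose analyzed form starts with the analyzed query."""
    analyzed_query = analyze(query, lowercase)
    return [
        name
        for name in indexed_names
        if analyze(name, lowercase).startswith(analyzed_query)
    ]


# Hypothetical sample data, for illustration only.
cities = ["Chicago", "Chico", "Boston"]

# Without a lowercase step, only an exact-case prefix matches.
print(suggest(cities, "Chicago", lowercase=False))  # ['Chicago']
print(suggest(cities, "ChiCAgO", lowercase=False))  # []

# With lowercasing applied to both the indexed names and the query,
# the case of the user's input no longer matters.
print(suggest(cities, "ChiCAgO", lowercase=True))   # ['Chicago']
print(suggest(cities, "chi", lowercase=True))       # ['Chicago', 'Chico']
```

In this model the indexed form is computed when the name is stored, so names stored before the lowercase step was added would still carry their original case. The same likely applies to the real index: documents indexed before the analyzer change would probably need to be indexed again.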

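Your test case appeared to work because you did not specify an analyzer, so it ran with the default analyzer (which lowercases) rather than your custom ones. Try: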

curl -XGET 'localhost:9200/citysuggest/_analyze?analyzer=str_index_analyzer&text=ChicaGo&pretty'
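
To compare the default analyzer against both custom analyzers in one go, the request URLs can be generated with a small helper. This is only a sketch: `analyze_url` is a hypothetical helper, the host `localhost:9200` and the index name are taken from the command above, and it uses the query-string form of `_analyze` shown in this thread. It only builds the URLs; pass each one to curl or an HTTP client to run it.

```python
from urllib.parse import urlencode


def analyze_url(index, text, analyzer=None, host="localhost:9200"):
    """Build an _analyze URL; leave out `analyzer` to test the default one."""
    params = {}
    if analyzer is not None:
        params["analyzer"] = analyzer
    params["text"] = text
    params["pretty"] = "true"
    return "http://%s/%s/_analyze?%s" % (host, index, urlencode(params))


# One URL per analyzer to compare; None means "no analyzer specified".
for name in (None, "str_index_analyzer", "str_search_analyzer"):
    print(analyze_url("citysuggest", "ChicaGo", analyzer=name))
```

If the token comes back as "chicago" for every URL, the lowercase filter is being applied in both custom analyzers.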