Question

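I am using Elasticsearch and have run into a problem. I would be very grateful for any hints.

I want to analyze a field, "name" or "description", which consists of different entries. For example, someone wants to search for Sara. Whether they enter SARA, SAra, or sara, they should get Sara back. Elasticsearch uses analyzers that lowercase everything.

I want the search to be case-insensitive: regardless of whether the user types the name in upper or lower case, he/she should get the results. I use an ngram filter to search the names together with a lowercase filter, which makes it case-insensitive. But I want to make sure a person gets results whether they type in upper or lower case.

Is there a way to do this in Elasticsearch?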

{"settings": {

        "analysis": {
            "filter": {
                "ngram_filter": {
                    "type": "ngram",
                    "min_gram": 1,
                    "max_gram": 80
                }
            },
            "analyzer": {
                "index_ngram": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": [ "ngram_filter", "lowercase" ]
                },

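I have attached an example.js file, which contains a JSON sample, and a search.txt file to explain my problem. I hope my question is clearer now. This is the OneDrive link where I saved both files: https://1drv.ms/f/s!AsW4Pb3Y55Qjb34OtQI7qQotLzc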

Answer 1

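Is there a particular reason you are using ngram? Elasticsearch applies the same analyzer to the query as to the text you indexed, unless a search_analyzer is explicitly specified, as @Adam mentions in his answer. In your case, a standard tokenizer with a lowercase filter is enough.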
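To illustrate why the same analyzer on both sides gives case-insensitive matching, here is a minimal pure-Python sketch. It does not call Elasticsearch; `custom_analyzer` and `matches` are hypothetical helpers, and the regular expression is only a rough approximation of the standard tokenizer.

```python
import re

def custom_analyzer(text):
    """Rough stand-in for a standard tokenizer followed by a lowercase filter."""
    tokens = re.findall(r"\w+", text)      # split on non-word characters (approximation)
    return [t.lower() for t in tokens]     # lowercase filter

# Terms that would be produced at index time for the name "Sara Connor".
indexed_terms = set(custom_analyzer("Sara Connor"))

def matches(query):
    """The query goes through the same analyzer, so its case no longer matters."""
    return any(term in indexed_terms for term in custom_analyzer(query))

for q in ("SARA", "SAra", "sara"):
    print(q, matches(q))
```

In this sketch all three spellings match, because both the stored name and the query are reduced to the term `sara`. A partial input such as `sar` would not match here; that is the case where ngrams become relevant.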

I created an index with the following settings and mappings:

{
   "settings": {
      "analysis": {
         "analyzer": {
            "custom_analyzer": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "lowercase"
               ]
            }
         }
      }
   },
   "mappings": {
      "test_mapping": {
         "properties": {
            "name": {
               "type": "string",
               "analyzer": "custom_analyzer"
            },
            "description": {
               "type": "string",
               "analyzer": "custom_analyzer"
            }
         }
      }
   }
}

Index two documents:

Doc 1

PUT /test_index/test_mapping/1
    {
        "name" : "Sara Connor",
        "Description" : "My real name is Sarah Connor."
    }

Doc 2

PUT /test_index/test_mapping/2
    {
        "name" : "John Connor",
        "Description" : "I might save humanity someday."
    }

Run a simple search:

POST /test_index/_search
{
    "query" : {
        "match" : {
            "name" : "SARA"
        }
    }
}

Only the first document is returned. I also tried "sara" and "Sara", with the same results.

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.19178301,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_mapping",
        "_id": "1",
        "_score": 0.19178301,
        "_source": {
          "name": "Sara Connor",
          "Description": "My real name is Sarah Connor."
        }
      }
    ]
  }
}

Answer 2

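For full-text (analyzed) fields, analysis is performed twice: first when the data is stored, and again at search time. It is worth mentioning that the input JSON is returned in the search output in the same shape in which it was submitted. The analysis process is only used to create the tokens for the inverted index. The key to the solution is the following steps:

Create two analyzers: one with the ngram filter and a second one without it, because you do not need to run the search input through ngram, since you have an exact value you want to search for.
Define the mapping for your field correctly. The mapping has two settings that let you specify analyzers: one used for storing (analyzer) and a second used for searching (search_analyzer). If you specify only analyzer, that analyzer is used at both index and search time.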
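The two steps above can be sketched in plain Python. This is only an illustration under assumptions, not Elasticsearch code: `tokenize`, `ngrams`, `index_analyzer`, `search_analyzer`, and `search` are hypothetical helpers, the regex only approximates the standard tokenizer, and the gram sizes 1 to 5 are example values.

```python
import re

def tokenize(text):
    # Approximation of the standard tokenizer: runs of letters, digits, underscores.
    return re.findall(r"\w+", text)

def ngrams(token, min_gram=1, max_gram=5):
    # Every substring of the token whose length is between min_gram and max_gram.
    return [token[i:i + n]
            for i in range(len(token))
            for n in range(min_gram, max_gram + 1)
            if i + n <= len(token)]

def index_analyzer(text):
    # Index time ("analyzer"): ngram filter, then lowercase.
    return {gram.lower() for tok in tokenize(text) for gram in ngrams(tok)}

def search_analyzer(text):
    # Search time ("search_analyzer"): tokenize and lowercase only, no ngrams.
    return [tok.lower() for tok in tokenize(text)]

source = {"name": "Sara_11_01"}               # stored as submitted, never rewritten
index_terms = index_analyzer(source["name"])  # only these tokens are lowercased

def search(query):
    hit = any(term in index_terms for term in search_analyzer(query))
    return [source] if hit else []

for q in ("sara", "SARA", "SaRa"):
    print(q, search(q))
```

Each of the three queries finds the document, and the returned document keeps its original capitalization, since only the index tokens were lowercased. One consequence of this design in the sketch: a query token longer than `max_gram` (for example `sara_11`) has no matching gram, so the gram sizes should be chosen with the expected query lengths in mind.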

You can read more about it here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html

Your code should look like this:

PUT /my_index
{
   "settings": {
      "analysis": {
         "filter": {
            "ngram_filter": {
               "type": "ngram",
               "min_gram": 1,
               "max_gram": 5
            }
         },
         "analyzer": {
            "index_store_ngram": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "ngram_filter",
                  "lowercase"
               ]
            }
         }
      }
   },
   "mappings": {
      "my_type": {
         "properties": {
            "name": {
               "type": "string",
               "analyzer": "index_store_ngram",
               "search_analyzer": "standard"
            }
         }
      }
   }
}

POST /my_index/my_type/1
{
     "name": "Sara_11_01"
}

GET /my_index/my_type/_search
{
    "query": {
        "match": {
           "name": "sara"
        }
    }
}

GET /my_index/my_type/_search
{
    "query": {
        "match": {
           "name": "SARA"
        }
    }
}

GET /my_index/my_type/_search
{
    "query": {
        "match": {
           "name": "SaRa"
        }
    }
}

Edit 1: Updated code for the new example provided in the question

Case-insensitive search in Elasticsearch
