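I just watched this video, https://www.youtube.com/watch?v=7FLXjgB0PQI, and have a question about Elasticsearch analyzers. I have read the official documentation and several other articles about analysis and analyzers, and I am a little confused.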
For example, I have the following index configuration:
"settings" : {
  "analysis" : {
    "filter" : {
      "autocomplete" : {
        "type" : "edge_ngram",
        "min_gram" : 1,
        "max_gram" : 20
      }
    },
    "analyzer" : {
      "autocomplete" : {
        "type" : "custom",
        "tokenizer" : "standard",
        "filter" : ["lowercase", "autocomplete"]
      }
    }
  }
},
"mappings" : {
  "user" : {
    "properties" : {
      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {
            "type" : "string",
            "analyzer" : "standard"
          },
          "autocomplete" : {
            "type" : "string",
            "index_analyzer" : "autocomplete",
            "search_analyzer" : "standard"
          }
        }
      }
    }
  }
}
Then I execute two search requests separately. This one:
{
  "match" : {
    "name.autocomplete" : "john smi"
  }
}
and this one:
{
  "match" : {
    "name" : "john smi"
  }
}
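If I understand correctly, I should see the same results, because in both cases ES should use the standard analyzer, but I get different results. Why?
Update
The index contains the following names: "john smith", "johnathan smith".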
Answer 0 (score: 0)
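When I try what you have, I get the same result from both queries, once the needed "wrapping" is added (the outer braces around the index definition and the "query" key around each match clause). First I created an index: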
curl -XPOST "http://localhost:9200/test_index/" -d'
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete"
          ]
        }
      }
    }
  },
  "mappings": {
    "user": {
      "properties": {
        "name": {
          "type": "multi_field",
          "fields": {
            "name": {
              "type": "string",
              "analyzer": "standard"
            },
            "autocomplete": {
              "type": "string",
              "index_analyzer": "autocomplete",
              "search_analyzer": "standard"
            }
          }
        }
      }
    }
  }
}'
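To make the mapping above more concrete, here is a rough Python sketch of the terms each sub-field would store for a name. It is only an approximation under stated assumptions: the standard analyzer is imitated by splitting on non-word characters and lowercasing, and the helper names (`standard_tokens`, `edge_ngrams`, `autocomplete_tokens`) are made up for illustration and are not part of Elasticsearch.

```python
import re


def standard_tokens(text):
    # Assumption: approximate the standard analyzer by splitting on
    # non-word characters and lowercasing each token.
    return [t.lower() for t in re.findall(r"\w+", text)]


def edge_ngrams(token, min_gram=1, max_gram=20):
    # Prefixes of the token, from min_gram up to max_gram characters,
    # mirroring the "autocomplete" edge_ngram filter settings.
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def autocomplete_tokens(text):
    # standard tokenizer -> lowercase -> edge_ngram, as in the
    # custom "autocomplete" analyzer.
    return [g for t in standard_tokens(text) for g in edge_ngrams(t)]


# Terms stored for the "name" sub-field (standard analyzer).
print(standard_tokens("John Smith"))
# ['john', 'smith']

# Terms stored for the "name.autocomplete" sub-field.
print(autocomplete_tokens("John Smith"))
# ['j', 'jo', 'joh', 'john', 's', 'sm', 'smi', 'smit', 'smith']
```

The two sub-fields share a search analyzer, but under this model they hold different indexed terms for the same source text.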
Then I added a document:
curl -XPUT "http://localhost:9200/test_index/user/1" -d'
{
  "name": "John Smith"
}'
The first search returns the document:
curl -XPOST "http://localhost:9200/test_index/user/_search" -d'
{
  "query": {
    "match": {
      "name.autocomplete": "john smith"
    }
  }
}'
...
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2712221,
    "hits": [
      {
        "_index": "test_index",
        "_type": "user",
        "_id": "1",
        "_score": 0.2712221,
        "_source": {
          "name": "John Smith"
        }
      }
    ]
  }
}
And so does the second:
curl -XPOST "http://localhost:9200/test_index/user/_search" -d'
{
  "query": {
    "match": {
      "name": "john smith"
    }
  }
}'
...
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2712221,
    "hits": [
      {
        "_index": "test_index",
        "_type": "user",
        "_id": "1",
        "_score": 0.2712221,
        "_source": {
          "name": "John Smith"
        }
      }
    ]
  }
}
Is there anything else about your setup that differs from what I did here?
Here is the code I used for this problem:
http://sense.qbox.io/gist/4c8299be570c87f1179f70bfd780a7e9f8d40919
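One difference worth checking is the query text and the data: the question searches for "john smi" against two names, while the test above searches for "john smith" against a single document. The Python sketch below imitates the question's scenario without Elasticsearch. It assumes a simplified model in which a `match` query (with its default OR behaviour) hits a document when any analyzed query term equals an indexed term. Scoring is ignored, and all function names are hypothetical helpers, not Elasticsearch APIs.

```python
import re

NAMES = ["john smith", "johnathan smith"]


def standard_tokens(text):
    # Assumption: rough stand-in for the standard analyzer.
    return [t.lower() for t in re.findall(r"\w+", text)]


def autocomplete_tokens(text, min_gram=1, max_gram=20):
    # standard tokenizer -> lowercase -> edge n-grams (prefixes).
    grams = []
    for token in standard_tokens(text):
        longest = min(len(token), max_gram)
        grams.extend(token[:n] for n in range(min_gram, longest + 1))
    return grams


def search(query, index_analyzer, search_analyzer=standard_tokens):
    # A document matches if it shares at least one term with the query.
    query_terms = set(search_analyzer(query))
    return [n for n in NAMES if query_terms & set(index_analyzer(n))]


# "name.autocomplete": indexed with edge n-grams, searched with standard.
print(search("john smi", autocomplete_tokens))
# ['john smith', 'johnathan smith']

# "name": indexed and searched with the standard analyzer.
print(search("john smi", standard_tokens))
# ['john smith']
```

In this model the query is analyzed identically in both cases (terms "john" and "smi"), but only the edge n-gram field contains the prefixes "john" and "smi" for both names, which would account for different results on a partial term such as "smi".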