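Below are my mapping and analyzer settings. Suppose I am indexing "book" records. Several fields on a book record (for example, publishers and tags) are arrays of strings (for example, ["Random House", "macmillan"]), while the field "name" takes a single string such as "Blue".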
{
"state": "open",
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "autocomplete_index",
"creation_date": "1509080632268",
"analysis": {
"filter": {
"edge_ngram": {
"token_chars": [
"letter",
"digit"
],
"min_gram": "1",
"type": "edgeNGram",
"max_gram": "15"
},
"english_stemmer": {
"name": "possessive_english",
"type": "stemmer"
}
},
"analyzer": {
"keyword_analyzer": {
"filter": [
"lowercase",
"english_stemmer"
],
"type": "custom",
"tokenizer": "standard"
},
"autocomplete_analyzer": {
"filter": [
"lowercase",
"asciifolding",
"english_stemmer",
"edge_ngram"
],
"type": "custom",
"tokenizer": "standard"
}
}
},
"number_of_replicas": "1",
"uuid": "SSTzdTNFStaSiIBu-l3q5w",
"version": {
"created": "5060299"
}
}
},
"mappings": {
"autocomplete_mapping": {
"properties": {
"publishers": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"tags": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
},
"aliases": [],
"primary_terms": {
"0": 1,
"1": 1,
"2": 1,
"3": 1,
"4": 1
},
"in_sync_allocations": {
"0": [
"GXwYiYuWQ16wgxCrpXShJQ"
],
"1": [
"Do_49lZ4QmyNEYUK_QJfEQ"
],
"2": [
"vWZ_PjsLSGSVh130C5EvYQ"
],
"3": [
"5CLINaFJQbqVcZLVOsSNWQ"
],
"4": [
"hy3JYfmuR7e8fc-anu-heA"
]
}
}
If I run a query such as:
curl -XGET 'localhost:9200/autocomplete_index/_search?size=5' -d '
{
"query" : {
"multi_match" : {
"query": "b",
"analyzer": "keyword",
"fields": ["_all"]
}
}
}'
I get 0 results. I have to enter the complete word "blue" in the query field to get a match.
Also, when I run "_analyze", I get:
curl -XGET 'localhost:9200/products_autocomplete_dev/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
"analyzer": "autocomplete_analyzer",
"field": "name",
"text": "b"
}
'
{
"tokens" : [
{
"token" : "b",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<ALPHANUM>",
"position" : 0
}
]
}
I expected to get at least tokens such as "b", "bl", "blu" and "blue".
Here is a sample document from the index:
{
"_index" : "autocomplete_index",
"_type" : "autocomplete_mapping",
"_id" : "145",
"_version" : 1,
"found" : true,
"_source" : {
"name": "Blue",
"publishers" : [
"macmillan",
"Penguin"
],
"themes" : [
"Butterflies", "Mammals"
]
}
}
What am I doing wrong?
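Answer 0 (score: 0)
There are so many things wrong here that I suggest you read the documentation on analyzers carefully. I hope you don't mind me saying so.
First, if you want to test an analyzer, don't also specify a field name; specify only the text and the analyzer itself: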
GET /my_index/_analyze?pretty
{
"analyzer": "autocomplete_analyzer",
"text": "blue"
}
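To see why analyzing the text "b" can only ever return the single token "b", here is a minimal Python sketch of what an edge n-gram filter with min_gram 1 and max_gram 15 emits. It is a rough stand-in, not Elasticsearch code: the function names are hypothetical, whitespace splitting replaces the standard tokenizer, and the asciifolding and stemmer filters are left out.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams of a single token."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_analyze(text):
    """Approximate the analyzer chain: split, lowercase, edge n-grams.

    Assumption: whitespace splitting stands in for the standard tokenizer,
    and asciifolding / stemming are omitted for brevity.
    """
    tokens = []
    for word in text.lower().split():
        tokens.extend(edge_ngrams(word))
    return tokens


print(autocomplete_analyze("blue"))  # ['b', 'bl', 'blu', 'blue']
print(autocomplete_analyze("b"))     # ['b']
```

The prefixes "b", "bl", "blu" and "blue" only appear when the input text is "blue", which is why the test request should analyze the full word.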
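If you define a custom analyzer, how is Elasticsearch supposed to know that a particular field uses it? Defining an analyzer is not the same as assigning it to a specific field. So: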
"name": {
"type": "text",
--> "analyzer": "autocomplete_analyzer",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
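The effect of assigning the analyzer to the field can be illustrated with a toy matching model in Python. This is only a sketch under simplifying assumptions: the helper names are hypothetical, whitespace splitting stands in for the standard tokenizer, and a match is modelled as the query token being present among the indexed tokens.

```python
def standard_like(text):
    """Stand-in for the default analyzer: split and lowercase."""
    return text.lower().split()


def autocomplete_like(text):
    """Stand-in for autocomplete_analyzer: lowercase plus edge n-grams (1..15)."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(word[:n] for n in range(1, min(len(word), 15) + 1))
    return tokens


def keyword_like(text):
    """Stand-in for the built-in keyword analyzer: the whole input as one token."""
    return [text]


def matches(query, field_value, index_analyzer, search_analyzer):
    """True if any query token is among the tokens stored for the field."""
    indexed = set(index_analyzer(field_value))
    return any(token in indexed for token in search_analyzer(query))


# Field indexed with the default analyzer: only the full word is stored.
print(matches("b", "Blue", standard_like, keyword_like))      # False
print(matches("blue", "Blue", standard_like, keyword_like))   # True
# Field indexed with the edge n-gram analyzer: prefixes are stored too.
print(matches("b", "Blue", autocomplete_like, keyword_like))  # True
```

In this model the query "b" finds nothing until the field is indexed with the edge n-gram analyzer, which mirrors the behaviour described in the question.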
The same goes for the _all field: by default it uses the standard analyzer, and it will keep using it unless you change it:
"mappings": {
"autocomplete_mapping": {
"_all": {
"analyzer": "autocomplete_analyzer"
},
"properties": {
"publishers": {
"type": "text",
"fields": {
.....