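I have the following documents:
south africa
north africa
I want to retrieve my "south africa" document with any of the following queries:
(a) s africa
(b) southafrica
(c) safrica
I have defined the following filters and analyzers: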
POST test_index
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "south,s",
            "north,n"
          ]
        },
        "shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 3,
          "token_separator": ""
        }
      },
      "analyzer": {
        "my_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["shingle_filter"]
        },
        "my_shingle_synonym": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["shingle_filter", "synonym_filter"]
        },
        "my_synonym_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["synonym_filter", "shingle_filter"]
        }
      }
    }
  },
  "mappings": {}
}
1) With my_shingle, south africa is indexed as: south, southafrica, africa
2) With my_shingle_synonym, south africa is indexed as: south, s, southafrica, africa
3) With my_synonym_shingle, south africa is indexed as: south, souths, southsafrica, s, {{1 }}, safrica
So with:
(1) I'll find b
(2) I'll find a, b
(3) I'll find a, c
I would like south africa to be indexed as: south, s, southafrica, safrica, africa
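To make the shingle step concrete, here is a minimal Python sketch of what a shingle filter with min_shingle_size 2, max_shingle_size 3 and an empty token_separator does to a flat token stream. The function name shingles is made up for illustration, and this toy model ignores synonyms and token positions, so it only mirrors case 1 above, not the combined synonym cases.

```python
def shingles(tokens, min_size=2, max_size=3, separator=""):
    """Toy model of a shingle filter that also keeps the original unigrams."""
    out = []
    for i, token in enumerate(tokens):
        out.append(token)  # the unigram itself
        for size in range(min_size, max_size + 1):
            # Only emit a shingle if enough tokens remain to fill it.
            if i + size <= len(tokens):
                out.append(separator.join(tokens[i:i + size]))
    return out


print(shingles(["south", "africa"]))  # ['south', 'southafrica', 'africa']
```

Because the separator is empty, the two-word shingle collapses into southafrica, which is why query (b) matches under my_shingle.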
Answer 0 (score: 1)
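You don't necessarily have to emit every possible token from a single analyzer to meet your requirement. You can solve the problem by using different analyzers on multi fields.
You can define the mapping of the desired field like this: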
"mappings": {
  "your_mapping": {
    "properties": {
      "name": {
        "type": "string",
        "analyzer": "my_shingle",
        "fields": {
          "synonym": {
            "type": "string",
            "analyzer": "my_synonym_shingle"
          }
        }
      }
    }
  }
}
Index a sample document:
PUT test_index/your_mapping/1
{
  "name": "south africa"
}
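Then you can use a wildcard expression in the field list to query all variations of the name field (name* matches both name and name.synonym):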
GET test_index/your_mapping/_search
{
  "query": {
    "query_string": {
      "fields": [
        "name*"
      ],
      "query": "safrica"
    }
  }
}
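Why this covers the missing cases can be sketched in a few lines of Python. This is only a toy model, not how Elasticsearch executes the query: the token sets are copied from the lists in the question (only the tokens that are legible there), and matching_fields is a hypothetical helper. It assumes that name* expands to name and name.synonym, and that a single-word query is looked up as one term in each field.

```python
# Tokens per field for the document "south africa", as listed in the question.
INDEXED = {
    # name is analyzed with my_shingle
    "name": {"south", "southafrica", "africa"},
    # name.synonym is analyzed with my_synonym_shingle
    "name.synonym": {"south", "souths", "southsafrica", "s", "safrica"},
}


def matching_fields(term):
    """Return the fields whose indexed tokens contain the query term."""
    return sorted(field for field, tokens in INDEXED.items() if term in tokens)


for term in ("southafrica", "safrica"):
    print(term, "->", matching_fields(term))
```

Under these assumptions southafrica is found through name and safrica through name.synonym, so searching both fields together covers queries (b) and (c) without needing one analyzer that emits every token.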