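I am trying to use features such as nGrams and synonyms, but I am having no luck.
I followed this blog post. I tried adapting the mapping and query to my data, but it only matches exact terms. I also tried the exact data from the post, as given in this gist, with the same result.
Here is the mapping: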
{
"mappings": {
"item": {
"properties": {
"productName": {
"fields": {
"partial": {
"search_analyzer":"full_name",
"index_analyzer":"partial_name",
"type":"string"
},
"partial_back": {
"search_analyzer":"full_name",
"index_analyzer":"partial_name_back",
"type":"string"
},
"partial_middle": {
"search_analyzer":"full_name",
"index_analyzer":"partial_middle_name",
"type":"string"
},
"productName": {
"type":"string",
"analyzer":"full_name"
}
},
"type":"multi_field"
},
"productID": {
"type":"string",
"analyzer":"simple"
},
"warehouse": {
"type":"string",
"analyzer":"simple"
},
"vendor": {
"type":"string",
"analyzer":"simple"
},
"productDescription": {
"type":"string",
"analyzer":"full_name"
},
"categories": {
"type":"string",
"analyzer":"simple"
},
"stockLevel": {
"type":"integer",
"index":"not_analyzed"
},
"cost": {
"type":"float",
"index":"not_analyzed"
}
}
},
"settings": {
"analysis": {
"filter": {
"name_ngrams": {
"side":"front",
"max_gram":50,
"min_gram":2,
"type":"edgeNGram"
},
"name_ngrams_back": {
"side":"back",
"max_gram":50,
"min_gram":2,
"type":"edgeNGram"
},
"name_middle_ngrams": {
"type":"nGram",
"max_gram":50,
"min_gram":2
}
},
"analyzer": {
"full_name": {
"filter":[
"standard",
"lowercase",
"asciifolding"
],
"type":"custom",
"tokenizer":"standard"
},
"partial_name": {
"filter":[
"standard",
"lowercase",
"asciifolding",
"name_ngrams"
],
"type":"custom",
"tokenizer":"standard"
},
"partial_name_back": {
"filter":[
"standard",
"lowercase",
"asciifolding",
"name_ngrams_back"
],
"type":"custom",
"tokenizer":"standard"
},
"partial_middle_name": {
"filter":[
"standard",
"lowercase",
"asciifolding",
"name_middle_ngrams"
],
"type":"custom",
"tokenizer":"standard"
}
}
}
}
}
}
The search query (I removed the filters to try to return more results):
{
"size":20,
"from":0,
"sort":[
"_score"
],
"query": {
"bool": {
"should":[
{
"text": {
"productName": {
"boost":5,
"query":"test query",
"type":"phrase"
}
}
},
{
"text": {
"productName.partial": {
"boost":1,
"query":"test query"
}
}
},
{
"text": {
"productName.partial_middle": {
"boost":1,
"query":"test query"
}
}
},
{
"text": {
"productName.partial_back": {
"boost":1,
"query":"test query"
}
}
}
]
}
}
}
Using the above query from the gist, if I remove the following clause from the first bool query:
"text":{
"productName":{
"boost":5,
"query":"test query",
"type":"phrase"
}
}
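so that it cannot return direct matches, I still get no results, no matter what my search term is.
I think I am missing something obvious, and I don't know what other information would be relevant, so please go easy on me.
Answer 0 (score: 5)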
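It looks like I found the answer to my own problem: I had been blindly copying and pasting. The blog post I linked appears to be outdated, and its JSON commands no longer work as written (yet no error is thrown when they are sent).
Here is the code I used to create the index: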
{
"settings": {
"analysis": {
"filter": {
"name_ngrams": {
"side":"front",
"max_gram":50,
"min_gram":2,
"type":"edgeNGram"
},
"name_ngrams_back": {
"side":"back",
"max_gram":50,
"min_gram":2,
"type":"edgeNGram"
},
"name_middle_ngrams": {
"type":"nGram",
"max_gram":50,
"min_gram":2
}
},
"analyzer": {
"full_name": {
"filter":[
"standard",
"lowercase",
"asciifolding"
],
"type":"custom",
"tokenizer":"standard"
},
"partial_name": {
"filter":[
"standard",
"lowercase",
"asciifolding",
"name_ngrams"
],
"type":"custom",
"tokenizer":"standard"
},
"partial_name_back": {
"filter":[
"standard",
"lowercase",
"asciifolding",
"name_ngrams_back"
],
"type":"custom",
"tokenizer":"standard"
},
"partial_middle_name": {
"filter":[
"standard",
"lowercase",
"asciifolding",
"name_middle_ngrams"
],
"type":"custom",
"tokenizer":"standard"
}
}
}
},
"mappings" : {
"product": {
"properties": {
"productName": {
"fields": {
"partial": {
"search_analyzer":"full_name",
"index_analyzer":"partial_name",
"type":"string"
},
"partial_back": {
"search_analyzer":"full_name",
"index_analyzer":"partial_name_back",
"type":"string"
},
"partial_middle": {
"search_analyzer":"full_name",
"index_analyzer":"partial_middle_name",
"type":"string"
},
"productName": {
"type":"string",
"analyzer":"full_name"
}
},
"type":"multi_field"
},
"productID": {
"type":"string",
"analyzer":"simple"
},
"warehouse": {
"type":"string",
"analyzer":"simple"
},
"vendor": {
"type":"string",
"analyzer":"simple"
},
"productDescription": {
"type":"string",
"analyzer":"full_name"
},
"categories": {
"type":"string",
"analyzer":"simple"
},
"stockLevel": {
"type":"integer",
"index":"not_analyzed"
},
"cost": {
"type":"float",
"index":"not_analyzed"
}
}
}
}
}
Here is the code I used to insert test records (I ran it 3 times with slightly modified data):
{
"productName": "Thingey",
"productID": "asdfasef9816",
"warehouse": "usa",
"vendor": "Cool Things Inc",
"productDescription": "This is a cool gizmo",
"categories": "Cool Things",
"stockLevel": 6,
"cost": 15.31
}
And finally, the JSON for the search query:
{
"size":20,
"from":0,
"sort":[
"_score"
],
"query": {
"bool": {
"should":[
{
"text": {
"productName.partial": {
"boost":1,
"query":"ing"
}
}
},
{
"text": {
"productName.partial_middle": {
"boost":1,
"query":"ing"
}
}
},
{
"text": {
"productName.partial_back": {
"boost":1,
"query":"ing"
}
}
}
]
}
}
}
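To see why "ing" finds "Thingey" once the analyzers are actually applied, it can help to list the grams each sub-field would index. The following is a minimal pure-Python sketch, not ElasticSearch's own implementation: the helper names `edge_ngrams` and `ngrams` are hypothetical, and it only imitates the `min_gram` of 2 and `max_gram` of 50 from the filters above on a single lowercased token.

```python
def edge_ngrams(token, min_gram=2, max_gram=50, side="front"):
    """Imitate an edge n-gram filter: prefixes (front) or suffixes (back)."""
    upper = min(max_gram, len(token))
    if side == "front":
        return [token[:n] for n in range(min_gram, upper + 1)]
    return [token[-n:] for n in range(min_gram, upper + 1)]


def ngrams(token, min_gram=2, max_gram=50):
    """Imitate an n-gram filter: every substring within the length bounds."""
    upper = min(max_gram, len(token))
    return [
        token[i:i + n]
        for n in range(min_gram, upper + 1)
        for i in range(len(token) - n + 1)
    ]


token = "thingey"  # "Thingey" after the lowercase filter
query = "ing"      # the search side uses full_name, so the term is not split into grams

# A sub-field matches when the query term equals one of its indexed grams.
matches = {
    "productName.partial": query in edge_ngrams(token, side="front"),
    "productName.partial_back": query in edge_ngrams(token, side="back"),
    "productName.partial_middle": query in ngrams(token),
}
print(matches)
# {'productName.partial': False, 'productName.partial_back': False, 'productName.partial_middle': True}
```

Under these assumptions only the partial_middle clause matches "ing"; a query such as "thi" would match the front edge grams, and "gey" the back ones.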
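The key change I had to make was moving the settings out of the mapping PUT and into index creation. I also moved the initial mapping definition there, but it could instead be created with the regular /index/item/_mapping PUT.
If any ElasticSearch pros want to expand on this for future readers of this question, please do.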
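As a rough sketch of that change, the snippet below builds an index-creation request whose body carries "settings" and "mappings" as top-level siblings, rather than nesting the settings under the mapping. The function name `build_create_index_request`, the node address `http://localhost:9200`, and the index name `products` are assumptions for illustration, and the analysis section is trimmed to a single filter and analyzer; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_create_index_request(base_url, index_name, settings, mappings):
    """Build (but do not send) a PUT that creates an index with settings and mappings."""
    # Settings and mappings are top-level siblings in the creation body.
    body = {"settings": settings, "mappings": mappings}
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Trimmed-down analysis settings taken from the index definition above.
settings = {
    "analysis": {
        "filter": {
            "name_middle_ngrams": {"type": "nGram", "max_gram": 50, "min_gram": 2}
        },
        "analyzer": {
            "partial_middle_name": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["standard", "lowercase", "asciifolding", "name_middle_ngrams"],
            }
        },
    }
}

# Only one field is shown; the full mapping is the one listed above.
mappings = {
    "product": {
        "properties": {
            "productID": {"type": "string", "analyzer": "simple"}
        }
    }
}

# Hypothetical node address and index name.
request = build_create_index_request("http://localhost:9200", "products", settings, mappings)
print(request.get_method(), request.full_url)
```

Against a running node, the request could then be passed to `urllib.request.urlopen(request)`; any other HTTP client sending the same body in the index-creation PUT should behave the same way.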