I am implementing a name search where the possible fields are first_name, last_name, and middle_initial. Queries are usually given last name first: for example, "Smith, A" is looking for "Smith, Ashley", not "A Smith".

My results are scored in an order that does not make sense (Angela and Alex should rank above Robert and Ted).

I have tried many things on both the index side and the query side, and I have to allow a lot of fuzziness (misspellings and phonetic matches). A cross_fields query with an n-gram analyzer plus some fuzziness covers most of my needs, except for this case.

Edit: the results are sorted by _score, so I cannot sort them any other way.

Example query, where I was trying to see whether indexing first name and middle initial together makes a difference:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial^5",
              "last_name^10"
            ]
          }
        },
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial.phonetic^2",
              "last_name.phonetic^5"
            ]
          }
        },
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial.analyzed^2",
              "last_name.analyzed^10"
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "last_name.word_start": {
                    "query": "smith, a",
                    "boost": 10,
                    "operator": "and",
                    "analyzer": "searchkick_word_search"
                  }
                }
              },
              {
                "match": {
                  "last_name.word_start": {
                    "query": "smith, a",
                    "boost": 5,
                    "operator": "and",
                    "analyzer": "searchkick_word_search",
                    "fuzziness": 1,
                    "prefix_length": 0,
                    "max_expansions": 3,
                    "fuzzy_transpositions": true
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "first_name_middle_initial.word_start": {
                    "query": "smith, a",
                    "boost": 10,
                    "operator": "and",
                    "analyzer": "searchkick_word_search"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

I am also using boosts to try to drown out anything that matches on the middle initial, even when the middle initial is not part of my query or of the fields I reference (for example, this one is just {{1}}). I cannot ignore the middle initial entirely, in case it is the differentiating field.
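Answer 0 (score: 1)

Well, one of my problems may have been a stale index. Beyond that, the key seems to be using an ngram analyzer for one of my cross_fields matches, and making sure middle_initial is treated completely separately (somewhat like a tie-breaker). Putting it inside a bool subquery is intentional: I do not want it, or the other subqueries in that clause, to carry the same weight as the cross_fields matches, as described in {{3 }}.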
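
To illustrate why an ngram-from-the-start-of-the-word analyzer helps with a query like "smith, a", here is a rough Python sketch of what an edge n-gram filter produces. It is not Searchkick's or Elasticsearch's implementation; the function names and the min_gram/max_gram defaults are assumptions made up for this example.

```python
from __future__ import annotations


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 20) -> list[str]:
    """Return the lowercased prefixes of a token, mimicking an edge n-gram filter.

    min_gram and max_gram are assumed values, not the real analyzer settings.
    """
    token = token.lower()
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def prefix_matches(query_token: str, indexed_value: str) -> bool:
    """True if the query token equals one of the prefixes indexed for the value."""
    return query_token.lower() in edge_ngrams(indexed_value)


if __name__ == "__main__":
    print(edge_ngrams("Ashley"))
    print(prefix_matches("a", "Ashley"))
    print(prefix_matches("a", "Robert"))
```

Because every prefix of "Ashley" is indexed as its own term, the single-letter token "a" can match the first name directly, while it cannot match "Robert" or "Ted" on that field.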
Here is what finally solved my problem:

Index mapping:
{
  <snip>
      "first_name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "dbl_metaphone"
          },
          "word_start": {
            "type": "text",
            "analyzer": "searchkick_word_start_index" // includes "lowercase", "asciifolding", "searchkick_edge_ngram" (ngram from the start of the word)
          }
        }
      },
      <snip>
      "last_name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "dbl_metaphone"
          },
          "word_start": {
            "type": "text",
            "analyzer": "searchkick_word_start_index"
          }
        }
      },
      "middle_initial": {
        "type": "keyword",
        "fields": {
          "analyzed": {
            "type": "text",
            "analyzer": "searchkick_index" // includes lowercase, asciifolding, shingles, stemmer
          }
        },
        "ignore_above": 30000
      },
      <snip>
    }
  }
}
Query:
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "smith, s",
            "type": "cross_fields",
            "fields": [
              "first_name^2",
              "last_name^3"
            ],
            "tie_breaker": 0.3
          }
        },
        {
          "multi_match": {
            "query": "smith, s",
            "type": "cross_fields",
            "fields": [
              "first_name.phonetic",
              "last_name.phonetic"
            ],
            "tie_breaker": 0.3
          }
        },
        {
          "multi_match": {
            "query": "smith, s",
            "type": "cross_fields",
            "fields": [
              "first_name.word_start",
              "last_name.word_start^2"
            ],
            "tie_breaker": 0.3
          }
        },
        {
          "bool": {
            "should": [
              <snip subquery for another field>
              {
                "match": {
                  "middle_initial.analyzed": {
                    "query": "s",
                    "operator": "and"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
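
For anyone generating this body from application code, here is a minimal Python sketch that assembles the same shape of query as a plain dict, with the should clauses as one flat list of query objects. The function name build_name_query and its parameters are hypothetical, the snipped subquery for the other field is left out, and nothing here calls Elasticsearch; it only builds the JSON.

```python
import json


def build_name_query(text: str, middle_initial: str) -> dict:
    """Build a search body shaped like the query above (a sketch, not the original code)."""

    def cross_fields(fields: list) -> dict:
        # One cross_fields clause per analysis chain: plain, phonetic, word_start.
        return {
            "multi_match": {
                "query": text,
                "type": "cross_fields",
                "fields": fields,
                "tie_breaker": 0.3,
            }
        }

    # The middle initial lives in its own nested bool so that it is scored
    # separately from the cross_fields clauses instead of as one of them.
    middle_initial_clause = {
        "bool": {
            "should": [
                {
                    "match": {
                        "middle_initial.analyzed": {
                            "query": middle_initial,
                            "operator": "and",
                        }
                    }
                }
            ]
        }
    }

    return {
        "query": {
            "bool": {
                "should": [
                    cross_fields(["first_name^2", "last_name^3"]),
                    cross_fields(["first_name.phonetic", "last_name.phonetic"]),
                    cross_fields(["first_name.word_start", "last_name.word_start^2"]),
                    middle_initial_clause,
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_name_query("smith, s", "s"), indent=2))
```

The printed JSON can be sent as the body of a GET /_search request. How the middle initial is extracted from the user's input is not covered here and would depend on the application.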