Consider the following query:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "match": {
      "last_name": "Smith"
    }
  }
}'
Result:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing on the weekends.",
          "interests": [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}
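Now, when I run the following query:
curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{
  "query": {
    "fuzzy": {
      "last_name": {
        "value": "Smitt",
        "fuzziness": 1
      }
    }
  }
}'
it returns NO results, even though the Levenshtein distance between "Smith" and "Smitt" is 1. I get the same outcome with the value "Smit". If I set fuzziness to 2, I do get results. What am I missing here?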
Answer 0 (score: 1)
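I assume the last_name field you are querying is an analyzed string. The indexed term will be smith, not Smith.
"Although the Levenshtein distance between 'Smith' and 'Smitt' is 1."
The fuzzy query does not analyze the search term, so in practice your Levenshtein distance is not 1 but 2. "Smitt" is compared with the indexed term "smith", which differs in the first letter (S vs. s) and in the last letter (t vs. h).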
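To make the distance argument concrete, here is a minimal sketch in Python. It is not Elasticsearch code. The levenshtein helper is a hypothetical illustration of the plain edit distance, and the call to lower() only stands in for the assumption that the analyzer lowercases terms at index time. Elasticsearch's fuzzy query also counts a transposition of two adjacent characters as a single edit by default, which makes no difference for these inputs.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions cost 1 each."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


# Assumption: the analyzed field stores the lowercased term.
indexed_term = "Smith".lower()

print(levenshtein("Smitt", "Smith"))       # what the question expected
print(levenshtein("Smitt", indexed_term))  # what the fuzzy query actually compares
print(levenshtein("Smit", indexed_term))
print(levenshtein("smitt", indexed_term))  # lowercased query term
```

Under these assumptions the comparison against the lowercased term needs two edits for both "Smitt" and "Smit", which is consistent with the query matching only at fuzziness 2, while a lowercased query term would be within one edit.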
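Try this mapping, and your query with fuzziness = 1 will work (the mapping of an existing field cannot be changed in place, so the index has to be recreated and the documents reindexed):
PUT /megacorp/employee/_mapping
{
  "employee": {
    "properties": {
      "last_name": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}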
Hope this helps.