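I can't get highlighting to work on a search_as_you_type field by following this guide: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-as-you-type.html
Below is a series of commands that reproduces what I'm seeing. Hopefully someone can weigh in on what I'm missing :)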
PUT /test_index
{
  "mappings": {
    "properties": {
      "plain_text": {
        "type": "search_as_you_type",
        "index_options": "offsets",
        "term_vector": "with_positions_offsets"
      }
    }
  }
}

POST /test_index/_doc
{
  "plain_text": "This is some random text"
}

GET /test_index/_search
{
  "query": {
    "multi_match": {
      "query": "rand",
      "type": "bool_prefix",
      "fields": [
        "plain_text",
        "plain_text._2gram",
        "plain_text._3gram",
        "plain_text._index_prefix"
      ]
    }
  },
  "highlight": {
    "fields": [
      {
        "plain_text": {
          "number_of_fragments": 1,
          "no_match_size": 100
        }
      }
    ]
  }
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index",
        "_type" : "_doc",
        "_id" : "rLZkjm8BDC17cLikXRbY",
        "_score" : 1.0,
        "_source" : {
          "plain_text" : "This is some random text"
        },
        "highlight" : {
          "plain_text" : [
            "This is some random text"
          ]
        }
      }
    ]
  }
}

The response I get back doesn't contain the highlighting I expect.
Conceptually, the highlight I'm after is: This is some <em>ran</em>dom text
Answer 0 (score: 3)
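To highlight n-grams (character sequences), you need a custom ngram tokenizer, a custom analyzer built on it, and a sub-field that uses that analyzer. By default the maximum allowed difference between min_gram and max_gram is 1, so in my example highlighting only works for search terms of length 3 or 4. You can change this and generate more n-grams by setting index.max_ngram_diff to a higher value. Here is the configuration: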
{
  "settings": {
    "analysis": {
      "analyzer": {
        "partial_words": {
          "type": "custom",
          "tokenizer": "ngrams",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "ngrams": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 4
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "plain_text": {
        "type": "text",
        "fields": {
          "shingles": {
            "type": "search_as_you_type"
          },
          "ngrams": {
            "type": "text",
            "analyzer": "partial_words",
            "search_analyzer": "standard",
            "term_vector": "with_positions_offsets"
          }
        }
      }
    }
  }
}

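To check which tokens the partial_words analyzer actually emits, the _analyze API can be run against the index. This is only a minimal sketch: it assumes the configuration above was used to create an index named test_index (the name comes from the result shown in this answer; the create request itself is not shown). With min_gram 3 and max_gram 4, the text random should be split into 3- and 4-character grams such as ran and rand, which is why a 4-character search term like rand can be matched and highlighted on the ngrams sub-field.

```console
# Assumption: the index was created as test_index with the settings above.
# "random" is just a sample word taken from the indexed document.
POST /test_index/_analyze
{
  "analyzer": "partial_words",
  "text": "random"
}
```

Each token in the response carries start_offset and end_offset values, which are the character positions the highlighter relies on when it wraps a match in <em> tags.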
Query:
{
  "query": {
    "multi_match": {
      "query": "rand",
      "type": "bool_prefix",
      "fields": [
        "plain_text.shingles",
        "plain_text.shingles._2gram",
        "plain_text.shingles._3gram",
        "plain_text.shingles._index_prefix",
        "plain_text.ngrams"
      ]
    }
  },
  "highlight": {
    "fields": [
      {
        "plain_text.ngrams": { }
      }
    ]
  }
}

And the result:
"hits": [
  {
    "_index": "test_index",
    "_type": "_doc",
    "_id": "FkHLVHABd_SGa-E-2FKI",
    "_score": 2,
    "_source": {
      "plain_text": "This is some random text"
    },
    "highlight": {
      "plain_text.ngrams": [
        "This is some <em>rand</em>om text"
      ]
    }
  }
]

Note: in some cases this configuration can be expensive in terms of memory usage and storage.
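
If search terms shorter than 3 or longer than 4 characters also need highlighting, the gram range can be widened by raising index.max_ngram_diff, as mentioned above. The following is a hedged sketch rather than a recommended setup: the index name test_index_wide and the values 2, 10 and 8 are illustrative assumptions, and the mappings block from the configuration above would be added unchanged. A wider range produces more tokens per word, so the memory and storage cost grows accordingly.

```console
# Hypothetical variant: index name and gram sizes are illustrative only.
# max_ngram_diff must be at least max_gram - min_gram (10 - 2 = 8 here).
PUT /test_index_wide
{
  "settings": {
    "index": {
      "max_ngram_diff": 8
    },
    "analysis": {
      "analyzer": {
        "partial_words": {
          "type": "custom",
          "tokenizer": "ngrams",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "ngrams": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 10
        }
      }
    }
  }
}
```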