I have an HTML string in my content field, for example:
"content": "<h3><a href=\"http://blog.local/page/%D8%A2%D8%B2%D8%A7%D8%AF\">The Matrix has you </a></h3>follow the white rabbit."
I use "fragment_size": 150
to control the size, in characters, of the highlighted fragments, but the highlighter returns a substring with broken HTML markup:
"highlight": {
  "content": [
    "/%D8%A2%D8%B2%D8%A7%D8%AF\">The <em>Matrix</em> has"
  ]
}
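To illustrate why the fragment starts in the middle of the link, here is a rough Python sketch. It is not Elasticsearch's actual fragmenter: `naive_fragment` and its `before`/`after` window sizes are hypothetical, chosen only to reproduce the fragment above. The point is that a plain character window around the match knows nothing about tag boundaries.

```python
# Hypothetical illustration only: a character window around the match,
# not the algorithm Elasticsearch uses internally.
content = ('<h3><a href="http://blog.local/page/%D8%A2%D8%B2%D8%A7%D8%AF">'
           'The Matrix has you </a></h3>follow the white rabbit.')


def naive_fragment(text, term, before, after):
    """Cut a character window around the first case-insensitive match of term."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        return None
    start = max(0, pos - before)
    end = min(len(text), pos + len(term) + after)
    return text[start:end]


# Window sizes are assumptions picked to mirror the fragment in the question.
fragment = naive_fragment(content, "matrix", before=31, after=4)
print(fragment)
```

The window opens inside the `href` attribute, so the opening `<a` is lost while the closing `">` survives, which is the broken markup shown above.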
How can I fix this in the JSON-based query DSL?
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "matrix",
          "fields": ["title", "content"]
        }
      },
      "filter": {
        "term": { "content_type": "page" }
      }
    }
  },
  "highlight": {
    "order": "score",
    "fields": {
      "content": { "fragment_size": 150, "number_of_fragments": 3 }
    }
  }
}
Here is a sample response:
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.98773545,
    "hits": [
      {
        "_index": "myindex",
        "_type": "post",
        "_id": "101",
        "_score": 0.024953224,
        "_source": {
          "ID": 101,
          "content_type": "page",
          "date": "1999-02-18 14:32:21",
          "title": "Wake up, Neo",
          "content": "<h3><a href=\"http://blog.local/page/%D8%A2%D8%B2%D8%A7%D8%AF\">The Matrix has you </a></h3>follow the white rabbit."
        },
        "highlight": {
          "content": [
            "/%D8%A2%D8%B2%D8%A7%D8%AF\">the <em>matrix</em> has"
          ]
        }
      }
    ]
  }
}
Answer 0: (score: 0)
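I haven't tried this, but I think you should set "encoder": "html" in the highlight section. With that setting the snippet text is HTML-escaped before the highlight tags are inserted, so leftover markup in a fragment is returned as escaped text rather than as a half-open tag.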
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "matrix",
          "fields": ["title", "content"]
        }
      },
      "filter": {
        "term": { "content_type": "page" }
      }
    }
  },
  "highlight": {
    "order": "score",
    "fields": {
      "content": { "fragment_size": 150, "number_of_fragments": 3 }
    },
    "encoder": "html"
  }
}
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-highlighting.html
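As a rough idea of what the html encoder should do to the fragment from the question, here is a small Python sketch. It is only an approximation under my own assumptions: `highlight_escaped` is a hypothetical helper, and Python's `html.escape` stands in for whatever escaping Elasticsearch applies, so the exact entities in a real response may differ.

```python
import html

# The broken fragment reported in the question, before highlighting.
fragment = '/%D8%A2%D8%B2%D8%A7%D8%AF">The Matrix has'


def highlight_escaped(text, term, pre="<em>", post="</em>"):
    """Escape the text first, then wrap the first match of term in tags."""
    escaped = html.escape(text)  # turns ", <, >, & into entities
    pos = escaped.lower().find(term.lower())
    if pos == -1:
        return escaped
    end = pos + len(term)
    return escaped[:pos] + pre + escaped[pos:end] + post + escaped[end:]


result = highlight_escaped(fragment, "matrix")
print(result)
# /%D8%A2%D8%B2%D8%A7%D8%AF&quot;&gt;The <em>Matrix</em> has
```

The stray `">` now renders as literal text instead of malformed markup. The fragment still begins in the middle of the URL, so escaping makes the output safe to insert into a page but does not remove the leftover piece of the link.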