Using Elasticsearch's highlighting feature:
"highlight": {
  "fields": {
    "tags": { "number_of_fragments": 0 }
  }
}
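With number_of_fragments: 0, no fragments are produced; instead the whole content of the field is returned. This is handy for short texts, because the document can be displayed as usual and people can easily scan for the highlighted parts.
How do you use this when a document contains an array with multiple values?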
PUT /test/doc/1
{
  "tags": [
    "one hit tag",
    "two foo tag",
    "three hit tag",
    "four foo tag"
  ]
}

GET /test/doc/_search
{
  "query": {
    "match": { "tags": "hit" }
  },
  "highlight": {
    "fields": {
      "tags": { "number_of_fragments": 0 }
    }
  }
}
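Now, what I would like to show the user is:
1 result:
Document 1, tagged with:
"one hit tag", "two foo tag", "three hit tag", "four foo tag"
Unfortunately, this is the result of the query: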
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.10848885,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "1",
        "_score": 0.10848885,
        "_source": {
          "tags": [
            "one hit tag",
            "two foo tag",
            "three hit tag",
            "four foo tag"
          ]
        },
        "highlight": {
          "tags": [
            "one <em>hit</em> tag",
            "three <em>hit</em> tag"
          ]
        }
      }
    ]
  }
}
The highlight section only contains the array values that matched. How can I get this instead:
"tags": [
  "one <em>hit</em> tag",
  "two foo tag",
  "three <em>hit</em> tag",
  "four foo tag"
]
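Answer 0 (score: 0)
One possibility is to strip the <em> HTML tags from the highlighted values and then look the resulting strings up in the original field: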
# Original values, as found in _source
tags = [
  "one hit tag",
  "two foo tag",
  "three hit tag",
  "four foo tag"
]

# Highlighted values, as found in the highlight section of the hit
highlighted = [
  "one <em>hit</em> tag",
  "three <em>hit</em> tag"
]

highlighted.each do |highlighted_tag|
  # Strip the <em> tags to recover the original value, then find its position
  if (index = tags.index(highlighted_tag.gsub(/<\/?em>/, '')))
    tags[index] = highlighted_tag
  end
end

puts tags #=>
# one <em>hit</em> tag
# two foo tag
# three <em>hit</em> tag
# four foo tag
This won't win a prize for the prettiest code, but I think it gets the job done.
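If you need this in more than one place, the same idea could be wrapped in a small helper that works on a parsed search hit. The following is only a sketch: merge_highlights is a hypothetical name, not part of Elasticsearch or any client library, and it assumes the hit has already been parsed into a Ruby Hash with string keys and that the default <em> and </em> highlight tags are in use (pass pre_tag and post_tag otherwise).

```ruby
# Hypothetical helper, not part of Elasticsearch or any client library.
# Returns a copy of source_values in which every value that has a highlighted
# counterpart is replaced by that highlighted version.
def merge_highlights(source_values, highlighted_values, pre_tag: '<em>', post_tag: '</em>')
  # Map each plain (tag-stripped) string to its highlighted form.
  by_plain = highlighted_values.each_with_object({}) do |highlighted, lookup|
    plain = highlighted.gsub(pre_tag, '').gsub(post_tag, '')
    lookup[plain] = highlighted
  end

  # Fall back to the original value when there is no highlight for it.
  source_values.map { |value| by_plain.fetch(value, value) }
end

# A single hit, shaped like the response in the question (assumed already parsed).
hit = {
  '_source'   => { 'tags' => ['one hit tag', 'two foo tag', 'three hit tag', 'four foo tag'] },
  'highlight' => { 'tags' => ['one <em>hit</em> tag', 'three <em>hit</em> tag'] }
}

merged = merge_highlights(
  hit['_source']['tags'],
  hit.fetch('highlight', {}).fetch('tags', [])
)
puts merged
```

Unlike the loop above, this version does not modify the original array, and it replaces every occurrence when the same tag value appears more than once. It still relies on an exact string match, so it will not work if the highlighter encodes or truncates the values.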