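The documentation contains the following ambiguous remark about the pre_tags / post_tags settings, which can hold multiple pre/post tag pairs:
Using the fast vector highlighter there can be more tags, and the "importance" is ordered.
Does anyone know exactly what this statement means?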
Answer 0 (score: 2)
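It took a while, but by trying different queries with ES 1.7 and the _head plugin I was able to work out how multiple pre and post tags affect highlighting.
With the fast vector highlighter you can specify tags in order of "importance", which seems to mean that the order of the tags and the order of the search terms should match. For multiple pre or post tags to have any effect, the query needs multiple field clauses, as in the example below.
Given the index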
{
myindex: {
mappings: {
corpdocument: {
properties: {
createddate: {
type: "date",
format: "dateOptionalTime"
},
docbody: {
type: "string",
analyzer: "text_analyzer",
fields: {
exact: {
type: "string",
analyzer: "text_analyzer_exact"
}
}
},
modifieddate: {
type: "date",
format: "dateOptionalTime"
},
title: {
type: "string"
}
}
}
}
}
}
and the search
POST localhost:9200/myindex/corpdocument/_search
{
"highlight": {
"pre_tags": ["|primary-highlight|",
"|secondary-highlight|"],
"post_tags": ["|/primary-highlight|",
"|/secondary-highlight|"],
"fields": {
"docbody.exact": {
"fragment_size": 150,
"number_of_fragments": 3
}
}
},
"_source": {
"exclude": ["docbody"]
},
"query": {
"bool": {
"should": [{
"match": {
"docbody.exact": {
"query": "foo"
}
}
},
{
"match": {
"docbody.exact": {
"query": "bar"
}
}
}]
}
}
}
you can get a result like this
{
"took": 14,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 97,
"max_score": 0.48895144,
"hits": [{
"_index": "myindex",
"_type": "corpdocument",
"_id": "XFxxZWR0ZXN0ZG9jc1xTYW5kYm94XFNhbmRib3hBbGxcRGV4dGVyX2xpdFw3NS5kb2M=",
"_score": 0.48895144,
"_source": {
"createddate": "2010-11-02T00:00:00-05:00",
"modifieddate": "2007-09-04T00:00:00-05:00",
"_id": "XFxxZWR0ZXN0ZG9jc1xTYW5kYm94XFNhbmRib3hBbGxcRGV4dGVyX2xpdFw3NS5kb2M="
},
"highlight": {
"docbody.exact": ["Lorem ipsum dolor sit amet, consectetur adipiscing elit |primary-highlight|foo|/primary-highlight|Lorem ipsum dolor sit amet, consectetur adipiscing elit",
"Lorem ipsum dolor sit amet, consectetur adipiscing elit |secondary-highlight|bar|/secondary-highlight|TOTHE|primary-highlight|foo|/primary-highlight|Lorem ipsum dolor sit amet, consectetur adipiscing elit",
"Lorem ipsum dolor sit amet, consectetur adipiscing elit |secondary-highlight|bar|/secondary-highlight| Lorem ipsum dolor sit amet, consectetur adipiscing elit |primary-highlight|foo|/primary-highlight| Lorem ipsum dolor sit amet, consectetur adipiscing elit"]
}
},
...
]
}
}
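Which tag wraps which match depends on the order of the tags and the order of the search terms. Swapping "foo" and "bar" while leaving everything else unchanged causes bar to be wrapped in the primary tag and foo in the secondary tag.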
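To make the ordering rule concrete, here is a minimal Python sketch that imitates the behaviour I observed. It is not Elasticsearch code: the function name `wrap_terms` and the sample text are made up for illustration, and it assumes there are at least as many tag pairs as search terms.

```python
import re

PRE_TAGS = ["|primary-highlight|", "|secondary-highlight|"]
POST_TAGS = ["|/primary-highlight|", "|/secondary-highlight|"]


def wrap_terms(text, terms, pre_tags=PRE_TAGS, post_tags=POST_TAGS):
    """Wrap each term in the tag pair that sits at the term's own position.

    Hypothetical model of the observed behaviour, not the real highlighter.
    Assumes len(pre_tags) >= len(terms) and len(post_tags) >= len(terms).
    """
    # Term i in the query gets tag pair i.
    tag_for = {term: (pre_tags[i], post_tags[i]) for i, term in enumerate(terms)}
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")

    def replace(match):
        pre, post = tag_for[match.group(1)]
        return pre + match.group(1) + post

    return pattern.sub(replace, text)


# Same text, same tags; only the order of the search terms changes.
print(wrap_terms("lorem foo ipsum bar", ["foo", "bar"]))
print(wrap_terms("lorem foo ipsum bar", ["bar", "foo"]))
```

In the first call foo gets the primary tag; in the second call bar does, which mirrors what happens when the two match clauses are swapped in the query.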
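From preliminary experiments with 3 search terms and 2 tags, the third term appears to be wrapped in the first tag rather than the second. Adding a third tag fixes this, but it means repeating the secondary tag n times to cover all search terms.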
"highlight": {
"pre_tags": ["|primary-highlight|",
"|secondary-highlight|",
"|secondary-highlight|"],
"post_tags": ["|/primary-highlight|",
"|/secondary-highlight|",
"|/secondary-highlight|"],
"fields": {
"docbody.exact": {
"fragment_size": 150,
"number_of_fragments": 3
}
}
},
...
"query": {
"bool": {
"should": [{
"match": {
"docbody.exact": {
"query": "foo"
}
}
},
{
"match": {
"docbody.exact": {
"query": "bar"
}
}
},
{
"match": {
"docbody.exact": {
"query": "baz"
}
}
}]
}
}
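The wrap-around with 3 terms and 2 tags, and the padding workaround, can be sketched in Python as follows. This is only a model that fits my experiments: the modulo rule in `tag_index` is an assumption, and `tag_index`, `padded_tags` and `highlight_section` are hypothetical helper names, not part of any Elasticsearch client.

```python
import json


def tag_index(term_position, tag_count):
    """Assumed rule: tags are reused in a cycle once they run out."""
    return term_position % tag_count


def padded_tags(primary, secondary, term_count):
    """Return the primary tag once, then the secondary tag for every other term."""
    if term_count < 1:
        raise ValueError("term_count must be at least 1")
    return [primary] + [secondary] * (term_count - 1)


def highlight_section(field, term_count):
    """Build the "highlight" part of the request body for term_count terms."""
    return {
        "pre_tags": padded_tags("|primary-highlight|", "|secondary-highlight|", term_count),
        "post_tags": padded_tags("|/primary-highlight|", "|/secondary-highlight|", term_count),
        "fields": {field: {"fragment_size": 150, "number_of_fragments": 3}},
    }


# With 2 tags the third term (position 2) falls back to the first tag.
print(tag_index(2, 2))
# With 3 tags the third term gets its own slot, here a repeated secondary tag.
print(tag_index(2, 3))
print(json.dumps(highlight_section("docbody.exact", 3), indent=2))
```

Generating the tag lists from the number of search terms keeps the pre and post lists the same length, so every term after the first ends up in the secondary tag.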