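Hi, I'm running into some highlighting problems in Elasticsearch v2.3 and I can't work out any logic that would cause them. Here are two examples.
This is my query: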
GET reports_all/all/_search
{
"query": {
"query_string": {
"fields": [
"text"
],
"query": "(\"base of the pyramid impact assessment\" OR \"corporate human rights benchmark\")"
// "query": "(\"corporate human rights benchmark\" OR \"base of the pyramid impact assessment\")"
}
},
"highlight": {
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields": {
"text": {
"number_of_fragments": 10
}
}
},
"size": 10,
"from": 0
}
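Look at the second part of the query: two exact-match phrases separated by OR. All I do is swap the first and second phrase. This is the first result, which highlights the wrong text: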
"highlight": {
"text": [
" organisations to launch <mark>the</mark> \n<mark>Corporate</mark> <mark>Human</mark> <mark>Rights</mark> <mark>Benchmark</mark> (CHRB), <mark>the</mark> \nworld’s first wide-scale project to",
" taking \naction to reduce <mark>the</mark> environmental \n<mark>impact</mark> <mark>of</mark> our business and finding \nnew ways to help",
" focuses <mark>of</mark> this is reducing <mark>the</mark> <mark>impact</mark> <mark>of</mark> \nclimate change. Aviva Investors signed <mark>the</mark> Montreal Carbon",
" \nprogrammes in 2015\n</p>\n<p>Our 2015 reporting\nThis is <mark>the</mark> summary <mark>of</mark> our sustainable\nbusiness and corporate",
" aim to uphold <mark>the</mark> highest ethical \nstandards in <mark>the</mark> way that we do business. \nIn 2015, 98% <mark>of</mark> Aviva",
" costs to \nour customers\n</p>\n<p> Reducing our\nenvironmental <mark>impact</mark>\nIn 2015 Aviva became <mark>the</mark>",
" first insurer \nto achieve <mark>the</mark> Carbon Trust Supply Chain \nStandard, in recognition <mark>of</mark> work to measure",
" Stonewall’s \nTop 100 Employers list\n</p>\n<p>A principal partner \n<mark>of</mark> <mark>the</mark> Living Wage \nFoundation",
" take control <mark>of</mark> their finances, as\nwell as benefiting society and <mark>the</mark> environment\n</p>\n<p>• <mark>The</mark> way",
" we help our local communities, giving\nthousands <mark>of</mark> organisations <mark>the</mark> support they need\nto make a"
]
}
},
But the second result is fine:
"highlight": {
"text": [
" organisations to launch the \n<mark>Corporate</mark> <mark>Human</mark> <mark>Rights</mark> <mark>Benchmark</mark> (CHRB), the \nworld’s first wide-scale project to"
]
}
Any idea what might be going wrong?
Answer 0 (score: 0)
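I'm not entirely sure what is happening, but it looks like your query is being broken up into individual words by the analyzer, and ES is adding an implicit AND to the query.
That would explain why each word gets its own <mark> highlight.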
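To make the symptom concrete, here is a toy sketch in Python of what per-term highlighting looks like. It is not how Elasticsearch's highlighter is implemented; `highlight_terms` is a hypothetical helper written only for this illustration, and the fragment is shortened from the output in the question.

```python
import re

def highlight_terms(text, phrase, pre="<mark>", post="</mark>"):
    """Toy per-term highlighter (illustration only, not Elasticsearch code).

    Wraps every word in `text` that appears anywhere in `phrase`,
    ignoring word order and adjacency.
    """
    terms = set(phrase.lower().split())

    def wrap(match):
        word = match.group(0)
        return pre + word + post if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)

fragment = "action to reduce the environmental impact of our business"
phrase = "base of the pyramid impact assessment"
print(highlight_terms(fragment, phrase))
# action to reduce <mark>the</mark> environmental <mark>impact</mark> <mark>of</mark> our business
```

The words "the", "impact" and "of" are marked even though the phrase as a whole never occurs, which matches the pattern in the first result above.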
If you want ES to treat "base of the pyramid impact assessment" as a single entity, you can use a match_phrase query.
Your query would look something like this:
"query": {
"bool": {
"should": [
{
"match_phrase": {
"text": "base of the pyramid impact assessment"
}},
{
"match_phrase": {
"text": "corporate human rights benchmark"
}
}
],
"minimum_should_match": 1
}
}
I'm not sure whether this will work. Let me know.
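For completeness, a minimal sketch in Python that assembles the full request body, combining the bool/match_phrase query with the highlight settings from the question. `build_phrase_search` is a hypothetical helper name, and the body uses `minimum_should_match`, which as far as I know is the documented spelling of this bool parameter. It only builds and prints the JSON; I have not verified that it changes the highlighting.

```python
import json

def build_phrase_search(field, phrases, size=10, offset=0):
    """Build a search body with one match_phrase clause per phrase.

    Hypothetical helper for illustration; the highlight settings are
    copied from the question.
    """
    should = [{"match_phrase": {field: phrase}} for phrase in phrases]
    return {
        "query": {
            "bool": {
                "should": should,
                # At least one of the phrases has to match.
                "minimum_should_match": 1,
            }
        },
        "highlight": {
            "pre_tags": ["<mark>"],
            "post_tags": ["</mark>"],
            "fields": {field: {"number_of_fragments": 10}},
        },
        "size": size,
        "from": offset,
    }

body = build_phrase_search(
    "text",
    [
        "base of the pyramid impact assessment",
        "corporate human rights benchmark",
    ],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET reports_all/all/_search.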