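Double wildcard in a query causes strange highlighting with the plain/fast vector Elasticsearch highlighters

Time: 2016-03-29 14:48:25

Tags: elasticsearch highlighting

I am working with Elasticsearch 1.5.2.

After creating the following mapping: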

PUT http://localhost:9200/index/_mapping/sometype
{
    "properties" : {
        "sometext" : {
            "type" : "string",
            "term_vector" : "with_positions_offsets"
        }
    }
}

and indexing this document:

POST http://localhost:9200/index/sometype
{
    "sometext" : "A supervisor is responsible for the productivity and actions of a small group of employees. The supervisor has several manager-like roles, responsibilities, and powers. Two of the key differences between a supervisor and a manager are (1) the supervisor does not typically have hire and fire authority, and (2) the supervisor does not have budget authority."
}

a user tries to find all documents but types a double wildcard:

POST http://localhost:9200/index/sometype/_search
{
    "query" : {
        "query_string" : {
            "query" : "**",
            "fields" : ["sometext"]
        }
    },
    "highlight" : {
        "pre_tags" : ["<em>"],
        "post_tags" : ["</em>"],
        "order" : "score",
        "require_field_match" : true,
        "fields" : {
            "sometext" : {
                "fragment_size" : 150,
                "number_of_fragments" : 1
            }
        }
    }
}

and gets the following highlights:

"highlight" : {
    "sometext" : ["responsibilities, <em>and</em> <em>powers</em>. <em>Two</em> <em>of</em> <em>the</em> <em>key</em> <em>differences</em> <em>between</em> <em>a</em> <em>supervisor</em> <em>and</em> <em>a</em> <em>manager</em> <em>are</em> (<em>1</em>) <em>the</em> <em>supervisor</em> <em>does</em> <em>not</em> <em>typically</em> <em>have</em> <em>hire</em> <em>and</em> <em>fire</em> <em>authority</em>, and"]
}

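The query *? produces the same highlighting result. But when the query contains only a single asterisk, the highlighter returns nothing.

With the plain highlighter (I only added "type" : "plain" to the highlight settings), the result looks slightly different, but it is still strange: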

"highlight" : {
    "sometext" : [", <em>responsibilities</em>, <em>and</em> <em>powers</em>. <em>Two</em> <em>of</em> <em>the</em> <em>key</em> <em>differences</em> <em>between</em> <em>a</em> <em>supervisor</em> <em>and</em> <em>a</em> <em>manager</em> <em>are</em> (<em>1</em>) <em>the</em> <em>supervisor</em> <em>does</em> <em>not</em> <em>typically</em> <em>have</em> <em>hire</em> <em>and</em> <em>fire</em> <em>authority</em>, <em>and</em> (<em>2</em>) <em>the</em> <em>supervisor</em> <em>does</em> <em>not</em> <em>have</em> <em>budget</em> <em>authority</em>."]
}
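
For reference, the plain-highlighter variant of the request described above might look roughly like the body below, sent to the same _search endpoint. This is only a sketch: the post does not show where "type" : "plain" was added, so placing it at the field level is an assumption. The field name is also quoted here so that the body is strict JSON.

```json
{
    "query" : {
        "query_string" : {
            "query" : "**",
            "fields" : ["sometext"]
        }
    },
    "highlight" : {
        "pre_tags" : ["<em>"],
        "post_tags" : ["</em>"],
        "order" : "score",
        "require_field_match" : true,
        "fields" : {
            "sometext" : {
                "type" : "plain",
                "fragment_size" : 150,
                "number_of_fragments" : 1
            }
        }
    }
}
```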

Does anyone know the reason for this behavior? Perhaps queries like ** or *? have some special meaning? Thanks a lot.

2 answers:

Answer 0: (score: 0)

Answer 1: (score: 0)

POST /index/sometype/_search
{
    "query" : {
        "query_string" : {
            "query" : "**",
            "fields" : ["sometext"]
        }
    },
    "highlight" : {
        "pre_tags" : ["<em>"],
        "post_tags" : ["</em>"],
        "order" : "score",
        "require_field_match" : true,
        "fields" : {
            "sometext" : {
                "fragment_size" : 180,
                "number_of_fragments" : 1
            }
        }
    }
}

We can use this query.
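
If the goal is simply to return every document, a match_all query may be a simpler alternative to a wildcard-only query_string. The body below is a minimal sketch that is not taken from the answer above. It assumes the application can detect wildcard-only input (such as ** or *?) and send this body to the same /index/sometype/_search endpoint instead.

```json
{
    "query" : {
        "match_all" : {}
    }
}
```

Since match_all does not match on any particular term, there is nothing specific to highlight, so the highlight section is left out of this sketch.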