Elasticsearch highlighting: highlight only the matches in the results

Time: 2019-01-03 18:07:49

Tags: php elasticsearch

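I am working with PHP and Elasticsearch 6.5.2, and I use Postman for testing. When I apply "highlight" with two or more fragments, only trimmed content around the sentences that match the keyword is returned. I don't want the content to be trimmed at the Elasticsearch level. To avoid that, I changed the number of fragments to zero and got the expected output from Elasticsearch: it returns the whole content. But when I check the output in the PHP application, the entire content turns bold wherever the matching keyword is present in the url field.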

Index:

PUT test/_doc/1
{
  "title":"Apply For the admissions graduate and undergraduate",
  "url":"https://someurl.com/admissions",
  "content": "Engineers play an important role in almost every aspect of modern life. As an engineer in the 21st century, you’ll work in teams to develop ingenious ways to transform the world in which we live. Industrial engineers are in high demand in nearly every industry. Astounding innovations in semiconductor microelectronic engineering will continue to drive productivity and the economy by playing a key role in a wide range of technologies – information, communication, nanotechnology, defense, medicine, and energy.Admission into the microelectronic engineering program is competitive, but our admission process is a personal one. Each application is reviewed holistically for strength of academic preparation, performance on standardized tests, counselor recommendations, and your personal career interests. We seek applicants from a variety of geographical, social, cultural, economic, and ethnic backgrounds."
}

Query:

{
   "query":{
      "query_string":{
         "fields":[
            "content"
         ],
         "query":"admissions"
      }
   },
   "highlight":{
      "fields":{
         "title":{
            "pre_tags":[
               "<strong>"
            ],
            "post_tags":[
               "</strong>"
            ],
            "number_of_fragments":3            //changed to 0 earlier
         },
         "content":{
            "pre_tags":[
               "<strong>"
            ],
            "post_tags":[
               "</strong>"
            ],
            "fragment_size":150,
            "number_of_fragments":3           //changed to 0 earlier
         }
      }
   }
}

Result:

"highlight": {
          "content": [
            "information, communication, nanotechnology, defense, medicine, and energy.Admission into the microelectronic engineering program is competitive, but our <strong>admission</strong>"
          ]
        }

1 Answer:

Answer 0: (score: 0)

Looking at the code you shared in the comments, the problem is not in the Elasticsearch response.

The following line is causing the problem:

<?php echo substr($r['highlight']['content'][0],0,300); ?>

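Because you only take the first 300 characters, in one of the cases the cut falls inside a highlighted term and the closing </b> tag is missing from the output.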

If you look closely, one of the highlighted pieces in the HTML is as follows:

Tourism Management Curriculum Capstone/Exam/Thesis Options <b>Admissi</div>

You can see that the closing </b> tag for <b>Admissi is missing, so everything after it is rendered bold.

The solution may be to not use substr.
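
If a length limit is still wanted in the PHP view, one option is to truncate by visible characters and close any highlight tag that the cut leaves open. The following is only a minimal sketch under assumptions: the helper name truncateHighlight is hypothetical (it is not part of PHP or of any Elasticsearch client), the only markup in the string is assumed to be the pre/post tag pair configured in the query, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper (not from the original post): truncate highlighted text
// to $limit visible characters without cutting through a highlight tag or
// leaving one unclosed. Assumes the only markup is the $open/$close pair.
function truncateHighlight(
    string $html,
    int $limit,
    string $open = '<strong>',
    string $close = '</strong>'
): string {
    // Split into tag parts and text parts, keeping the tags in the result.
    $pattern = '/(' . preg_quote($open, '/') . '|' . preg_quote($close, '/') . ')/';
    $parts = preg_split($pattern, $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $out = '';
    $remaining = $limit;   // visible characters still allowed
    $isOpen = false;       // true while inside a highlight tag

    foreach ($parts as $part) {
        if ($remaining <= 0) {
            break;
        }
        if ($part === $open) {
            $out .= $part;
            $isOpen = true;
        } elseif ($part === $close) {
            $out .= $part;
            $isOpen = false;
        } else {
            // Text part: only these characters count toward the limit.
            $len = mb_strlen($part, 'UTF-8');
            if ($len > $remaining) {
                $out .= mb_substr($part, 0, $remaining, 'UTF-8');
                $remaining = 0;
            } else {
                $out .= $part;
                $remaining -= $len;
            }
        }
    }

    // Close the tag if the cut happened inside a highlighted term.
    if ($isOpen) {
        $out .= $close;
    }

    return $out;
}

// Usage with a made-up sample string:
$sample = 'Thesis Options <strong>Admissions</strong> and more';
echo substr($sample, 0, 30), "\n";            // naive cut leaves <strong> unclosed
echo truncateHighlight($sample, 22), "\n";    // tag is closed after the cut
```

In the view, the call would replace the substr call, for example truncateHighlight($r['highlight']['content'][0], 300, '<b>', '</b>') if the tags in use are <b> and </b>. Note that the limit here counts text characters only, so the returned string can be longer than the limit once the tags are included.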