Finding how many times a word occurs consecutively in Elasticsearch

Time: 2016-04-27 12:13:47

Tags: elasticsearch

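I am working with Elasticsearch and have run into a problem: when searching records, I need to know how many times a word occurs consecutively.

For example, I have the following rows: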

[
  {"user": "Aniket", "postDate": "2016-04-26", "body": "Search as we discuss yesterday one time word", "title": "One time word"},
  {"user": "aniket", "postDate": "2016-04-26", "body": "Distribution is hard. Distribution should be easy.word word word word", "title": "Four times word"},
  {"user": "aniket", "postDate": "2016-04-26", "body": "Distribution is hard. Distribution should be easy.word word word", "title": "Three times word"},
  {"user": "aniket", "postDate": "2016-04-26", "body": "Distribution is hard. Distribution should be easy.word word", "title": "Two times word"}
]

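I have four rows for the user aniket. Each contains "word", but it occurs once, twice, three times, or four times. When I search for "word", I need the row where it occurs four times to come first, like this:

1. word word word word
2. word word word
3. word word
4. word

I also tried using the score, but the score does not give me anything useful for this.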

1 Answer:

Answer 0: (score: 1)

You need script-based sorting. Something like this:

  "sort": {
    "_script": {
      "type": "number",
      "script": "termInfo=_index['body'][term].tf();return termInfo;",
      "params": {
        "term": "word"
      },
      "lang": "groovy",
      "order": "desc"
    }
  }
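To make the effect of that sort concrete, here is a minimal Python sketch of the same idea run outside Elasticsearch. It is only an illustration: the helper names `term_frequency` and `sort_by_term_frequency` are hypothetical, and the tokenization (lowercase, split on non-word characters) is an assumption standing in for whatever analyzer the `body` field really uses. Note that a term frequency counts every occurrence in the field, not only consecutive ones.

```python
import re

# Sample documents copied from the question (only the fields used here).
docs = [
    {"title": "One time word",
     "body": "Search as we discuss yesterday one time word"},
    {"title": "Four times word",
     "body": "Distribution is hard. Distribution should be easy.word word word word"},
    {"title": "Three times word",
     "body": "Distribution is hard. Distribution should be easy.word word word"},
    {"title": "Two times word",
     "body": "Distribution is hard. Distribution should be easy.word word"},
]


def term_frequency(text, term):
    """Count how often `term` occurs after lowercasing and splitting
    on runs of non-word characters (an assumed, simplified analyzer)."""
    tokens = [t for t in re.split(r"\W+", text.lower()) if t]
    return tokens.count(term)


def sort_by_term_frequency(documents, term):
    """Return the documents ordered by descending term frequency in `body`."""
    return sorted(documents,
                  key=lambda d: term_frequency(d["body"], term),
                  reverse=True)


ranked = sort_by_term_frequency(docs, "word")
for doc in ranked:
    print(term_frequency(doc["body"], "word"), doc["title"])
```

With this simplified tokenization the four sample rows come out in the order the question asks for: four, three, two, then one occurrence.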

Enable Groovy scripting in the elasticsearch.yml file:

script.groovy.sandbox.enabled: true

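You also need to use a suitable analyzer. In your case, for example, the standard analyzer (the default) will not split easy.word into two tokens. For the sorting to work, you need an analyzer that also splits on ".", for example.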
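The analyzer point can be illustrated with a small Python sketch. Both functions are assumptions made for illustration: `standard_like_tokens` is only a rough regex approximation of how the standard tokenizer keeps letters joined by an inner "." together, not Lucene's actual implementation, and `split_on_non_word_tokens` mimics splitting on `\W+`, which is the default pattern of Elasticsearch's built-in `pattern` analyzer.

```python
import re

# One of the sample bodies from the question.
body = "Distribution is hard. Distribution should be easy.word word"


def standard_like_tokens(text):
    """Rough approximation (an assumption, not the real tokenizer):
    letters or digits joined by an inner '.' or "'" stay in one token."""
    return re.findall(r"[a-z0-9]+(?:[.'][a-z0-9]+)*", text.lower())


def split_on_non_word_tokens(text):
    """Lowercase and split on runs of non-word characters, so '.' always separates."""
    return [t for t in re.split(r"\W+", text.lower()) if t]


standard_tokens = standard_like_tokens(body)
split_tokens = split_on_non_word_tokens(body)

print(standard_tokens)  # contains the single token "easy.word"
print(split_tokens)     # contains "easy" and "word" as separate tokens
print(standard_tokens.count("word"), split_tokens.count("word"))
```

For this body the first tokenization finds "word" once and the second finds it twice, so the term frequency used by the sort script is off by one whenever easy.word is kept as a single token.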