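I am working with Elasticsearch and have run into a problem: when searching records, I want to rank them by how many times a word occurs in each one. For example, I have the following documents: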
{ "user": "Aniket", "postDate": "2016-04-26", "body": "Search as we discuss yesterday one time word", "title": "One time word" },
{ "user": "aniket", "postDate": "2016-04-26", "body": "Distribution is hard. Distribution should be easy.word word word word", "title": "Four times word" },
{ "user": "aniket", "postDate": "2016-04-26", "body": "Distribution is hard. Distribution should be easy.word word word", "title": "Three times word" },
{ "user": "aniket", "postDate": "2016-04-26", "body": "Distribution is hard. Distribution should be easy.word word", "title": "Two times word" }
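I have four documents under the user aniket, and each of them contains "word", but it occurs once, twice, three times or four times depending on the document. When I search for "word", I need the document where it occurs four times to come out on top, so the results are ordered like this: 1. word word word word, 2. word word word, 3. word word, 4. word. I also tried relying on the score, but the score does not give me any information about the number of occurrences.
I am trying the following query: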
curl -XGET 'localhost:9200/blog/post/_search?pretty=1' -d '{
  "query": {
    "match": {
      "body": "word"
    }
  },
  "sort": {
    "_script": {
      "type": "number",
      "script": "termInfo=_index['body'][term].tf();return termInfo;",
      "params": {
        "term": "word"
      },
      "lang": "groovy",
      "order": "desc"
    }
  }
}'
and I get this error:
{
"index" : "blog",
"shard" : 4,
"status" : 500,
"reason" : "QueryPhaseExecutionException[[blog][4]: query[filtered(body:word)->cache(type:post)],from[0],size[10],sort[script\": org.elasticsearch.index.fielddata.fieldcomparator.DoubleScriptDataComparator$InnerSource@51c07776>!]: Query Failed [Failed to execute main query]]; nested: GroovyScriptExecutionException[MissingPropertyException[No such property: body for class: Script5]]; "
} ]
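If I remove the sort part of the query, it returns results. A simple sort, for example on body in asc order, also works fine, but that does not cover our word-count case. Is there a solution, and what am I missing?
Answer 0 (score: 0)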
As I explained in "sort result by term frequency count", check the contents of "_explanation":
"_explanation": {
  "value": 0.16608895,
  "description": "weight(_all:godfather in 0) [PerFieldSimilarity], result of:",
  "details": [
    {
      "value": 0.16608895,
      "description": "fieldWeight in 0, product of:",
      "details": [
        {
          "value": 1.7320508,
          "description": "tf(freq=3.0), with freq of:",
          "details": [
            {
              "value": 3,
              "description": "termFreq=3.0",
              "details": []
            }
          ]
        },
        {
          "value": 0.30685282,
          "description": "idf(docFreq=1, maxDocs=1)",
          "details": []
        },
        {
          "value": 0.3125,
          "description": "fieldNorm(doc=0)",
          "details": []
        }
      ]
    }
  ]
}
It contains the occurrence count for the term:
"value": 3,
"description": "termFreq=3.0",
"details": []
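One way to use this on the client side is to walk each hit's "_explanation" tree, pull out the termFreq value, and sort the hits by it. The sketch below is only an illustration under a few assumptions: the search was sent with "explain": true in the request body so that every hit carries an "_explanation", the node description has the form "termFreq=..." as in the output above (the wording can differ between Elasticsearch versions), and the query has a single term (with several terms, only the first termFreq node found is used). The names find_term_freq and sort_hits_by_term_freq are hypothetical helpers, not part of any Elasticsearch client.

```python
from typing import Optional

# Trimmed copy of the "_explanation" shown above, used as sample input.
sample_explanation = {
    "value": 0.16608895,
    "description": "weight(_all:godfather in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.16608895,
        "description": "fieldWeight in 0, product of:",
        "details": [{
            "value": 1.7320508,
            "description": "tf(freq=3.0), with freq of:",
            "details": [
                {"value": 3, "description": "termFreq=3.0", "details": []}
            ],
        }],
    }],
}


def find_term_freq(explanation: dict) -> Optional[float]:
    """Depth-first search for the first 'termFreq=...' node; return its value."""
    if explanation.get("description", "").startswith("termFreq="):
        return float(explanation["value"])
    for child in explanation.get("details", []):
        found = find_term_freq(child)
        if found is not None:
            return found
    return None  # no termFreq node in this subtree


def sort_hits_by_term_freq(hits: list) -> list:
    """Return hits ordered by term frequency, highest first.

    Hits without an '_explanation' (or without a termFreq node) count as 0.
    """
    return sorted(
        hits,
        key=lambda hit: find_term_freq(hit.get("_explanation", {})) or 0.0,
        reverse=True,
    )


if __name__ == "__main__":
    print(find_term_freq(sample_explanation))
```

Here hits would be the list found under hits.hits in the search response. Sorting on the client only reorders the page that was actually returned, so this fits small result sets rather than deep pagination.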