Here is my ES query:
===Create the index===
PUT /sample
===Insert the data===
PUT /sample/docs/1
{"data": "And the world said, 'Disarm, disclose, or face serious consequences'—and therefore, we worked with the world, we worked to make sure that Saddam Hussein heard the message of the world."}
PUT /sample/docs/2
{"data": "Never give in — never, never, never, never, in nothing great or small, large or petty, never give in except to convictions of honour and good sense. Never yield to force; never yield to the apparently overwhelming might of the enemy"}
===Query to get results===
POST sample/docs/_search
{
"query": {
"match": {
"data": "never"
}
},
"highlight": {
"fields": {
"data":{}
}
}
}
===Search results===
...
"highlight": {
"data": [
"<em>Never</em> give in — <em>never</em>, <em>never</em>, <em>never</em>, <em>never</em>, in nothing great or small, large or petty, <em>never</em> give",
" in except to convictions of honour and good sense. <em>Never</em> yield to force; <em>never</em> yield to the apparently overwhelming might of the enemy"
]
}
===Desired result===
I need the frequency of the searched term in each matching document, as in the example below:
Doc Id: 2
Term Frequency :{
"never": 8
}
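I have tried Bucket Aggregation, Terms Aggregation, and other aggregations, but none of them gave me this result.
Thanks in advance for your help!
Answer 0 (score: 0)
You should use term vectors, which return the frequency of each term within a document and let you filter terms by that frequency.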
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html
In this case, your query would be:
# Term vectors of document 2, field "data"; keep only terms that occur at least 8 times
GET /sample/docs/2/_termvectors
{
  "fields": ["data"],
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "min_term_freq": 8
  }
}
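
To get from a term vectors response to the "Doc Id / Term Frequency" shape asked for in the question, the response can be post-processed on the client. The following is a minimal Python sketch, not a definitive implementation: the helper name `term_frequencies` is hypothetical, and `sample_response` is a hand-written dictionary that only mimics the documented layout of a term vectors response (`term_vectors` → field → `terms` → `term_freq`); it is not real output from the index above.

```python
from typing import Any, Dict


def term_frequencies(response: Dict[str, Any], field: str) -> Dict[str, int]:
    """Map each term of `field` to its term_freq in a term vectors response.

    Returns an empty dict when the field or its terms are missing.
    """
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    return {term: info["term_freq"] for term, info in terms.items()}


# Hand-written sample shaped like a term vectors response (an assumption,
# not output captured from a real cluster).
sample_response = {
    "_index": "sample",
    "_id": "2",
    "found": True,
    "term_vectors": {
        "data": {
            "terms": {
                "never": {"term_freq": 8},
            }
        }
    },
}

freqs = term_frequencies(sample_response, "data")
print({"Doc Id": sample_response["_id"], "Term Frequency": freqs})
```

In practice the dictionary would come from whatever HTTP or Elasticsearch client you already use; the parsing step stays the same.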