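How to highlight the frequency of keywords in Elasticsearch?

Date: 2016-02-26 19:29:38

Tags: elasticsearch

Let's say I am searching for three terms: "Microsoft", "Facebook", and "Google".

How can I get ES to return the frequency of each term in the results?

Thanks!

2 answers: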

Answer 0 (score: 0)

You are probably looking for the terms aggregation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html

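Basically, it groups the matching documents into buckets by the terms found and returns a document count for each bucket.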
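As a rough sketch of what such a request could look like for the three terms in the question, the Python snippet below only builds the request body and prints it as JSON. It assumes the company names live in a non-analyzed field called `company` (borrowed from the other answer), and the helper name `term_counts_body` and the aggregation name `term_counts` are made up for illustration. Note that the buckets count documents per term, not occurrences inside a document.

```python
import json


def term_counts_body(field, terms):
    """Build a search body that counts matching documents per term.

    Assumes `field` holds exact, non-analyzed values.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"terms": {field: terms}},  # match any of the terms
        "aggs": {
            "term_counts": {  # hypothetical aggregation name
                # `include` limits the buckets to the terms searched for
                "terms": {"field": field, "include": terms}
            }
        },
    }


body = term_counts_body("company", ["Microsoft", "Facebook", "Google"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request; each bucket in the response carries a `key` and a `doc_count`.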

Answer 1 (score: 0)

I think the explain API might help you. If you run the following query:

GET /your_index/your_type/_search
{   
    "explain": true, 
    "query" : {
        "match": {
           "company" : "Google"
        }
    }
}

The result might look like this:

{
   "took": 9,
   "timed_out": false,
   "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 11.7377,
      "hits": [
         {
            "_shard": 1,
            "_node": "n0eQxWrIIPYPlmcXA",
            "_index": "your_index",
            "_type": "your_type",
            "_id": "76991",
            "_score": 11.7377,
            "_source": {
               "company": "Google",
               "price": "2008"
            },
            "_explanation": {
               "value": 11.7377,
               "description": "weight(id:76991 in 6552) [PerFieldSimilarity], result of:",
               "details": [
                  {
                     "value": 11.7377,
                     "description": "fieldWeight in 6552, product of:",
                     "details": [
                        {
                           "value": 1,
                           "description": "tf(freq=1.0), with freq of:",
                           "details": [
                              {
                                 "value": 1,
                                 "description": "termFreq=1.0"
                              }
                           ]
                        },
                        {
                           "value": 11.7377,
                           "description": "idf(docFreq=2, maxDocs=138180)"
                        },
                        {
                           "value": 1,
                           "description": "fieldNorm(doc=6552)"
                        }
                     ]
                  }
               ]
            }
         }
      ]
   }
}

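Under the _explanation section you can see the term and document frequencies:

tf - the term frequency: how many times the term occurs in the specific document being scored (termFreq)

idf - the inverse document frequency, computed from docFreq, the number of documents that contain the term (I guess docFreq is what you want)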
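To get at those numbers programmatically, one option is to walk the `_explanation` tree and parse the description strings. The sketch below is only an illustration: the helper names `collect_frequencies` and `classic_idf` are made up, the regular expression relies on the description format shown in the response above (which can differ between versions and similarity settings), and the idf formula is the one from Lucene's classic TF-IDF similarity, `1 + ln(maxDocs / (docFreq + 1))`, which reproduces the 11.7377 in the example.

```python
import math
import re


def collect_frequencies(explanation):
    """Walk an _explanation tree and collect termFreq, docFreq and maxDocs."""
    found = {}
    stack = [explanation]
    while stack:
        node = stack.pop()
        description = node.get("description", "")
        # Descriptions look like "termFreq=1.0" or "idf(docFreq=2, maxDocs=138180)"
        for key, value in re.findall(r"(termFreq|docFreq|maxDocs)=([\d.]+)", description):
            found[key] = float(value)
        stack.extend(node.get("details", []))
    return found


def classic_idf(doc_freq, max_docs):
    """idf as defined by Lucene's classic TF-IDF similarity (an assumption here)."""
    return 1 + math.log(max_docs / (doc_freq + 1))


# Trimmed copy of the fieldWeight node from the response above
sample = {
    "value": 11.7377,
    "description": "fieldWeight in 6552, product of:",
    "details": [
        {
            "value": 1,
            "description": "tf(freq=1.0), with freq of:",
            "details": [{"value": 1, "description": "termFreq=1.0"}],
        },
        {"value": 11.7377, "description": "idf(docFreq=2, maxDocs=138180)"},
        {"value": 1, "description": "fieldNorm(doc=6552)"},
    ],
}

freqs = collect_frequencies(sample)
print(freqs)
print(classic_idf(freqs["docFreq"], freqs["maxDocs"]))
```

Keep in mind that explain output is per hit, so this gives the frequency of the term in each returned document rather than one total for the whole result set.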

Hope it helps!