Elasticsearch: top k results for each keyword

Time: 2017-12-12 14:05:05

Tags: elasticsearch elasticsearch-5 elasticsearch-dsl

We have the following documents in Elasticsearch.

from elasticsearch_dsl import DocType, Keyword, Text

class Query(DocType):
    text = Text(analyzer='snowball', fields={'raw': Keyword()})
    src = Keyword()

Now we want the top k results for each src. How can we do this?

Example: suppose we index the following:

# src: place_order
Query(text="I want to order food", src="place_order")
Query(text="Take my order", src="place_order")
...

# src: payment
Query(text="How to pay ?", src="payment")
Query(text="Do you accept credit card ?", src="payment")
...

Now, if a user submits the query "take my order please along with the credit card details" with k = 1, we should return the following two results:

[{"text": "Take my order", "src": "place_order"},
 {"text": "Do you accept credit card ?", "src": "payment"}
]

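Here, because k = 1, we return only one result per src.

1 Answer:

Answer 0: (score: 2)

You can try the top hits aggregation, which returns the top N matching documents for each bucket of an aggregation.

For the example in the post, the query could look like this: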

POST queries/query/_search
{
  "query": {
    "match": {
      "text": "take my order please along with the credit card details"
    }
  },
  "aggs": {
    "src types": {
      "terms": {
        "field": "src"
      },
      "aggs": {
        "best hit": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

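The match query on text restricts the set of documents that get aggregated. The "src types" aggregation groups the matching documents by every src value found among them, and "best hit" selects the single most relevant document for each bucket (the size parameter can be changed to suit your needs).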
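If you need to vary k, the same request body can be built in Python. The following is only a minimal sketch: the helper name build_top_k_body and its max_groups parameter are hypothetical, not part of any library. Two things differ from the query above and are assumptions on my part: the terms aggregation returns only 10 buckets by default, so its size is set explicitly, and the top-level "size": 0 suppresses the regular hits list, which you may or may not want.

```python
def build_top_k_body(text, k, max_groups=10):
    """Build a search body that asks for the k best hits per src value.

    build_top_k_body and max_groups are illustrative names, not a library API.
    """
    return {
        # Only documents matching the text are aggregated.
        "query": {"match": {"text": text}},
        # Assumption: the top-level hits are not needed, only the buckets.
        "size": 0,
        "aggs": {
            "src types": {
                # One bucket per src value; at most max_groups buckets.
                "terms": {"field": "src", "size": max_groups},
                "aggs": {
                    # Keep the k highest-scoring documents in each bucket.
                    "best hit": {"top_hits": {"size": k}}
                },
            }
        },
    }


body = build_top_k_body(
    "take my order please along with the credit card details", k=1
)
```

The resulting dict can be sent as the body of the same _search request shown above, for example through the search method of the official Python client.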

The query result looks like this:

{
  "hits": {
    "total": 3,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "queries",
        "_type": "query",
        "_id": "VD7QVmABl04oXt2HGbGB",
        "_score": 1.3862944,
        "_source": {
          "text": "Do you accept credit card ?",
          "src": "payment"
        }
      },
      {
        "_index": "queries",
        "_type": "query",
        "_id": "Uj7PVmABl04oXt2HlLFI",
        "_score": 0.8630463,
        "_source": {
          "text": "Take my order",
          "src": "place_order"
        }
      },
      {
        "_index": "queries",
        "_type": "query",
        "_id": "UT7PVmABl04oXt2HKLFy",
        "_score": 0.6931472,
        "_source": {
          "text": "I want to order food",
          "src": "place_order"
        }
      }
    ]
  },
  "aggregations": {
    "src types": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "place_order",
          "doc_count": 2,
          "best hit": {
            "hits": {
              "total": 2,
              "max_score": 0.8630463,
              "hits": [
                {
                  "_index": "queries",
                  "_type": "query",
                  "_id": "Uj7PVmABl04oXt2HlLFI",
                  "_score": 0.8630463,
                  "_source": {
                    "text": "Take my order",
                    "src": "place_order"
                  }
                }
              ]
            }
          }
        },
        {
          "key": "payment",
          "doc_count": 1,
          "best hit": {
            "hits": {
              "total": 1,
              "max_score": 1.3862944,
              "hits": [
                {
                  "_index": "queries",
                  "_type": "query",
                  "_id": "VD7QVmABl04oXt2HGbGB",
                  "_score": 1.3862944,
                  "_source": {
                    "text": "Do you accept credit card ?",
                    "src": "payment"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
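The per-src results live under "aggregations", not in the top-level "hits" list. To get the flat list asked for in the question, you could walk the buckets as in the sketch below. The function name top_hits_per_src is hypothetical, and it assumes the response has already been parsed into a plain Python dict with the aggregation names "src types" and "best hit" used above.

```python
def top_hits_per_src(response, terms_agg="src types", hits_agg="best hit"):
    """Collect the _source of every top hit from every src bucket.

    Assumes `response` is the parsed JSON search response as a dict.
    """
    results = []
    for bucket in response["aggregations"][terms_agg]["buckets"]:
        # Each bucket carries its own nested hits list from top_hits.
        for hit in bucket[hits_agg]["hits"]["hits"]:
            results.append(hit["_source"])
    return results
```

Buckets come back in the order the terms aggregation returns them (by document count, descending, unless you specify otherwise), so sort the flattened list by _score yourself if you need a relevance ordering across groups.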

Hope that helps!