Elasticsearch: getting distinct field values from multi_match

Time: 2018-07-24 09:44:07

Tags: elasticsearch

My query, which contains several multi_match clauses, looks like this:

"query": {
  "bool": {
    "should": [
      {
        "multi_match": {
          "query": "test",
          "fields": ["field1^15", "field2^8"],
          "tie_breaker": 0.2,
          "minimum_should_match": "50%"
        }
      },
      {
        "multi_match": {
          "query": "test2",
          "fields": ["field1^15", "field2^8"],
          "tie_breaker": 0.2,
          "minimum_should_match": "50%"
        }
      }
    ]
  }
}

I want to get all the distinct field1 values that match the query. How can I do that?

Edit: the mapping:

"field1": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "analyzer": "nGram_analyzer"
        }

This is what I have tried so far (I still get the same field1 value several times):

"query": {
  "bool": {
    "should": [
      {
        "multi_match": {
          "query": "test",
          "fields": ["field1^15", "field2^8"],
          "tie_breaker": 0.2,
          "minimum_should_match": "50%"
        }
      },
      {
        "multi_match": {
          "query": "test2",
          "fields": ["field1^15", "field2^8"],
          "tie_breaker": 0.2,
          "minimum_should_match": "50%"
        }
      }
    ]
  }
},
"aggs": {
    "field1": {
      "terms": {
        "field": "field1.keyword",
        "size": 100 //1
      }
    }
  }

Update:

The query

GET /test/test/_search
{
  "_source": ["field1"],
  "size": 10000,
  "query": {
    "multi_match": {
      "query": "test",
      "fields": ["field1^15", "field2^8"],
      "tie_breaker": 0.2,
      "minimum_should_match": "50%"
    }
  },
  "aggs": {
    "field1": {
      "terms": {
        "field": "field1.keyword",
        "size": 1
      }
    }
  }
}

produces

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 35,
    "max_score": 110.26815,
    "hits": [
      {
        "_index": "test",
        "_type": "test",
        "_id": "AVzz99c4X4ZbfhscNES7",
        "_score": 110.26815,
        "_source": {
          "field1": "test-hier"
        }
      },
      {
        "_index": "test",
        "_type": "test",
        "_id": "AVzz8JWGX4ZbfhscMwe_",
        "_score": 107.45808,
        "_source": {
          "field1": "test-hier"
        }
      },
      {
        "_index": "test",
        "_type": "test",
        "_id": "AVzz8JWGX4ZbfhscMwe_",
        "_score": 107.45808,
        "_source": {
          "field1": "test-da"
        }
      },
      ...

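So there should really be only one "test-hier".

1 answer:

Answer 0: (score: 0)

You can add a terms aggregation on the field1.keyword field, which gives you all the distinct values (you can change size to any other value that better matches the cardinality of the field). With "size": 0 at the top level no hits are returned, so the distinct values are read from the aggregation buckets; the top_hits sub-aggregation with size 1 additionally returns one matching document per distinct value: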

{
  "size": 0,
  "query": {...},
  "aggs": {
    "field1": {
      "terms": {
        "field": "field1.keyword",
        "size": 100
      },
      "aggs": {
        "single_hit": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}
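
To make the last step concrete, here is a minimal Python sketch of how the distinct values could be read out of the aggregation result on the client side. It assumes the response has already been parsed into a dict (for example from the JSON returned by the search request). The helper names build_distinct_values_body and distinct_values, and the sample_response data, are hypothetical and only illustrate the shape of a terms aggregation result; they are not part of the original answer or of any Elasticsearch client library.

```python
from typing import Any, Dict, List


def build_distinct_values_body(
    query: Dict[str, Any],
    field: str = "field1.keyword",
    size: int = 100,
) -> Dict[str, Any]:
    """Wrap a query in a request body that asks for the distinct values of `field`."""
    return {
        "size": 0,  # no hits needed; the buckets carry the distinct values
        "query": query,
        "aggs": {"field1": {"terms": {"field": field, "size": size}}},
    }


def distinct_values(response: Dict[str, Any], agg_name: str = "field1") -> List[str]:
    """Return the bucket keys of a terms aggregation, in the order returned."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return [bucket["key"] for bucket in buckets]


# Hypothetical response fragment, shaped like a terms aggregation result.
sample_response = {
    "aggregations": {
        "field1": {
            "buckets": [
                {"key": "test-hier", "doc_count": 2},
                {"key": "test-da", "doc_count": 1},
            ]
        }
    }
}

print(distinct_values(sample_response))
```

Each bucket key appears once no matter how many documents share that value, which is why the values are taken from the buckets rather than from the hits list. If the field has more distinct values than the requested size, the terms aggregation only returns the top buckets, so size has to be chosen with the field's cardinality in mind.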