Elasticsearch: get distinct records by a specified field

Time: 2017-03-29 04:13:14

Tags: elasticsearch

I want to search for results by some fields and sort them by another field ("myscore"). This is the ES (5.2.2) query:

{
  "sort": [
    {"myscore": {"order" :"desc"}}
  ],
  "query": {
    "query_string" : {
       "query" : "(field1:foo) AND (field2:bar)"
    }
  }
}

With that, I get this:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 3,
        "max_score": null,
        "hits": [
            {
                "_index": "fooindex",
                "_type": "footype",
                "_id": "1",
                "_score": null,
                "_source": {
                    "field1": "foo",
                    "field2": "bar",
                    "x_id": "x001",
                    "myscore": 0.9
                },
                "sort": [
                    0.9
                ]
            },
            {
                "_index": "fooindex",
                "_type": "footype",
                "_id": "2",
                "_score": null,
                "_source": {
                    "field1": "foo",
                    "field2": "bar",
                    "x_id": "x001",
                    "myscore": 0.8
                },
                "sort": [
                    0.8
                ]
            },
            {
                "_index": "fooindex",
                "_type": "footype",
                "_id": "3",
                "_score": null,
                "_source": {
                    "field1": "foo",
                    "field2": "bar",
                    "x_id": "x002",
                    "myscore": 0.7
                },
                "sort": [
                    0.7
                ]
            }
        ]
    }
}

However, I want distinct results based on the field "x_id", like this:

{
    "_index": "fooindex",
    "_type": "footype",
    "_id": "1",
    "_score": null,
    "_source": {
        "field1": "foo",
        "field2": "bar",
        "x_id": "x001",
        "myscore": 0.9
    },
    "sort": [
        0.9
    ]
},
{
    "_index": "fooindex",
    "_type": "footype",
    "_id": "3",
    "_score": null,
    "_source": {
        "field1": "foo",
        "field2": "bar",
        "x_id": "x002",
        "myscore": 0.7
    },
    "sort": [
        0.7
    ]
}

The equivalent SQL would be a "group by x_id":

select * from footable group by x_id

I tried an aggregation:

"aggs": {
    "unique_xid": {
     "terms": {
       "field": "x_id"
     }
    }
},

The result is:

"aggregations": {
   "unique_xid": {
      "buckets": [
         {
            "key": "x001",
            "doc_count": 2
         },
         {
            "key": "x002",
            "doc_count": 1
         }
      ]
   }
}

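The problem is that the aggregation result drops the other fields of the documents (each bucket only has the key and a count), and the buckets are ordered by "count", not by "myscore". Is there a way to get distinct results by a specified field?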

1 Answer:

Answer 0: (score: 0)

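Since a bucket may contain several documents, and you want to order the buckets by the maximum myscore among the documents in each one, you can add a max sub-aggregation and order the terms by it: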

"aggs": {
   "unique_xid": {
      "terms": {
         "field": "x_id",
         "order": {
            "score": "desc"
         }
      },
      "aggs": {
         "score": {
            "max": {
               "field": "myscore"
            }
         }
      }
   }
},
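
Ordering the buckets fixes the sort order, but each bucket still carries only the key and the count. One way to also get the best-scoring document for each x_id back is to nest a top_hits sub-aggregation next to the max one. What follows is a sketch under assumptions, not a tested query: the name top_doc is made up for illustration, "size": 0 is there only to suppress the regular hits, and x_id is assumed to be mapped so that it can be aggregated (as it evidently is in the question).

```json
{
  "size": 0,
  "query": {
    "query_string": {
      "query": "(field1:foo) AND (field2:bar)"
    }
  },
  "aggs": {
    "unique_xid": {
      "terms": {
        "field": "x_id",
        "order": { "score": "desc" }
      },
      "aggs": {
        "score": {
          "max": { "field": "myscore" }
        },
        "top_doc": {
          "top_hits": {
            "size": 1,
            "sort": [ { "myscore": { "order": "desc" } } ]
          }
        }
      }
    }
  }
}
```

Each bucket should then hold one hit under top_doc.hits.hits, with its full _source. The terms aggregation returns only the top 10 buckets by default, so set "size" on it if there are more distinct x_id values. Field collapsing ("collapse" in the search body) does this more directly, but it was introduced in Elasticsearch 5.3, so it is not available on 5.2.2.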