Elasticsearch: aggregating on known object keys (rather than values)

Time: 2015-09-23 08:54:12

Tags: search elasticsearch aggregate

I have an Elasticsearch index that contains documents like these:

[{
  "_index": "products",
  "_type": "product",
  "_id": "100",
  "_score": 1,
  "_source": {
    "id": "100",
    "name": "Product 1",
    "catalogue": {
      "categories": {
        "cat1": ["h1", "spin2"],
        "cat5": ["h2", "spin2"]
      }
    }
  }
},
{
  "_index": "products",
  "_type": "product",
  "_id": "100",
  "_score": 1,
  "_source": {
    "id": "100",
    "name": "Product 1",
    "catalogue": {
      "categories": {
        "cat2": ["d1", "spin2"],
        "cat5": ["h2", "spin2"]
      }
    }
  }
}]

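I need to aggregate on the known categories, that is, on the keys of catalogue.categories rather than their values. For the documents above, the expected result is: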

"aggregations": {
  "categories": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "cat1",
        "doc_count": 1
      },
      {
        "key": "cat2",
        "doc_count": 1
      },
      {
        "key": "cat5",
        "doc_count": 2
      }
    ]
  }
}

How should I define my search call?

GET _search
{
  "aggregations": {
    "categories": {
      "terms": {
        ???
      }
    }
  }
}

Update: should I use the script key, as below? That would probably have an impact on performance, right?

GET _search
{
  "aggregations": {
    "categories": {
      "terms": {
        "script" : "????"
      }
    }
  }
}

1 answer:

Answer 0 (score: 0)

You can do something like this:

GET /products/product/_search?search_type=count
{
  "aggs": {
    "cats": {
      "terms": {
        "script": "categories=_source.catalogue.categories;terms=[];for(categ in categories.keySet())terms+=categ;return terms"
      }
    }
  }
}
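To make it clearer what the script does, here is a minimal plain-Python sketch of the same logic, run locally rather than inside Elasticsearch. For each document it collects the keys of catalogue.categories, and the counting step mimics what the terms aggregation then does with those keys. The function name count_category_keys and the trimmed-down sample documents are only illustrative assumptions.

```python
from collections import Counter


def count_category_keys(sources):
    """Count, per category key, how many documents contain that key."""
    counts = Counter()
    for source in sources:
        # Missing objects are treated as empty, so such documents add nothing.
        categories = source.get("catalogue", {}).get("categories", {})
        # Like the script, emit the keys of the object, not its values.
        counts.update(categories.keys())
    return dict(counts)


# The two _source bodies from the question, reduced to the relevant field.
docs = [
    {"catalogue": {"categories": {"cat1": ["h1", "spin2"], "cat5": ["h2", "spin2"]}}},
    {"catalogue": {"categories": {"cat2": ["d1", "spin2"], "cat5": ["h2", "spin2"]}}},
]

result = count_category_keys(docs)
print(result)
```

For the two sample documents this gives a count of 1 for cat1 and cat2 and 2 for cat5, which matches the buckets the question expects.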

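But yes, it will have an impact on performance. You will need to test it and see how it behaves. Make sure to run the same query several times, because the first run may take longer to return, which is normal.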
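One way to run that kind of test is sketched below in Python. It is only a rough timing helper, not a proper benchmark: run_query stands for any callable that sends the search request (for example through whatever HTTP or Elasticsearch client you already use), and the names time_runs and summarize are hypothetical.

```python
import time


def time_runs(run_query, repeats=5):
    """Call run_query `repeats` times and return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report the first run separately from the average of the later runs."""
    # The first run is kept apart because it may include warm-up cost.
    later = durations[1:] if len(durations) > 1 else durations
    return {"first": durations[0], "later_avg": sum(later) / len(later)}


if __name__ == "__main__":
    # Placeholder workload; replace the lambda with the real search call.
    timings = time_runs(lambda: sum(range(100000)), repeats=5)
    print(summarize(timings))
```

Comparing the first value with the average of the later runs shows how much of the cost is warm-up and how much is the script itself.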