Question

Suppose I have a bunch of documents like this:

{ "foo" : [1, 2, 3] }
{ "foo" : [3, 4, 5] }

For a query run against these documents, I'm looking for a way to return an array of all values of foo (ideally unique values, but duplicates are fine):

{ "foo" : [1, 2, 3, 3, 4, 5] }

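I've looked at the aggregations API, but I can't see how to achieve this, if it is possible at all. I could of course compile the results manually in code, but I may have thousands of documents, and getting the results directly from the query would be much cleaner.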
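For comparison, the manual, client-side merge mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name collect_values and the in-memory docs list are assumptions, and in practice the documents would come from the _source of the search hits.

```python
from typing import Iterable


def collect_values(docs: Iterable[dict], field: str = "foo", unique: bool = False) -> list:
    """Gather every value of `field` across `docs` into one flat list."""
    merged = []
    for doc in docs:
        # A document without the field contributes nothing instead of raising.
        merged.extend(doc.get(field, []))
    if unique:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        merged = list(dict.fromkeys(merged))
    return merged


# Hypothetical in-memory copies of the two example documents.
docs = [{"foo": [1, 2, 3]}, {"foo": [3, 4, 5]}]
print(collect_values(docs))               # [1, 2, 3, 3, 4, 5]
print(collect_values(docs, unique=True))  # [1, 2, 3, 4, 5]
```

This works, but every document has to be transferred to the client first, which is the cost the question is trying to avoid.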

Answer 1

You can use a Scripted Metric Aggregation with a reduce_script.

Set up some test data:

curl -XPUT http://localhost:9200/testing/foo/1 -d '{ "foo" : [1, 2, 3] }'
curl -XPUT http://localhost:9200/testing/foo/2 -d '{ "foo" : [4, 5, 6] }'

Now try this aggregation:

curl -XGET "http://localhost:9200/testing/foo/_search" -d'
{
  "size": 0,
  "aggs": {
    "fooreduced": {
      "scripted_metric": {
        "init_script": "_agg[\"result\"] = []",
        "map_script":  "_agg.result.add(doc[\"foo\"].values)",
        "reduce_script": "reduced = []; for (a in _aggs) { for (entry in a) { word = entry.key; reduced += entry.value } }; return reduced.flatten().sort()"

      }
    }
  }
}'

The call returns this response:

{
  "took": 50,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "fooreduced": {
      "value": [
        1,
        2,
        3,
        4,
        5,
        6
      ]
    }
  }
}
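On the client side, the merged array sits under the aggregation name chosen in the request. A minimal Python sketch of reading it, assuming the response body has already been received as a JSON string (here a trimmed copy of the response above stands in for it):

```python
import json

# Trimmed copy of the response shown above; a real client would use the HTTP body.
raw = """{
  "took": 50,
  "aggregations": {"fooreduced": {"value": [1, 2, 3, 4, 5, 6]}}
}"""

response = json.loads(raw)
# "fooreduced" is the aggregation name used in the request body.
values = response["aggregations"]["fooreduced"]["value"]
# The script does not deduplicate, so do that here if unique values are wanted.
unique_values = sorted(set(values))
print(unique_values)  # [1, 2, 3, 4, 5, 6]
```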

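There may be a solution without .flatten(), but I'm not familiar enough with Groovy (yet) to find one. I also can't say how well this aggregation performs; you will have to test that yourself.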
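To make the three script phases easier to follow, here is a rough plain-Python analogue of what the aggregation does. It is a simulation only, not Elasticsearch code: the function names and the one-document-per-shard split are assumptions for illustration.

```python
from itertools import chain


def init():
    # init_script: every shard starts with its own empty state.
    return {"result": []}


def map_doc(agg, doc):
    # map_script: append the document's value list as ONE element (nested result).
    agg["result"].append(doc["foo"])


def reduce_states(shard_states):
    # reduce_script: concatenate the per-shard lists, then flatten and sort.
    reduced = []
    for state in shard_states:
        for value in state.values():
            reduced += value
    return sorted(chain.from_iterable(reduced))


def map_doc_flat(agg, doc):
    # Variation: extend instead of append, so the state is already flat.
    agg["result"].extend(doc["foo"])


# Hypothetical split: one test document per shard.
shards = [[{"foo": [1, 2, 3]}], [{"foo": [4, 5, 6]}]]
states = []
for shard_docs in shards:
    agg = init()
    for doc in shard_docs:
        map_doc(agg, doc)
    states.append(agg)
print(reduce_states(states))  # [1, 2, 3, 4, 5, 6]

# With the flat variation there is nothing left to flatten
# (a single state is used here for brevity).
flat = init()
for shard_docs in shards:
    for doc in shard_docs:
        map_doc_flat(flat, doc)
print(sorted(flat["result"]))  # [1, 2, 3, 4, 5, 6]
```

In this analogue, extending in the map step removes the need to flatten in the reduce step. Whether the equivalent change behaves the same way in the Groovy script has not been tested here.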

Elasticsearch - combining fields from multiple documents
