Question

Is it possible to return more complex keys for bucket aggregations in Elasticsearch? By default, each bucket key is a string:

Query:

{
  "aggregations" : {
    "file.name" : {
      "terms" : {
        "field" : "file.name"
      },
      "aggregations" : {
        "level" : {
          "terms" : {
            "field" : "level"
          }
        }
      }
    }
  }
}

Result:

{
    "aggregations": {
        "file.name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{
                "key": "test-1.pdf",
                "doc_count": 3,
                "level": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [{
                        "key": "Warning",
                        "doc_count": 2
                    }, {
                        "key": "Error",
                        "doc_count": 1
                    }]
                }
            }]
        }
    }
}

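Here I have bucketed the aggregation by file name. I really need more fields than just the name in the key. For example, I would like to see that document's ID as well: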

{
    "key": {
      "id": "1",
      "name": "test-1.pdf"
    },
    "doc_count": 3,
    "level": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{
            "key": "Warning",
            "doc_count": 2
        }, {
            "key": "Error",
            "doc_count": 1
        }]
    }
}

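I have not found anything that meets this requirement. The closest I have found is scripting: a terms aggregation with a script lets me mash the fields together into a single string: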

Query:

{
  "aggregations" : {
    "file" : {
      "terms" : {
        "script" : {
          "inline" : "doc['file.id'].value ? doc['file.id'].value + '|' + doc['file.name'].value  : null"
        },
        "size" : 1000
      },
      "aggregations" : {
        "level" : {
          "terms" : {
            "field" : "level"
          }
        }
      }
    }
  }
}

Result:

{
    "aggregations": {
        "file": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{
                "key": "f-2|test-1.pdf",
                "doc_count": 3,
                "level": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [{
                        "key": "Warning",
                        "doc_count": 2
                    }, {
                        "key": "Error",
                        "doc_count": 1
                    }]
                }
            }]
        }
    }
}

I suppose this would work, but it is a bit dirty. Am I missing a better option? I think there must be some other creative solution to this problem.
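
If the concatenated key is used as a stopgap, the client can turn it back into the structured key shown earlier. The following is only a minimal sketch: split_bucket_key and SEPARATOR are hypothetical names, and it assumes the separator is the same '|' used in the script and that file IDs never contain that character.

```python
from typing import Any, Dict

# Assumption: must match the separator used in the terms script.
SEPARATOR = "|"


def split_bucket_key(bucket: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the bucket whose key is an {"id", "name"} object."""
    # Split on the first separator only, so a '|' inside the file name survives.
    file_id, _, name = bucket["key"].partition(SEPARATOR)
    structured = dict(bucket)  # shallow copy; the input bucket is not modified
    structured["key"] = {"id": file_id, "name": name}
    return structured


if __name__ == "__main__":
    print(split_bucket_key({"key": "f-2|test-1.pdf", "doc_count": 3}))
```

Splitting on the first separator keeps file names intact, but the approach still depends on the ID being free of the separator, which is part of why it feels fragile.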

Answer 1

Posting a possible solution based on Val's comment. It uses a top_hits sub-aggregation to include the other fields I need in the response.

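It makes for a very verbose response. I still need to load test this to see how it performs. If performance is good and no better solution turns up, I will come back and mark this as answered.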

Query:

{
    "aggregations": {
        "file_id": {
            "terms": {
                "field": "file.id"
            },
            "aggregations": {
                "level": {
                    "terms": {
                        "field": "level"
                    }
                },
                "top_hits": {
                    "top_hits": {
                        "size": 1,
                        "fields": ["file.name", "file.path"]
                    }
                }
            }
        }
    }
}

Response:

{
    "aggregations": {
        "file_id": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{
                "key": "f-2",
                "doc_count": 3,
                "level": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [{
                        "key": "Warning",
                        "doc_count": 2
                    }, {
                        "key": "Error",
                        "doc_count": 1
                    }]
                },
                "top_hits": {
                    "hits": {
                        "total": 3,
                        "max_score": 1.0,
                        "hits": [{
                            "_index": "logs-current",
                            "_type": "workspaceLog",
                            "_id": "log-5",
                            "_score": 1.0,
                            "fields": {
                                "file.name": ["test-1.pdf"]
                            }
                        }]
                    }
                }
            }]
        }
    }
}
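
Because the response is so verbose, it may help to collapse each bucket on the client into the complex-key shape asked for in the question. This is a rough sketch, not part of the original answer: complex_key_buckets is a hypothetical helper, and it assumes the aggregation names (file_id, level, top_hits) and the response layout shown above, where each entry under "fields" is a list of values.

```python
from typing import Any, Dict, List


def complex_key_buckets(response: Dict[str, Any],
                        agg: str = "file_id",
                        hits_agg: str = "top_hits") -> List[Dict[str, Any]]:
    """Collapse each bucket into {"key": {...}, "doc_count": n, "level": {...}}."""
    rows = []
    for bucket in response["aggregations"][agg]["buckets"]:
        key = {"id": bucket["key"]}
        hits = bucket[hits_agg]["hits"]["hits"]
        if hits:
            # Field values arrive as lists; keep the first value of each field.
            for field, values in hits[0].get("fields", {}).items():
                key[field] = values[0] if values else None
        # Reduce the nested terms aggregation to a {level: count} mapping.
        levels = {b["key"]: b["doc_count"] for b in bucket["level"]["buckets"]}
        rows.append({"key": key, "doc_count": bucket["doc_count"], "level": levels})
    return rows


# Trimmed copy of the response above, used as sample input.
sample = {"aggregations": {"file_id": {"buckets": [{
    "key": "f-2",
    "doc_count": 3,
    "level": {"buckets": [{"key": "Warning", "doc_count": 2},
                          {"key": "Error", "doc_count": 1}]},
    "top_hits": {"hits": {"total": 3, "hits": [{
        "_id": "log-5",
        "fields": {"file.name": ["test-1.pdf"]},
    }]}},
}]}}}

if __name__ == "__main__":
    for row in complex_key_buckets(sample):
        print(row)
```

The helper only reshapes what the server already returned, so it does nothing to reduce the size of the response on the wire; the load-testing concern above still applies.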

Complex keys for bucket aggregations in Elasticsearch
