Question

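I am trying to find the minimum value in a 2-level nested structure (a separate minimum for each document).

So far I have been able to build an aggregation that computes the minimum of all nested values across the search results, but not separately for each document.

My example schema: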

from elasticsearch_dsl import DocType, Date, Float, Integer, Nested, String

class MyExample(DocType):
    myexample_id = Integer()
    nested1 = Nested(
        properties={
            'timestamp': Date(),
            'foo': Nested(
                properties={
                    'bar': Float(),
                }
            )
        }
    )
    nested2 = Nested(
        multi=False,
        properties={
            'x': String(),
            'y': String(),
        }
    )

This is how I search and aggregate:

from elasticsearch_dsl import Search, Q

search = Search().filter(
    'nested', path='nested1', inner_hits={},
    query=Q(
        'range', **{
            'nested1.timestamp': {
                'gte': exampleDate1,
                'lte': exampleDate2
            }
        }
    )
).filter(
    'nested', path='nested2', inner_hits={'name': 'x'},
    query=Q(
        'term', **{
            'nested2.x': x
        }
    )
).filter(
    'nested', path='nested2', inner_hits={'name': 'y'},
    query=Q(
        'term', **{
            'nested2.y': y
        }
    )
)

search.aggs.bucket(
    'nested1', 'nested', path='nested1'
).bucket(
    'nested_foo', 'nested', path='nested1.foo'
).metric(
    'min_bar', 'min', field='nested1.foo.bar'
)

Basically, what I need is the minimum of all nested nested1.foo.bar values for each unique MyExample (each one has a unique myexample_id field).

Answer 1

If you want the minimum per document, put all of the nested buckets inside a terms aggregation on the myexample_id field:

search.aggs.bucket(
  'docs', 'terms', field='myexample_id'
).bucket(
  'nested1', 'nested', path='nested1'
).bucket(
  'nested_foo', 'nested', path='nested1.foo'
).metric(
  'min_bar', 'min', field='nested1.foo.bar'
)
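
The result comes back as one bucket per myexample_id, with the nested sub-aggregations attached to each bucket under the names chosen above. A minimal sketch of reading it, assuming the usual Elasticsearch response layout for terms, nested and min aggregations; the helper name min_bar_per_doc and the sample response are made up for illustration:

```python
def min_bar_per_doc(aggregations):
    """Map each myexample_id to its minimum nested1.foo.bar value.

    `aggregations` is the "aggregations" part of the search response as a
    plain dict, e.g. search.execute().to_dict()["aggregations"].
    """
    result = {}
    for bucket in aggregations["docs"]["buckets"]:
        # The keys mirror the aggregation names: docs > nested1 > nested_foo > min_bar.
        # "value" is None when a document has no bar values at all.
        result[bucket["key"]] = bucket["nested1"]["nested_foo"]["min_bar"]["value"]
    return result


# Hypothetical response fragment, shaped like what Elasticsearch returns.
sample = {
    "docs": {
        "buckets": [
            {"key": 1, "doc_count": 1,
             "nested1": {"doc_count": 2,
                         "nested_foo": {"doc_count": 3, "min_bar": {"value": 0.5}}}},
            {"key": 2, "doc_count": 1,
             "nested1": {"doc_count": 1,
                         "nested_foo": {"doc_count": 1, "min_bar": {"value": 1.25}}}},
        ]
    }
}

print(min_bar_per_doc(sample))
```

Keep in mind that a terms aggregation only returns the top 10 buckets by default, so its size parameter would likely need to be raised to cover every matching document.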

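Note that this aggregation can be very expensive to compute, because it has to create a bucket for every document. For a use case like this, it may be easier to compute the per-document minimum in a script_field or in the application.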
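
For the application-side option, the minimum can be taken straight from each hit's _source. A minimal sketch, assuming the hits are available as plain dicts and that nested1 and foo are stored either as a list of objects or as a single object; the names as_list, min_bar and the sample hits are hypothetical:

```python
def as_list(value):
    """Normalize a nested field that may be missing, a single object, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def min_bar(source):
    """Return the smallest nested1.foo.bar in one document's _source, or None."""
    values = [
        foo["bar"]
        for entry in as_list(source.get("nested1"))
        for foo in as_list(entry.get("foo"))
        if foo.get("bar") is not None
    ]
    return min(values) if values else None


# Hypothetical hits, shaped like response["hits"]["hits"] from Elasticsearch.
hits = [
    {"_source": {"myexample_id": 1,
                 "nested1": [{"foo": [{"bar": 2.0}, {"bar": 0.5}]},
                             {"foo": [{"bar": 3.0}]}]}},
    {"_source": {"myexample_id": 2,
                 "nested1": {"foo": {"bar": 1.25}}}},
    {"_source": {"myexample_id": 3, "nested1": []}},
]

per_doc = {h["_source"]["myexample_id"]: min_bar(h["_source"]) for h in hits}
print(per_doc)
```

This avoids the per-document bucket entirely, at the cost of transferring the nested values to the client.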

Python elasticsearch DSL aggregation/metric of nested values per document