Question

I want to implement an aggregation that only returns the documents whose frequency is above a certain threshold.

For instance, here is the aggregation that returns every value together with its document count:

AggregationBuilder aggregation = AggregationBuilders
                .terms("agg").field("column_name");

This gives me the count of documents for each value in column_name:

[{"doc_count":30,"key":"val1"},{"doc_count":29,"key":"val2"},{"doc_count":23,"key":"val3"}]

Now, let's say I don't want all of these buckets. I only want those that have a doc_count greater than 25.

So the ideal result would be:

[{"doc_count":30,"key":"val1"},{"doc_count":29,"key":"val2"}]

How do I apply such a filter to my aggregation? I was looking at FilterBuilders and filter aggregations, but they are for filtering on values within the documents. For instance, I can apply a filter to only get the documents where column_name has the value xza.

But that is not what I am looking for. I want to apply a threshold to the doc_count values after the aggregation has been applied.

Is this possible? I am using the Elasticsearch Java API, version 1.7.2.

Answer 1

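The terms aggregation has a built-in option called min_doc_count. See here for its documentation. I haven't used the Java API, but this example appears to use .minDocCount() in one of its snippets (ctrl-f 'minDocCount').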
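
To make the threshold semantics concrete, here is a minimal plain-Java sketch (no Elasticsearch dependency) that mimics what min_doc_count does to the buckets from the question. The class name MinDocCountSketch and the helper filterBuckets are hypothetical, introduced only for illustration. Note that min_doc_count is an inclusive lower bound, so "greater than 25" corresponds to a value of 26. The comment at the top shows what the server-side call would presumably look like, assuming the 1.x terms builder exposes minDocCount(long) as the answer suggests; it has not been run against 1.7.2 here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Assumed server-side equivalent (not verified against 1.7.2 here):
//   AggregationBuilders.terms("agg").field("column_name").minDocCount(26);
class MinDocCountSketch {

    // Keeps only the buckets whose count is >= minDocCount.
    // The bound is inclusive, mirroring min_doc_count.
    static Map<String, Long> filterBuckets(Map<String, Long> buckets, long minDocCount) {
        Map<String, Long> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Long> bucket : buckets.entrySet()) {
            if (bucket.getValue() >= minDocCount) {
                kept.put(bucket.getKey(), bucket.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Bucket counts taken from the question.
        Map<String, Long> buckets = new LinkedHashMap<>();
        buckets.put("val1", 30L);
        buckets.put("val2", 29L);
        buckets.put("val3", 23L);

        // "greater than 25" means a minimum of 26.
        System.out.println(filterBuckets(buckets, 26)); // prints {val1=30, val2=29}
    }
}
```

Doing the filtering on the server with min_doc_count is generally preferable to filtering on the client like this, since buckets below the threshold are never sent back in the response.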
