I want to implement an aggregation that only returns the terms (buckets) whose document count is above a certain threshold.
For instance, here is the aggregation that returns every value together with its count:
AggregationBuilder aggregation = AggregationBuilders
.terms("agg").field("column_name");
This gives me the document count for each value in column_name:
[{"doc_count":30,"key":"val1"},{"doc_count":29,"key":"val2"},{"doc_count":23,"key":"val3"}]
Now, let's say I don't want all of these buckets. I only want those that have a doc_count greater than 25.
So the ideal result would be:
[{"doc_count":30,"key":"val1"},{"doc_count":29,"key":"val2"}]
How do I apply such a filter to my aggregation? I was looking at FilterBuilders and filter aggregations, but they apply filters to values within the documents. For instance, I can apply a filter to only get the documents where column_name == xza, but that is not what I am looking for. I want to apply a threshold to the doc_count values after the aggregation has been computed.
Is this possible? I am using the Elasticsearch Java API, version 1.7.2.
Answer 0 (score: 1)
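The terms aggregation has a built-in option called min_doc_count. See the documentation for it here. I haven't used the Java API, but this example appears to use .minDocCount() in one of its snippets (Ctrl-F for 'minDocCount').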
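One detail worth keeping in mind: min_doc_count keeps buckets with at least that many documents, so "greater than 25" corresponds to a value of 26. In the Java API this would presumably mean chaining .minDocCount(26) onto the terms builder from the question, though I have not run that against 1.7.2. To illustrate the semantics without a cluster, here is a minimal plain-Java sketch that counts values and applies the same threshold. The class and method names (MinDocCountSketch, termsWithMinDocCount) are made up for illustration and are not part of Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a terms aggregation with min_doc_count.
// It does not talk to Elasticsearch; it only mimics the filtering rule.
public class MinDocCountSketch {

    // Counts each distinct value, then keeps only the values whose
    // count is at least minDocCount (inclusive, like min_doc_count).
    public static Map<String, Long> termsWithMinDocCount(List<String> values, long minDocCount) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        Map<String, Long> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() >= minDocCount) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same counts as in the question: val1=30, val2=29, val3=23.
        List<String> columnValues = new ArrayList<>();
        columnValues.addAll(Collections.nCopies(30, "val1"));
        columnValues.addAll(Collections.nCopies(29, "val2"));
        columnValues.addAll(Collections.nCopies(23, "val3"));

        // "Greater than 25" means a minimum of 26.
        System.out.println(termsWithMinDocCount(columnValues, 26));
        // prints {val1=30, val2=29}
    }
}
```

Doing this inside the aggregation is preferable to filtering on the client, since the unwanted buckets never leave the cluster; the client-side loop above is only there to show which buckets survive.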