Here are some of my sample documents:
doc1
{
  "occassion": "Birthday",
  "dessert": "gingerbread"
}
doc2
{
  "occassion": "Wedding",
  "dessert": "friand"
}
doc3
{
  "occassion": "Bethrothal",
  "dessert": "gingerbread"
}
When I run a simple terms aggregation on the "dessert" field, I get results like the following:
  "aggregations": {
    "desserts": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "gingerbread",
          "doc_count": 2
        },
        {
          "key": "friand",
          "doc_count": 1
        }
      ]
    }
  }
}
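The problem is that if there are many documents and I need to know how many unique keywords exist under the field name "dessert", counting the buckets myself would take a lot of time. Is there a way to get only the number of unique terms under a specified field name?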
Answer 0 (score: 2)
The cardinality aggregation seems to be exactly what you are looking for: https://www.elastic.co/guide/en/elasticsearch/guide/current/cardinality.html
Query:
{
  "size": 0,
  "aggs": {
    "distinct_desserts": {
      "cardinality": {
        "field": "dessert"
      }
    }
  }
}
This would return something like:
"aggregations": {
  "distinct_desserts": {
    "value": 2
  }
}
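To make the difference between the two aggregations concrete, here is a minimal Python sketch that mimics them locally on the three sample documents from the question. It does not talk to Elasticsearch; the helper names terms_buckets and distinct_count are hypothetical and exist only for illustration. Note that the real cardinality aggregation returns an approximate count, while this sketch counts exactly.

```python
from collections import Counter

# The three sample documents from the question (field names kept as posted).
docs = [
    {"occassion": "Birthday", "dessert": "gingerbread"},
    {"occassion": "Wedding", "dessert": "friand"},
    {"occassion": "Bethrothal", "dessert": "gingerbread"},
]


def terms_buckets(documents, field):
    """Count documents per distinct value, similar to terms aggregation buckets."""
    counts = Counter(doc[field] for doc in documents if field in doc)
    # most_common() sorts by descending count, like the default terms ordering.
    return [{"key": key, "doc_count": n} for key, n in counts.most_common()]


def distinct_count(documents, field):
    """Exact number of distinct values; cardinality approximates this number."""
    return len({doc[field] for doc in documents if field in doc})


print(terms_buckets(docs, "dessert"))
print(distinct_count(docs, "dessert"))
```

The first call produces one entry per distinct dessert, which is what you would have to count by hand; the second returns only the number of distinct values, which is the single figure the cardinality aggregation gives you.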
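Answer 1 (score: 0)
The cardinality aggregation returns an approximate count (it is based on the HyperLogLog++ algorithm), so I suggest setting precision_threshold higher than the number of distinct values you expect in order to get near-exact results. The value 100 below is only an example, and the index and type in the request path should be replaced with your own.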
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "count_distinct_desserts": {
      "cardinality": {
        "field": "dessert",
        "precision_threshold": 100
      }
    }
  }
}
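If you build the request in code, one possible way to pick the threshold is sketched below in Python. The helper cardinality_request and its parameters are hypothetical, not part of any Elasticsearch client; the only assumption taken from the Elasticsearch documentation is that precision_threshold is capped at 40000 (larger values behave like 40000). The sketch only builds the request body and does not send it.

```python
import json

# Documented upper limit for precision_threshold; larger values act like 40000.
MAX_PRECISION_THRESHOLD = 40000


def cardinality_request(field, expected_distinct, agg_name="count_distinct"):
    """Hypothetical helper: build a cardinality request body for one field.

    The threshold is set to the expected number of distinct values,
    clamped to the range 0..40000.
    """
    threshold = max(0, min(expected_distinct, MAX_PRECISION_THRESHOLD))
    return {
        "size": 0,  # skip the hits, only the aggregation result is needed
        "aggs": {
            agg_name: {
                "cardinality": {
                    "field": field,
                    "precision_threshold": threshold,
                }
            }
        },
    }


body = cardinality_request("dessert", expected_distinct=5000)
print(json.dumps(body, indent=2))
```

A higher threshold costs more memory per aggregation, so it is reasonable to size it to the number of distinct values you actually expect rather than always using the maximum.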