I have a document structure like this:
"properties": {
"username": {"type":"keyword", "normalizer":"lc"},
"title": {"type":"text", "analyzer":"title_lc", "search_analyzer":"whitespace"},
"text": {"type":"text", "analyzer":"simple"},
"tags": {"type":"keyword","normalizer":"lc"},
"references": {"type":"nested", "properties": {
"name": {"type":"text", "analyzer":"name_lc", "search_analyzer":"whitespace"},
"text": {"type":"text", "analyzer":"simple"},
"tags": {"type":"keyword", "normalizer":"lc"}
}}
}
I then tried an aggregation that combines one non-nested aggregation with one nested aggregation, for example:
{
"_source": false,
"aggs": {
"main_tag": {
"terms": {"field":"tags"}
},
"sub_tag": {
"nested": {"path": "references"},
"aggs":{
"sub_tag_nest":{
"terms": {"field":"references.tags"}
}
}
}
}
}
Now this gives me some tags...
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1496,
"max_score": 1,
"hits": [
{
"_index": "index",
"_type": "doc",
"_id": "unique_id_1",
"_score": 1
},
…
{
"_index": "index",
"_type": "doc",
"_id": "unique_id_10",
"_score": 1
}
]
},
"aggregations": {
"main_tag": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "",
"doc_count": 1
},
{
"key": "one",
"doc_count": 1
},
{
"key": "Test1",
"doc_count": 1
}
]
},
"sub_tag": {
"doc_count": 25357,
"sub_tag_nest": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "any",
"doc_count": 1575
},
{
"key": "one",
"doc_count": 1
},
{
"key": "test",
"doc_count": 1
}
]
}
}
}
}
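Is it possible to get this with fewer hits, or with no hits at all?
Will it scale if there are 100, 1,000, or 100,000 tags?
Can I get more detail than just the document count? Ideally, a list of document IDs for the main tags, and a list of document IDs and names for the sub tags (I know... unlikely).
I thought I could use composite sub-aggregations, but that does not seem to be supported: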
{"_source": false, "aggs": {
"main_tag_c": {
"composite": {"sources": [{
"main_tag": {"terms": {"field": "tags"} }
}]},
"aggs": {"s_main_tag_c_id": {"terms": {"field": "_id"} } }
},
"sub_tag": {
"nested": {"path": "references"},
"aggs":{
"sub_tag_nest_c": {"composite": {"sources": [{
"sub_tag_nest":{"terms": {"field": "references.tags"} }
}]},
"aggs":{
"s_sub_tag_nest_c": {"composite": {"sources": [{
"sub_tag_nest_id":{"terms": {"field": "_id"} }
}, {
"sub_tag_nest_name":{"terms": {"field": "references.name"} }
}]} }
}
}
} } } }
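However, this tells me that a composite aggregation cannot have a parent aggregation. That is too bad, because I am not sure how else I could list both the ID and name pairs under each tag.
It does work for the top-level tags, although you only see a limited list of document IDs under each tag. The documentation also implies that a terms aggregation will always be limited, and that wrapping the terms in a composite should return every keyword as a bucket value.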
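For reference, here is a minimal Python sketch of how I understand paging through a top-level composite aggregation on `tags` would work. The `search` callable is a hypothetical stand-in for whatever client call sends the request body and returns the parsed JSON response, and `composite_body` and `iter_buckets` are names I made up for illustration. The top-level `"size": 0` (to suppress the hits list) and the page size of 100 are my own assumptions, not something taken from my queries above.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def composite_body(field: str, page_size: int,
                   after: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a request body with one top-level composite aggregation on `field`."""
    composite: Dict[str, Any] = {
        "size": page_size,  # number of buckets per page
        "sources": [{"main_tag": {"terms": {"field": field}}}],
    }
    if after is not None:
        # Resume after the last composite key seen on the previous page.
        composite["after"] = after
    # Assumption: "size": 0 at the top level so that no hits are returned.
    return {"size": 0, "aggs": {"main_tag_c": {"composite": composite}}}


def iter_buckets(search: Callable[[Dict[str, Any]], Dict[str, Any]],
                 field: str, page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every composite bucket, requesting pages until one comes back empty."""
    after: Optional[Dict[str, Any]] = None
    while True:
        response = search(composite_body(field, page_size, after))
        agg = response["aggregations"]["main_tag_c"]
        buckets = agg["buckets"]
        if not buckets:
            return
        yield from buckets
        # Prefer the returned after_key; fall back to the last bucket's key.
        after = agg.get("after_key") or buckets[-1]["key"]
```

Each yielded bucket would carry its composite key under `bucket["key"]["main_tag"]` plus a `doc_count`. This only covers the top-level tags, since the nested case is exactly where I am stuck.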