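I want to apply a filter after an aggregation query. For example, with the aggregation query below, I want to get only the entries that contain "windows".
Note: I would rather not use include, because it relies on a regular expression, which is very time-consuming, and it cannot ignore case.
Query: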
GET /record_new/_search
{
  "size": 0,
  "aggs": {
    "software_tags": {
      "terms": {
        "field": "software_tags.keyword",
        "size": 100
      }
    }
  }
}
Response:
{
"took": 77,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5706542,
"max_score": 0,
"hits": []
},
"aggregations": {
"software_tags": {
"doc_count_error_upper_bound": 5514,
"sum_other_doc_count": 581800,
"buckets": [
{
"key": "Microsoft Windows",
"doc_count": 70641
},
{
"key": "Bitcoin",
"doc_count": 35423
},
{
"key": "Linux",
"doc_count": 33230
},
{
"key": "ICQ",
"doc_count": 21934
},
{
"key": "PHP",
"doc_count": 20562
},
{
"key": "Windows XP",
"doc_count": 19720
},
{
"key": "Android (operating system)",
"doc_count": 17774
},
{
"key": "C++",
"doc_count": 14792
},
{
"key": "Pretty Good Privacy",
"doc_count": 14307
},
{
"key": "Tor (anonymity network)",
"doc_count": 14110
}
]
}
}
}
I also tried adding a filter, but the output is not what I expected: it still contains Linux. I don't know what is happening here.
GET /record_new/_search
{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "term": { "software_tags": "windows" }
      }
    }
  },
  "aggs": {
    "software_tags": {
      "terms": {
        "field": "software_tags.keyword",
        "size": 10
      }
    }
  }
}
Output:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 93181,
"max_score": 0,
"hits": []
},
"aggregations": {
"software_tags": {
"doc_count_error_upper_bound": 1640,
"sum_other_doc_count": 171831,
"buckets": [
{
"key": "Microsoft Windows",
"doc_count": 70641
},
{
"key": "Windows XP",
"doc_count": 19720
},
{
"key": "Windows 7",
"doc_count": 12692
},
{
"key": "Linux",
"doc_count": 12311
},
{
"key": "Windows Vista",
"doc_count": 10172
},
{
"key": "Windows NT",
"doc_count": 5417
},
{
"key": "Windows Registry",
"doc_count": 5055
},
{
"key": "Windows 8",
"doc_count": 4829
},
{
"key": "Windows 2000",
"doc_count": 4738
},
{
"key": "Windows 10",
"doc_count": 4611
}
]
}
}
}
Answer 0: (score: 0)
Try this query. It should find the records that have "windows" in software_tags:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "software_tags:(*windows* AND NOT *linux* AND NOT *<next OS name to exclude>*)",
            "analyze_wildcard": true
          }
        }
      ]
    }
  },
  "aggs": {
    "software_tags": {
      "terms": {
        "field": "software_tags.keyword",
        "size": 10
      }
    }
  }
}
It may be somewhat slower than a regular query, but that is because of the wildcards in the query.
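Keep in mind that the query part only restricts which documents are aggregated. The terms aggregation still counts every tag on the matching documents, so a tag that merely co-occurs with a Windows tag (Linux in the question's output) keeps its own bucket. If the goal is to filter the buckets themselves without using include, one option is to do it on the client after the response arrives. The following is a minimal Python sketch of that idea, not something taken from the question: the function name filter_buckets is hypothetical, and the sample response is a trimmed copy of the output shown in the question.

```python
def filter_buckets(response, agg_name, needle):
    """Keep only buckets whose key contains `needle`, ignoring case.

    `response` is the parsed JSON body of a _search call (a plain dict).
    `filter_buckets` is a hypothetical helper, not an Elasticsearch API.
    """
    needle = needle.casefold()
    buckets = response["aggregations"][agg_name]["buckets"]
    return [b for b in buckets if needle in str(b["key"]).casefold()]


# Trimmed copy of the output from the question (first five buckets only).
response = {
    "aggregations": {
        "software_tags": {
            "buckets": [
                {"key": "Microsoft Windows", "doc_count": 70641},
                {"key": "Windows XP", "doc_count": 19720},
                {"key": "Windows 7", "doc_count": 12692},
                {"key": "Linux", "doc_count": 12311},
                {"key": "Windows Vista", "doc_count": 10172},
            ]
        }
    }
}

windows_only = filter_buckets(response, "software_tags", "windows")
for bucket in windows_only:
    print(bucket["key"], bucket["doc_count"])
```

Because the filtering happens after the top buckets were already selected, the aggregation "size" would probably need to be larger than the number of buckets you finally want, otherwise unrelated tags can crowd out matching ones.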