Applying a filter in an Elasticsearch query

Time: 2018-01-31 09:10:59

Tags: elasticsearch kibana

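I want to apply a filter after an aggregation query. For example, with the aggregation query below, I want to get only those buckets whose key contains "windows".

Note: we do not want to use include, because it takes a regular expression, which is very slow, and it cannot ignore case.

Query: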

GET /record_new/_search
{
  "size": 0,
  "aggs": {
    "software_tags": {
      "terms": {
        "field": "software_tags.keyword",
        "size": 100
      }
    }
  }
}

Response:

{
  "took": 77,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5706542,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "software_tags": {
      "doc_count_error_upper_bound": 5514,
      "sum_other_doc_count": 581800,
      "buckets": [
        {
          "key": "Microsoft Windows",
          "doc_count": 70641
        },
        {
          "key": "Bitcoin",
          "doc_count": 35423
        },
        {
          "key": "Linux",
          "doc_count": 33230
        },
        {
          "key": "ICQ",
          "doc_count": 21934
        },
        {
          "key": "PHP",
          "doc_count": 20562
        },
        {
          "key": "Windows XP",
          "doc_count": 19720
        },
        {
          "key": "Android (operating system)",
          "doc_count": 17774
        },
        {
          "key": "C++",
          "doc_count": 14792
        },
        {
          "key": "Pretty Good Privacy",
          "doc_count": 14307
        },
        {
          "key": "Tor (anonymity network)",
          "doc_count": 14110
        }
      ]
    }
  }
}

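I also tried adding a filter, but the output is not what I expect: it still contains a Linux bucket. I don't understand what is happening here.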

GET /record_new/_search
{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "term": { "software_tags": "windows" }
      }
    }
  },
  "aggs": {
    "software_tags": {
      "terms": {
        "field": "software_tags.keyword",
        "size": 10
      }
    }
  }
}

Output:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 93181,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "software_tags": {
      "doc_count_error_upper_bound": 1640,
      "sum_other_doc_count": 171831,
      "buckets": [
        {
          "key": "Microsoft Windows",
          "doc_count": 70641
        },
        {
          "key": "Windows XP",
          "doc_count": 19720
        },
        {
          "key": "Windows 7",
          "doc_count": 12692
        },
        {
          "key": "Linux",
          "doc_count": 12311
        },
        {
          "key": "Windows Vista",
          "doc_count": 10172
        },
        {
          "key": "Windows NT",
          "doc_count": 5417
        },
        {
          "key": "Windows Registry",
          "doc_count": 5055
        },
        {
          "key": "Windows 8",
          "doc_count": 4829
        },
        {
          "key": "Windows 2000",
          "doc_count": 4738
        },
        {
          "key": "Windows 10",
          "doc_count": 4611
        }
      ]
    }
  }
}
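
One likely explanation for the Linux bucket is that the query only selects documents, while the terms aggregation then counts every tag on each selected document. The sketch below imitates that behaviour in plain Python. It is not Elasticsearch code: the documents, the helper names matches_term and terms_aggregation, and the whitespace tokenization are all assumptions made for illustration, and it presumes software_tags is a multi-valued analyzed field with a keyword sub-field.

```python
from collections import Counter

# Hypothetical documents; each one carries several software tags.
docs = [
    {"software_tags": ["Microsoft Windows", "Linux"]},
    {"software_tags": ["Windows XP"]},
    {"software_tags": ["Linux", "PHP"]},
    {"software_tags": ["Windows 7", "Linux", "Bitcoin"]},
]


def matches_term(doc, term):
    """Roughly mimic a term filter on an analyzed field:
    the document matches if any tag contains the lowercase token."""
    return any(term in tag.lower().split() for tag in doc["software_tags"])


def terms_aggregation(matching_docs):
    """Count each distinct tag once per matching document,
    which is how terms-aggregation doc_count values are built."""
    counts = Counter()
    for doc in matching_docs:
        counts.update(set(doc["software_tags"]))
    return counts


# The filter narrows the documents, not the tag values.
matching = [d for d in docs if matches_term(d, "windows")]
buckets = terms_aggregation(matching)
print(dict(buckets))
```

In this toy data the third document is filtered out, so PHP disappears, but Linux remains because two of the matching documents are tagged with both a Windows value and Linux.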

1 answer:

Answer 0 (score: 0)

Try this query; it should find records whose software_tags contain "windows":

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "software_tags:(*windows* AND NOT *linux* AND NOT *<next OS name to exclude>*)",
            "analyze_wildcard": true
          }
        }
      ]
    }
  },
  "aggs": {
    "software_tags": {
      "terms": {
        "field": "software_tags.keyword",
        "size": 10
      }
    }
  }
}

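It may be somewhat slower than an ordinary query, because the query contains wildcards; leading wildcards such as *windows* are particularly expensive.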
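
If trimming the result on the client side is acceptable, another option is to leave the aggregation as it is and drop the unwanted buckets after the response arrives. The following is a minimal sketch under that assumption, not something taken from the question or the answer: filter_buckets is a hypothetical helper, and the sample buckets are copied from the output in the question.

```python
def filter_buckets(buckets, needle):
    """Keep only the buckets whose key contains `needle`, ignoring case."""
    needle = needle.lower()
    return [b for b in buckets if needle in b["key"].lower()]


# A few buckets copied from the "aggregations" part of the output above.
buckets = [
    {"key": "Microsoft Windows", "doc_count": 70641},
    {"key": "Windows XP", "doc_count": 19720},
    {"key": "Windows 7", "doc_count": 12692},
    {"key": "Linux", "doc_count": 12311},
    {"key": "Windows Vista", "doc_count": 10172},
]

windows_only = filter_buckets(buckets, "windows")
for bucket in windows_only:
    print(bucket["key"], bucket["doc_count"])
```

Note that the terms aggregation only returns the top "size" buckets, so the request has to ask for enough buckets for the ones of interest to be present before they can be filtered this way.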