How to get duplicate field values and their counts in Elasticsearch

Time: 2019-09-19 09:06:41

Tags: elasticsearch elasticsearch-aggregation

I have a school project in which I use the ELK stack.

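I have a lot of data, and for a given log level, server, and time range I want to know which log lines are duplicated and how many duplicates each of those log lines has.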

I tried the following query, which successfully extracted the duplicates and their counts:

GET /_all/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "beat.hostname": "server-x"
          }
        },
        {
          "match": {
            "log_level": "WARNING"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-48h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "duplicateNames": {
      "terms": {
        "field": "message_description.keyword",
        "min_doc_count": 2,
        "size": 10000
      }
    }
  }
}

It gave me the expected output:

"aggregations" : {
  "duplicateNames" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "AuthToken not found [ ]",
        "doc_count" : 657
      }
    ]
  }
}

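When I run the same query with only log_level changed from WARNING to CRITICAL, I get 0 buckets. That is strange, because in Kibana I can see duplicate message_description values. Could this have something to do with .keyword and the length of message_description?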

I hope someone can help me with this strange problem.

Edit: here are two documents with exactly the same message_description. Why don't I get any results?

 {
        "_index" : "filebeat-2019.09.17",
        "_type" : "_doc",
        "_id" : "yYzDP20BiDGBoVteKHjZ",
        "_score" : 10.144365,
        "_source" : {
          "beat" : {
            "name" : "graylog",
            "hostname" : "server-x",
            "version" : "6.8.2"
          },
          "message" : """[2019-09-17 17:06:57] request.CRITICAL: Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444 {"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
          "@version" : "1",
          "source" : "/data/httpd/xxx/xxx/var/log/dev.log",
          "tags" : [
            "beats_input_codec_plain_applied",
            "_grokparsefailure",
            "_dateparsefailure"
          ],
          "timestamp" : "2019-09-17 17:06:57",
          "input" : {
            "type" : "log"
          },
          "offset" : 54819,
          "prospector" : {
            "type" : "log"
          },
          "application" : "request",
          "log_level" : "CRITICAL",
          "stack_trace" : """{"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
          "message_description" : """Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444""",
          "@timestamp" : "2019-09-17T15:06:57.436Z",
          "host" : {
            "name" : "graylog"
          },
          "log" : {
            "file" : {
              "path" : "/data/httpd/xxx/xxx/var/log/dev.log"
            }
          }
        }
      },
      {
        "_index" : "filebeat-2019.09.17",
        "_type" : "_doc",
        "_id" : "CYzDP20BiDGBoVteKHna",
        "_score" : 10.144365,
        "_source" : {
          "beat" : {
            "name" : "graylog",
            "hostname" : "server-x",
            "version" : "6.8.2"
          },
          "message" : """[2019-09-17 17:06:56] request.CRITICAL: Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444 {"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
          "@version" : "1",
          "source" : "/data/httpd/xxx/xxx/var/log/dev.log",
          "tags" : [
            "beats_input_codec_plain_applied",
            "_grokparsefailure",
            "_dateparsefailure"
          ],
          "timestamp" : "2019-09-17 17:06:56",
          "input" : {
            "type" : "log"
          },
          "offset" : 45716,
          "prospector" : {
            "type" : "log"
          },
          "application" : "request",
          "log_level" : "CRITICAL",
          "stack_trace" : """{"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
          "message_description" : """Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444""",
          "@timestamp" : "2019-09-17T15:06:57.426Z",
          "host" : {
            "name" : "graylog"
          },
          "log" : {
            "file" : {
              "path" : "/data/httpd/xxx/xxx/var/log/dev.log"
            }
          }
        }
      }

1 answer:

Answer 0 (score: 1)

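What is happening is that the message_description value is longer than 256 characters, so the keyword sub-field ignores it (that is its ignore_above limit) and the terms aggregation never sees it. Run GET filebeat-2019.09.17 to confirm this in the mapping.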
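
To check this against the data before changing anything, a query along these lines should count the CRITICAL documents that have a message_description but no value in the keyword sub-field. It is only a sketch: it assumes the sub-field is named message_description.keyword as in the question, and it searches filebeat-* instead of _all.

# Sketch only: field and index names are taken from the question
GET filebeat-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "match": { "log_level": "CRITICAL" } },
        { "exists": { "field": "message_description" } }
      ],
      "must_not": [
        { "exists": { "field": "message_description.keyword" } }
      ]
    }
  }
}

If hits.total in the response is greater than 0, those documents are the ones the terms aggregation cannot see.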

What you can do is raise that limit by modifying the field's mapping:

PUT filebeat-*/_doc/_mapping
{
  "properties": {
    "message_description": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 500
        }
      }
    }
  }
}
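
To see whether the change was applied, the field mapping API can be asked for just this sub-field. This is a sketch that assumes the same index pattern and field name as above; the response should show the ignore_above value for each matching index.

# Sketch only: shows the mapping of the keyword sub-field per index
GET filebeat-*/_mapping/field/message_description.keyword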

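Then update all the data already present in those indices, so that the existing documents are reindexed with the new mapping: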

POST filebeat-*/_update_by_query
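
If the indices are large, a narrower variant may be enough. The sketch below only touches documents that have a message_description but nothing in the keyword sub-field, and conflicts=proceed keeps the request from aborting on version conflicts while Filebeat is still writing. The field name is the one from the question; whether the restriction is worth it depends on how many documents are affected.

# Sketch only: reindex just the documents missing the keyword value
POST filebeat-*/_update_by_query?conflicts=proceed
{
  "query": {
    "bool": {
      "filter": [
        { "exists": { "field": "message_description" } }
      ],
      "must_not": [
        { "exists": { "field": "message_description.keyword" } }
      ]
    }
  }
}

Documents whose message_description is still longer than the new limit will keep matching this query, and they will stay out of the aggregation.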

Once that is done, your query will magically work again ;-)