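Creating a Grok filter for a specific date pattern in Logstash

Time: 2018-04-05 19:16:27

Tags: elasticsearch logstash logstash-grok

I am trying to use the ELK stack to store old logs. **This is not a duplicate question. Please read the details below.** I want to parse the timestamp from my message, which looks like this: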

  

Apr 1 04:01:04 i-b73lj53l journal: 152.17.62.1 - - [31/Mar/2017:20:01:04 +0000] "GET /api/people/5913b19b31b0f601004875a5?access_token=rNL7S4A2o5BdbX1QDxbL9L5Vx7j60kGIIhQ1tk9yDYRjUf5e8OKzGGnIDTrMXr5n&filter=%7B%22order%22%3A%22createdAt%20DESC%22%2C%22include%22%3A%5B%7B%22relation%22%3A%22friendships%22%2C%22scope%22%3A%7B%22where%22%3A%7B%22trashedAt%22%3A%7B%22exists%22%3Afalse%7D%7D%2C%22include%22%3A%5B%22 HTTP/1.1" 200 346 "http://api.mywebsite.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4549.400 Mozilla/9.7.12900.400"

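I have been trying for more than 20 hours, and it seems my configuration is not being read at all: even when I add add_field or remove_field, the data does not change.

I have enabled system logs following the Filebeat documentation.

My stdout output is as follows: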

   DEBUG    [publish]   pipeline/processor.go:275   Publish event: 


{
  "@timestamp": "2018-04-05T18:53:08.817Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.2.3"
  },
  "source": "/Users/garry/project/sampel",
  "offset": 3231104,
  "tags": [
    "message-log"
  ],
  "prospector": {
    "type": "log"
  },
  "fields": {
    "env": "dev"
  },
  "beat": {
    "name": "Garry-MacBook-Pro-2.local",
    "hostname": "Garry-MacBook-Pro-2.local",
    "version": "6.2.3"
  },
  "message": "Apr 1 04:01:04 i-b73lj53l journal: 152.17.62.1 - - [31/Mar/2017:20:01:04 +0000] \"GET /api/people/5913b19b31b0f601004875a5?access_token=rNL7S4A2o5BdbX1QDxbL9L5Vx7j60kGIIhQ1tk9yDYRjUf5e8OKzGGnIDTrMXr5n&filter=%7B%22order%22%3A%22createdAt%20DESC%22%2C%22include%22%3A%5B%7B%22relation%22%3A%22friendships%22%2C%22scope%22%3A%7B%22where%22%3A%7B%22trashedAt%22%3A%7B%22exists%22%3Afalse%7D%7D%2C%22include%22%3A%5B%22 HTTP/1.1\" 200 346 \"http://api.mywebsite.com/\" \"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4549.400 Mozilla/9.7.12900.400\""
}

My current configuration is:

filter {
  grok {
   match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z",  "d/MMM/yyyy:HH:mm:ss Z" ]
  }
}

1 Answer:

Answer 0: (score: 0)

I'm not sure what your grok filter is doing, but your logs are in syslog format, so you can simply use %{SYSLOGLINE} to create the filter.
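
As a rough sketch of that suggestion, a grok filter built on %{SYSLOGLINE} could look like the following. The overwrite setting is an addition that the answer does not mention; it is included on the assumption that the remainder of the line should replace the original message field rather than be appended to it. The exact names of the captured fields depend on the pattern set version and ECS compatibility mode in use.

```
filter {
  grok {
    # SYSLOGLINE is a stock pattern: syslog header, then the rest of the line captured as "message"
    match => { "message" => "%{SYSLOGLINE}" }
    # Assumption: replace the original message with the captured remainder
    overwrite => [ "message" ]
  }
}
```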

Then you can parse [31/Mar/2017:20:01:04 +0000], which is stored in the message field, as follows:

\[%{MONTHDAY:monthday}/%{MONTH:month}/%{YEAR:year}:%{TIME:time}.%{ISO8601_TIMEZONE:Timezone}\]

This will produce the following output:

{
  "monthday": [
    "31"
  ],
  "month": [
    "Mar"
  ],
  "year": [
    "2017"
  ],
  "time": [
    "20:01:04"
  ],
  "HOUR": [
    "20",
    "00"
  ],
  "MINUTE": [
    "01",
    "00"
  ],
  "SECOND": [
    "04"
  ],
  "Timezone": [
    "+0000"
  ]
}
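
To turn those pieces into the event's @timestamp, one option is to capture the bracketed date as a single field and pass it to the date filter. The sketch below is a variant of the pattern above, not the answer's own method: it uses the stock HTTPDATE grok pattern instead of the separate day, month, and year captures, and request_timestamp is a hypothetical field name chosen for illustration. The date filter can only parse a field that grok has actually created, so the two names must agree.

```
filter {
  grok {
    # Capture e.g. 31/Mar/2017:20:01:04 +0000 into one field (hypothetical name)
    match => { "message" => "\[%{HTTPDATE:request_timestamp}\]" }
  }
  date {
    # Parse the captured value; by default the result is written to @timestamp
    match => [ "request_timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
```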