Custom parsing with logstash and elasticsearch

Time: 2014-04-28 14:32:46

Tags: elasticsearch logstash

I'm new to logstash! I have it set up and everything works so far. My log file looks like this:

2014-04-27 16:24:43 DEBUG b45e66 T+561   10.31.166.155   /v1/XXX<!session> XXX requested for category_ids: only_pro: XXX_ids:14525

If I use the following config file:

input {
  file { path => "/logs/*_log" }
}
output {
  elasticsearch { host => localhost }
}

It puts the following into ES:

{
  _index: "logstash-2014.04.28",
  _type: "logs",
  _id: "WIoUbIvCQOqnz4tMZzMohg",
  _score: 1,
  _source: {
    message: "2014-04-27 16:24:43 DEBUG b45e66 T+561 10.31.166.155 This is my log !",
    @version: "1",
    @timestamp: "2014-04-28T14:25:52.165Z",
    host: "MYCOMPUTER",
    path: "\logs\xxx_app.log"
   }
  }

How can I parse the string in the log so that the whole text does not end up in _source.message?
For example, I would like it to be parsed into:

 {
  _index: "logstash-2014.04.28",
  _type: "logs",
  _id: "WIoUbIvCQOqnz4tMZzMohg",
  _score: 1,
  _source: {
    logLevel: "DEBUG",
    messageId: "b45e66",
    sendFrom: "10.31.166.155",
    logTimestamp: "2014-04-27 16:24:43",
    message: "This is my log !",
    @version: "1",
    @timestamp: "2014-04-28T14:25:52.165Z",
    host: "MYCOMPUTER",
    path: "\logs\xxx_app.log"
   }
  }

1 answer:

Answer 0: (score: 1)

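You need to parse it with a filter, for example the grok filter. This can be a bit tricky, so be patient and keep trying. Also take a look at the predefined patterns.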

The beginning of your message would be

%{DATESTAMP} %{WORD:logLevel} %{WORD:messageId} %{GREEDYDATA:someString} %{IP}
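
Extending that start to the whole line, a filter along the following lines might produce the fields asked for in the question. This is an untested sketch under a few assumptions: the field name `elapsed` for the `T+561` token is made up here, `%{SPACE}` assumes the gaps around the IP address are plain whitespace, and `TIMESTAMP_ISO8601` is suggested because `DATESTAMP` is built around US/EU-style dates rather than the year-first form in the sample line.

```
filter {
  grok {
    # One named capture per field wanted in _source
    match => { "message" => "%{TIMESTAMP_ISO8601:logTimestamp} %{LOGLEVEL:logLevel} %{WORD:messageId} %{NOTSPACE:elapsed}%{SPACE}%{IP:sendFrom}%{SPACE}%{GREEDYDATA:message}" }
    # Replace the original message with the captured remainder instead of appending to it
    overwrite => [ "message" ]
  }
}
```

Without `overwrite`, capturing into the existing `message` field would keep the original text alongside the new value, so the option is what leaves only the trailing text in `message`.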

The grokdebugger is a very useful tool that can help you.

Once you are done, your config should look something like this:

input {
   stdin {}
}
filter {
  grok {
    match => { "message" => "%{DATESTAMP} %{WORD:logLevel} %{WORD:messageId} %{GREEDYDATA:someString} %{IP}" }
  }
}
output {
  elasticsearch { host => localhost }
}
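
While the pattern is still being worked out, it may be easier to print events to the console than to inspect them in Elasticsearch. A minimal sketch, assuming a Logstash version that supports output codecs and conditionals; paste a log line on stdin and check what comes out:

```
input {
  stdin {}
}
filter {
  grok {
    match => { "message" => "%{DATESTAMP} %{WORD:logLevel} %{WORD:messageId} %{GREEDYDATA:someString} %{IP}" }
  }
}
output {
  # Pretty-print every event, including its tags and parsed fields
  stdout { codec => rubydebug }
  # Only index events that grok managed to parse
  if "_grokparsefailure" not in [tags] {
    elasticsearch { host => localhost }
  }
}
```

Events that do not match the pattern are tagged `_grokparsefailure` by the grok filter, so the conditional keeps unparsed lines out of the index while they remain visible on the console.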