How can I get numbers out of a log message in logstash?

Time: 2014-02-19 11:44:15

Tags: syslog logstash kibana

I am very new to logstash. I can run the logstash jar file and view the kibana web page. Very cool~~

Now, I want to transform the following line (a syslog message) into the line below it.

Feb 19 18:45:29 SD550 Jack: REG,0x1000,4,10,20,30,40
==>
{ 'timestamp': 'Feb 19 18:45:29', 
  'host': 'SD550', 0x1000:10, 0x1001:20, 0x1002:30, 0x1003:40 }

In the log message, '0x1000' is the starting register address and '4' is the number of register values; the numbers after that are just the values. So it means 0x1000:10, 0x1001:20, 0x1002:30, 0x1003:40. The important point is that the number of register values can change, so the length of the log message varies. Whatever the length, I want to get the proper result (for example, 0x2000,2,12,22 ==> 0x2000:12, 0x2001:22).
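To make the rule concrete, here is a tiny Ruby sketch (illustration only, not logstash configuration) of the expansion I mean:

start  = 0x1000               # starting register address
values = [10, 20, 30, 40]     # the register values that follow the count
registers = values.each_with_index.map { |v, i| [start + i, v] }.to_h
# => {4096=>10, 4097=>20, 4098=>30, 4099=>40}   (i.e. 0x1000..0x1003)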

Here is my incomplete logstash configuration file. I found some filters such as grok, mutate, and extractnumbers, but I don't know how to do what I want.

input { 
  file { 
        path => "/var/log/syslog"
        type => "syslog"
  } 
}

filter {
   ???
}

output {
  elasticsearch { }
}

I know I am asking for a lot; sorry. Also, my final goal is to draw a TIME (x) / VALUE (y) chart in kibana for a specific register. Is that possible? Could you give me some advice?

Thank you, Youngmin Kim

4 Answers:

Answer 0 (score: 0)

You will want to use grok to match the various fields; there are many built-in grok patterns that will help you with this. %{SYSLOGBASE} will get you the timestamp and host, and then the rest can likely be captured with patterns such as %{NUMBER} along with the other patterns found at https://github.com/logstash/logstash/blob/v1.3.3/patterns/grok-patterns.

Because of your variable log length, your pattern may get a bit complex, but I think you can just match all the numbers, store them in an array, and then map them onto register values in a mutate. A rough sketch follows.
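For instance, something along these lines (an untested sketch; the field names address, count and values are my own invention):

filter {
  grok {
    # %{SYSLOGBASE} captures the timestamp, host and program; GREEDYDATA
    # keeps the variable-length tail of values for a later filter to split
    match => ["message", "%{SYSLOGBASE} REG,%{WORD:address},%{NUMBER:count},%{GREEDYDATA:values}"]
  }
  mutate {
    # turn the comma-separated tail into an array of number strings
    split => ["values", ","]
  }
}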

As far as generating charts in kibana goes, that will not be very difficult once the data is formatted correctly. There is a built-in time-series chart type that is easy to populate.

Answer 1 (score: 0)

Feb 19 18:45:29 SD550 Jack: REG,0x1000,4,10,20,30,40

If you use the following configuration file on the data above and open kibana, it works fine; it splits the fields into different categories that you can search on. I am pretty new to all of this too, but this is what I would do. The screenshot below is a simple time pie chart I made by entering eight lines like the one above with different time and address values.

input {
  tcp { 
    type => "test"
    port => 3333
  } 
}


filter {
  grok {
    # %{MONTHDAY} matches the numeric day (%{DAY} only matches weekday
    # names), and literal commas are needed between the %{NUMBER} captures
    match => ["message", "%{MONTH:month} %{MONTHDAY:day} %{TIME:time} %{WORD:sd550} %{WORD:name}: %{WORD:asmThing},%{WORD:address},%{NUMBER:firstno},%{NUMBER:2no},%{NUMBER:3no},%{NUMBER:4no},%{NUMBER:5no}"]
  }
}
output {
  elasticsearch {
    # Setting 'embedded' will run  a real elasticsearch server inside logstash.
    # This option below saves you from having to run a separate process just
    # for ElasticSearch, so you can get started quicker!
    embedded => true
  }
}

[Screenshot: Test Kibana time pie chart]
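To feed test lines into the tcp input above (assuming logstash is listening locally on port 3333), netcat works:

echo "Feb 19 18:45:29 SD550 Jack: REG,0x1000,4,10,20,30,40" | nc localhost 3333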

Answer 2 (score: 0)

I have an idea. To handle the variable log length with multiple register address:value pairs, you can use the grok filter to parse the message first, and then use the csv filter to separate out each register value.

Filter:

filter {
    grok {
            match => ["message", "%{MONTH:month} %{NUMBER:day} %{TIME:time} %{WORD:host} %{WORD:user}: %{WORD:unit},%{WORD:address},%{NUMBER:regNumber},%{GREEDYDATA:regValue}"]
            add_field => ["logdate","%{month} %{day} %{time}"]
            remove_field => ["month","day", "time"]
    }

    csv {
            source => "regValue"
            remove_field => ["regValue"]
    }
}

Output:

{
   "message" => "Feb 19 18:45:29 SD550 Jack: REG,0x1000,4,10,20,30,40",
  "@version" => "1",
"@timestamp" => "2014-02-20T02:05:53.608Z",
      "host" => "SD550"
      "user" => "Jack",
      "unit" => "REG",
   "address" => "0x1000",
 "regNumber" => "4",
   "logdate" => "Feb 19 18:45:29",
   "column1" => "10",
   "column2" => "20",
   "column3" => "30",
   "column4" => "40"
}

However, the address field names are assigned by the csv filter (you cannot give the field names via the csv filter's columns option here, because the number of fields is variable). If you want to meet your requirement, you need to modify the csv filter; an alternative using stock filters is sketched below.
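As a lighter alternative (my own untested sketch, using the stock ruby filter and the logstash 1.x event API instead of a patched csv filter), the grokked fields could be expanded in place:

filter {
  ruby {
    # pair each value in regValue with consecutive addresses starting at
    # 'address'; the address is parsed as hex, the values as decimal
    code => "
      start = event['address'].hex
      event['regValue'].split(',').each_with_index do |v, i|
        event[(start + i).to_s] = v.to_i
      end
    "
  }
}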

Answer 3 (score: 0)

Thank you, everyone, for answering my question, especially Ben Lim.

With your help, I got this result:

{
      "@version" => "1",
    "@timestamp" => "2014-02-20T11:07:28.125Z",
          "type" => "syslog",
          "host" => "ymkim-SD550",
          "path" => "/var/log/syslog",
            "ts" => "Feb 20 21:07:27",
          "user" => "ymkim",
          "func" => "REG",
          "8192" => 16,
          "8193" => 32,
          "8194" => 17,
          "8195" => 109
}

from $ logger REG,2000,4,10,20,11,6d (note that the filter below parses both the address and the values as hex, so "2000" becomes 8192 and "10" becomes 16).

Here is my configuration file.

input { 
  file { 
        path => "/var/log/syslog"
        type => "syslog"
  } 
}

filter {
  grok {
        match => ["message", "%{SYSLOGTIMESTAMP:ts} %{SYSLOGHOST:hostname} %{WORD:user}: %{WORD:func},%{WORD:address},%{NUMBER:regNumber},%{GREEDYDATA:regValue}"]
  }

  if [func] == "REG" {  
      modbus_csv {
          start_address => "address"
          num_register => "regNumber"
          source => "regValue"
          remove_field => ["regValue", "hostname", "message", 
                "address", "regNumber"]
      }
  }

}

output {
    stdout { debug => true }
  elasticsearch { }
}

And here is the modified csv filter, named modbus_csv.rb.

# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"

require "csv"

# CSV filter. Takes an event field containing CSV data, parses it,
# and stores it as individual fields (can optionally specify the names).
class LogStash::Filters::MODBUS_CSV < LogStash::Filters::Base
  config_name "modbus_csv"
  milestone 2

  # The CSV data in the value of the source field will be expanded into a
  # datastructure.
  config :source, :validate => :string, :default => "message"

  # Define a list of column names (in the order they appear in the CSV,
  # as if it were a header line). If this is not specified or there
  # are not enough columns specified, the default column name is "columnX"
  # (where X is the field number, starting from 1).
  config :columns, :validate => :array, :default => []
  config :start_address, :validate => :string, :default => "0"
  config :num_register, :validate => :string, :default => "0"

  # Define the column separator value. If this is not specified the default
  # is a comma ','.
  # Optional.
  config :separator, :validate => :string, :default => ","

  # Define the character used to quote CSV fields. If this is not specified
  # the default is a double quote '"'.
  # Optional.
  config :quote_char, :validate => :string, :default => '"'

  # Define target for placing the data.
  # Defaults to writing to the root of the event.
  config :target, :validate => :string

  public
  def register

    # Nothing to do here

  end # def register

  public
  def filter(event)
    return unless filter?(event)

    @logger.debug("Running modbus_csv filter", :event => event)

    matches = 0

    @logger.debug(event[@num_register].hex)
    # build one column name per register, starting at the start address
    # (exclusive range: a count of 4 should yield exactly 4 columns)
    for i in 0...(event[@num_register].hex)
        @columns[i] = event[@start_address].hex + i
    end
    if event[@source]
      if event[@source].is_a?(String)
        event[@source] = [event[@source]]
      end

      if event[@source].length > 1
        @logger.warn("modbus_csv filter only works on fields of length 1",
                     :source => @source, :value => event[@source],
                     :event => event)
        return
      end

      raw = event[@source].first
      begin
        values = CSV.parse_line(raw, :col_sep => @separator, :quote_char => @quote_char)

        if @target.nil?
          # Default is to write to the root of the event.
          dest = event
        else
          dest = event[@target] ||= {}
        end

        values.each_index do |i|
          field_name = (@columns[i] || "column#{i+1}").to_s   # default name only when no column was built
          dest[field_name] = values[i].hex
        end

        filter_matched(event)
      rescue => e
        event.tag "_modbus_csvparsefailure"
        @logger.warn("Trouble parsing modbus_csv", :source => @source, :raw => raw,
                      :exception => e)
        return
      end # begin
    end # if event

    @logger.debug("Event after modbus_csv filter", :event => event)

  end # def filter

end # class LogStash::Filters::MODBUS_CSV
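(Note: on logstash 1.x, a custom filter like this can be loaded by saving it as <pluginpath>/logstash/filters/modbus_csv.rb and starting the agent with --pluginpath <pluginpath>.)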

Finally, I got the chart I wanted. (* func = REG (13) 4096 mean per 10m | (13 hits))