Regex configuration for dynamic columns in Logstash

Time: 2014-11-27 07:23:15

Tags: regex logstash logstash-grok

I have a log file; two of its lines are pasted below:

  

Nov 26 14:20:32 172.16.0.1 date=2014-11-26 time=14:18:37 devname=XXXXCCCFFFFF devid=XXXCCVVGFFDD logid=3454363464 type=traffic subtype=forward level=notice vd=root srcip=172.16.1.251 srcport=62032 srcintf="Combo_LAN" dstip=X.X.X.X dstport=X dstintf="wan2" sessionid=16172588 status=close user="X.X" group="Open Group" policyid=2 dstcountry="United States" srccountry="Reserved" trandisp=snat transip=X.X.X.X transport=X service=HTTP proto=6 applist="Block_Applications" duration=11 sentbyte=2377 rcvdbyte=784 sentpkt=6 rcvdpkt=7 identidx=5 utmaction=passthrough utmevent=webfilter utmsubtype=ftgd-cat urlcnt=1 hostname="tacoda.at.atwola.com" catdesc="Advertising"

     

Nov 26 14:20:32 172.16.0.1 date=2014-11-26 time=14:18:37 devname=XXXXCCCFFFFF devid=XXXCCVVGFFDD logid=3454363464 type=utm subtype=webfilter eventtype=ftgd_allow level=notice vd="root" policyid=2 identidx=5 sessionid=15536743 user="X.X" srcip=X.X.X.X srcport=X srcintf="Combo_LAN" dstip=X.X.X.X dstport=80 dstintf="wan2" service="HTTP" hostname="streaming.sbismart.com" profiletype="Webfilter_Profile" profile="Open Group_Policy" status="passthrough" reqtype="direct" url="/diffusion/" sentbyte=984 rcvdbyte=202 msg="URL belongs to an allowed category in policy" method=domain class=0 cat=18 catdesc="Brokerage and Trading"

I can parse the data when the number and order of the columns are fixed.

But how do I parse dynamic columns in the config file so that I do not get a _grokparsefailure?

2 answers:

Answer 0: (score: 1)

The ruby filter plugin can help you here.

Here is the configuration:

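input {
    stdin{
    }
}

filter {
    ruby {
        # Uses the Logstash 1.x/2.x event API (event["field"]).
        # Logstash 5.0 and later use event.get("field") and event.set("field", value).
        code => '
            msg = event["message"]
            # The key=value pairs begin at "date="; skip the syslog prefix before it.
            msgIndex = msg.index("date=")
            # Leave the event untouched when "date=" is absent.
            unless msgIndex.nil?
                msgInsert = msg[msgIndex..-1]
                # Keep [key, value] from each match; quoted values keep their quotes.
                msgMap = msgInsert.scan(/(\w+)=("(.*?)"|([^ ]+))/).map { |(first, second)| [first, second] }
                for x in msgMap
                    key = x[0]
                    value = x[1]
                    event[key] = value
                end
            end
        '
    }
}

output {
    stdout{
        codec => rubydebug
    }
}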
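
  1. First, find where the key=value pairs start by taking the index of date=.
  2. Then map all the key/value pairs into an array of strings.
  3. Use a for loop to set every value on the event.

I tried this with your logs and it created a field with the corresponding value for every pair. Hope this helps.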
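
To try the extraction outside Logstash, the same regular expression can be run in plain Ruby. This is only a sketch: `parse_pairs` is a hypothetical helper name, the sample line is shortened from the logs in the question, and a Hash stands in for the Logstash event.

```ruby
# parse_pairs is a hypothetical helper; it mirrors the filter logic above
# but collects the fields in a Hash instead of a Logstash event.
def parse_pairs(line)
  # Same pattern as the filter: a word key, "=", then a quoted or bare value.
  pair = /(\w+)=("(.*?)"|([^ ]+))/
  start = line.index('date=')
  return {} if start.nil? # no key=value section in this line

  fields = {}
  line[start..-1].scan(pair).each do |key, raw, _quoted, _bare|
    fields[key] = raw # raw keeps the surrounding quotes, as in the filter
  end
  fields
end

# Shortened sample taken from the first log line in the question.
line = 'Nov 26 14:20:32 172.16.0.1 date=2014-11-26 time=14:18:37 ' \
       'type=traffic srcintf="Combo_LAN" group="Open Group" policyid=2'

fields = parse_pairs(line)
fields.each { |key, value| puts "#{key} => #{value}" }
```

Because the pattern never depends on which keys appear or in what order, both log lines from the question go through the same code path.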

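Answer 1: (score: 1)

The simple answer to avoiding _grokparsefailure is to provide a valid pattern that matches your input. That said, your question seems to imply that the fields are not always specified in this order. Given your example, you should use the "kv" filter to split these key/value pairs into fields.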
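
A minimal sketch of such a kv filter follows. It is an assumption about how the suggestion could be applied, not a configuration tested against these logs. The three options shown are spelled out only for clarity; they match the plugin defaults.

```
filter {
    kv {
        # Field that holds the raw log line.
        source => "message"
        # Pairs are separated by spaces, keys and values by "=".
        field_split => " "
        value_split => "="
    }
}
```

Tokens without an "=" sign, such as the syslog timestamp and host at the start of each line, should simply be skipped, so no fixed column order is needed.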