Question

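I am creating a logstash grok filter to pull events out of a backup server, and I want to be able to test a field for a pattern and, if it matches, further process that field and pull out additional information.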

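To that end I have embedded an if statement within the grok statement itself. This causes the config test to fail with Error: Expected one of #, => immediately after the if.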

Here is the filter declaration:

filter {
    grok {
        patterns_dir => "./patterns"
        # NetWorker logfiles have some unusual fields that include undocumented engineering codes and what not
        # time is in 12h format (ugh) so custom patterns need to be used.
        match => [ "message", "%{NUMBER:engcode1} %{DATESTAMP_12H:timestamp}  %{NUMBER:engcode2} %{NUMBER:engcode3} %{NUMBER:engcode4} %{NUMBER:ppid} %{NUMBER:pid} %{NUMBER:engcode5} %{WORD:processhost} %{WORD:processname} %{GREEDYDATA:daemon_message}" ]
        # attempt to find completed savesets and pull that info from the daemon_message field
        if [daemon_message] =~ /done\ saving\ to\ pool/  { 
            grok {
                match => [ "daemon_message", "%{WORD:savehost}\:%{WORD:saveset} done saving to pool \'%{WORD:pool}\' \(%{WORD:volume}\) %{WORD:saveset_size}" ]
            }
        }
    }
    date {
        # This is required to set the time from the logline to the timestamp and not have it create its own.
        # Note the use of the trailing 'a' to denote AM or PM. 
        match => ["timestamp", "MM/dd/yyyy HH:mm:ss a"]
    } 
}

This block fails with the following:

$ /opt/logstash/bin/logstash -f ./networker_daemonlog.conf --configtest
Error: Expected one of #, => at line 12, column 12 (byte 929) after # Basic dumb simple networker daemon log grok filter for the NetWorker daemon.log 
# no smarts to this and not really pulling any useful info from the files (yet)
filter {
    grok {
... lines deleted ...
        # attempt to find completed savesets and pull that info from the daemon_message field
        if

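I'm new to logstash, and I realize that using a conditional within the grok statement may not be possible, but I'd prefer to do conditional processing this way rather than with additional match lines, since this leaves the daemon_message field intact for other uses while pulling out the data I want.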

ETA: I should also point out that removing the if statement entirely allows the configtest to pass and the filter to parse logs.

Thanks in advance...

Answer 1

Conditionals go outside the filters, so something like:

if [field] == "value" {
     grok {
          ...
     }
}

would be correct. In your case, do the first grok, then test whether to run the second, i.e.:

grok {
    match => [ "message", "%{NUMBER:engcode1} %{DATESTAMP_12H:timestamp}  %{NUMBER:engcode2} %{NUMBER:engcode3} %{NUMBER:engcode4} %{NUMBER:ppid} %{NUMBER:pid} %{NUMBER:engcode5} %{WORD:processhost} %{WORD:processname} %{GREEDYDATA:daemon_message}" ]
}
if [daemon_message] =~ /done\ saving\ to\ pool/  {
    grok {
        match => [ "daemon_message", "%{WORD:savehost}\:%{WORD:saveset} done saving to pool \'%{WORD:pool}\' \(%{WORD:volume}\) %{WORD:saveset_size}" ]
    }  
}

This actually runs two regexps for the matching records. Since grok only creates fields when the regexp matches, you can instead do this:

grok {
    match => [ "message", "%{NUMBER:engcode1} %{DATESTAMP_12H:timestamp}  %{NUMBER:engcode2} %{NUMBER:engcode3} %{NUMBER:engcode4} %{NUMBER:ppid} %{NUMBER:pid} %{NUMBER:engcode5} %{WORD:processhost} %{WORD:processname} %{GREEDYDATA:daemon_message}" ]
}
grok {
    match => [ "daemon_message", "%{WORD:savehost}\:%{WORD:saveset} done saving to pool \'%{WORD:pool}\' \(%{WORD:volume}\) %{WORD:saveset_size}" ]
}

You will have to measure the performance against your actual log files, since this runs fewer regexps, but the second one is more complex.
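
One caveat with the unconditional version: by default grok tags an event with _grokparsefailure when its patterns do not match, so every line that is not a "done saving to pool" message would pick up that tag from the second grok. A minimal sketch of one way around this, assuming you do not rely on that tag for the second pass, is to clear tag_on_failure on the second filter only:

grok {
    # second pass: only completed-saveset lines are expected to match
    match => [ "daemon_message", "%{WORD:savehost}\:%{WORD:saveset} done saving to pool \'%{WORD:pool}\' \(%{WORD:volume}\) %{WORD:saveset_size}" ]
    # assumption: a miss here is normal, so do not tag the event
    tag_on_failure => []
}

The first grok keeps its default behavior, so lines that fail the main pattern are still tagged.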

If you really want to go nuts, you could do it all in one grok {} by using the break_on_match feature.
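
As a rough, untested sketch of that single-grok approach: list both patterns against message and set break_on_match => false so grok keeps trying patterns after the first one matches. This assumes your Logstash version accepts the hash form of match with a list of patterns, and it relies on grok patterns being unanchored, so the saveset pattern can match in the middle of the line:

grok {
    patterns_dir => "./patterns"
    # try every pattern instead of stopping at the first match
    break_on_match => false
    match => {
        "message" => [
            "%{NUMBER:engcode1} %{DATESTAMP_12H:timestamp}  %{NUMBER:engcode2} %{NUMBER:engcode3} %{NUMBER:engcode4} %{NUMBER:ppid} %{NUMBER:pid} %{NUMBER:engcode5} %{WORD:processhost} %{WORD:processname} %{GREEDYDATA:daemon_message}",
            "%{WORD:savehost}\:%{WORD:saveset} done saving to pool \'%{WORD:pool}\' \(%{WORD:volume}\) %{WORD:saveset_size}"
        ]
    }
}

Lines without a completed saveset should still get the fields from the first pattern. Check how your grok version tags such partially matched events before relying on _grokparsefailure.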

Using a logstash if statement within a grok statement
