Persistent AWK program

Date: 2012-11-08 02:50:51

Tags: bash awk log4j

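I have been tasked with writing a BASH script that filters log4j files and pipes them over netcat to another host. One requirement is that the script must keep track of what it has already sent to the server and not send it again, because of licensing restrictions on the receiving server (the product on that server is licensed on a data-per-day model).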

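To do the filtering I am using AWK wrapped in a BASH script. The BASH part works fine; it is the AWK program that gives me grief when I try to make it remember what has already been sent to the server. I do this by grabbing the timestamp of a line every time a line matches my pattern. At the end of the program, the last timestamp is written to a hidden file in the current working directory. On subsequent runs, AWK reads this file into a variable. From then on, whenever a line matches the pattern, its timestamp is also compared with the one in the variable. If it is newer, the line is printed; otherwise it is not.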
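The comparison described above relies on log4j timestamps of the form `YYYY-MM-DD HH:MM:SS,mmm` being fixed-width, so ordering them as strings gives the same result as ordering them chronologically. A minimal sketch of that idea follows; the stored timestamp and the three log lines are made-up sample data, not taken from the real logs.

```shell
#!/bin/bash
# Hypothetical stored timestamp (what the hidden file would contain).
last_sent='2012-11-07 09:57:12,479'

# Hypothetical log lines: one already sent, two newer.
new_lines=$(printf '%s\n' \
    'INFO  2012-11-07 09:57:12,479 [main] already sent' \
    'INFO  2012-11-07 09:57:13,001 [main] newer by half a second' \
    'ERROR 2012-11-08 00:00:00,000 [main] newer by a day' |
awk -v last="$last_sent" '
{
    stamp = $2 " " $3      # date is field 2, time is field 3
    if (stamp > last)      # plain string comparison
        print
}')

printf '%s\n' "$new_lines"
```

Only the two newer lines are printed. Note that this works only when fields 2 and 3 really hold a date and a time; for any other kind of line the same expression still produces a string, just not a meaningful one.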

Desired output:

  

INFO 2012-11-07 09:57:12,479 [[artifactid].connector.http.mule.default.receiver.02] org.mule.api.processor.LoggerMessageProcessor: MsgID=5017f1ff-1dfa-48c7-a03c-ed3c29050d12 InteractionStatus=Accept InteractionDateTime=2012-08-07T16:57:33.379+12:00 Retailer=CTCT RequestType=RemoteReconnect

Hidden file:

  

2012-10-11 12:08:19,918

That is the theory; now for my problem.

The script works on contrived/trivial examples such as:

  

INFO 2012-11-07 09:57:12,479 [[artifactid].connector.http.mule.default.receiver.02] org.mule.api.processor.LoggerMessageProcessor: MsgID=5017f1ff-1dfa-48c7-a03c-ed3c29050d12 InteractionStatus=Accept InteractionDateTime=2012-08-07T16:57:33.379+12:00 Retailer=CTCT RequestType=RemoteReconnect

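However, if I run it against a full-blown log file containing stack traces and the like, the indentation levels seem to wreak havoc on my program. The first run of the program produces the desired result: the matching lines are printed and the latest timestamp is written to the hidden file. Running it again is when the problem shows up. The program's output contains indented lines from stack traces and so on (see the block below), and I cannot figure out why. This corrupts the hidden file, because the last matching line does not contain a timestamp, so some garbage gets written to it, which makes any further runs pointless.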

Undesired output:

  

    at package.reverse.domain.SomeClass.someMethod(SomeClass.java:233)
    at package.reverse.domain.processor.SomeClass.process(SomeClass.java:129)
    at package.reverse.domain.processor.someClass.someMethod(SomeClassjava:233)
    at package.reverse.domain.processor.SomeClass.process(SomeClass.java:129)

Hidden file:

  

package.reverse.domain.process(SomeClass.java:129)

My awk program:

FNR == 1 {
    CMD = "basename " FILENAME
    CMD | getline FILE;
    FILE = "." FILE ".last";
    if (system("[ -f "FILE" ]") == 0) {
        getline FIRSTLINE < FILE;
        close(FILE);
        print FIRSTLINE;
    }
    else {
        FIRSTLINE = "1970-01-01 00:00:00,000";
    }
 }
$0 ~ EXPRESSION {
    if (($2 " " $3) > FIRSTLINE) {
        print $0;
        LASTLINE=$2 " " $3;
    }
}
END {
    if (LASTLINE != "") {
        print LASTLINE > FILE;
    }
}

Any help in understanding why this happens would be much appreciated.

Update:

BASH script:

#!/bin/bash
while getopts i:r:e:h:p: option
do
    case "${option}"
    in
        i) INPUT=${OPTARG};;
        r) RULES=${OPTARG};;
        e) PATFILE=${OPTARG};;
        h) HOST=${OPTARG};;
        p) PORT=${OPTARG};;
        ?) printf "Usage: %s: -i <\"file1.log file2.log\"> -r <\"rules1.awk rules2.awk\"> -e <\"patterns.pat\"> -h <host> -p <port>\n" $0;
           exit 1;
    esac
done

#prepare expression with sed
EXPRESSION=`cat $PATFILE | sed ':a;N;$!ba;s/\n/|/g'`;
EXPRESSION="^(INFO|DEBUG|WARNING|ERROR|FATAL)[[:space:]]{2}[[:digit:]]{4}\\\\-[[:digit:]]{1,2}\\\\-[[:digit:]]{1,2}[[:space:]][[:digit:]]{1,2}:[[:digit:]]{2}:[[:digit:]]{2},[[:digit:]]{3}.*"$EXPRESSION".*";

#Make sure the temp file is empty
echo "" > .temp;

#input through awk.
for file in $INPUT
do
    awk -v EXPRESSION="$EXPRESSION" -f $RULES $file >> .temp;
done

#send contents of file to splunk indexer over udp
cat .temp;
#cat .temp | netcat -t $HOST $PORT;

#cleanup temporary files
if [ -f .temp ]
then
    rm .temp;
fi

Pattern file (the things I want to match):

Warning
Exception
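The sed command in the script above (`:a;N;$!ba;s/\n/|/g`) joins all lines of the pattern file with `|`, so these two patterns become a single alternation. As a rough illustration of the resulting string, here is a sketch that uses `paste` as a substitute for the sed one-liner; the temporary file is hypothetical and only stands in for patterns.pat.

```shell
#!/bin/bash
# Hypothetical temporary pattern file holding the two patterns shown above.
patfile=$(mktemp)
printf '%s\n' 'Warning' 'Exception' > "$patfile"

# paste -s joins all lines of the file into one line,
# using the character given to -d as the separator.
alternation=$(paste -s -d '|' "$patfile")
echo "$alternation"

rm -f "$patfile"
```

This prints `Warning|Exception`, which is the string the script then splices, without surrounding parentheses, into the larger expression.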

The awk script is as shown above.

Example.log

info  2012-09-04 16:00:11,638 [[adr-com-adaptor-stub].connector.http.mule.default.receiver.02] nz.co.amsco.interop.multidriveinterop: session not initialised
error 2012-09-04 16:00:11,639 [[adr-com-adaptor-stub].connector.http.mule.default.receiver.02] nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor: nz.co.amsco.interop.exceptions.systemdownexception
nz.co.amsco.interop.exceptions.systemdownexception
    at nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor.getdeviceconfig(comadaptorprocessor.java:233)
    at nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor.process(comadaptorprocessor.java:129)
    at org.mule.processor.chain.defaultmessageprocessorchain.doprocess(defaultmessageprocessorchain.java:99)
    at org.mule.processor.chain.abstractmessageprocessorchain.process(abstractmessageprocessorchain.java:66)
    at org.mule.processor.abstractinterceptingmessageprocessorbase.processnext(abstractinterceptingmessageprocessorbase.java:105)
    at org.mule.processor.asyncinterceptingmessageprocessor.process(asyncinterceptingmessageprocessor.java:90)
    at org.mule.processor.chain.defaultmessageprocessorchain.doprocess(defaultmessageprocessorchain.java:99)
    at org.mule.processor.chain.abstractmessageprocessorchain.process(abstractmessageprocessorchain.java:66)
    at org.mule.processor.AbstractInterceptingMessageProcessorBase.processNext(AbstractInterceptingMessageProcessorBase.java:105)
    at org.mule.interceptor.AbstractEnvelopeInterceptor.process(AbstractEnvelopeInterceptor.java:55)
    at org.mule.processor.AbstractInterceptingMessageProcessorBase.processNext(AbstractInterceptingMessageProcessorBase.java:105)

Usage:

./filter.sh -i "Example.log" -r "rules.awk" -e "patterns.pat" -h host -p port

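Note that neither the host nor the port is used in this version, since the output is simply dumped to stdout.

So if I run this, I get the following output: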

  

info  2012-09-04 16:00:11,638 [[adr-com-adaptor-stub].connector.http.mule.default.receiver.02] nz.co.amsco.interop.multidriveinterop: session not initialised
error 2012-09-04 16:00:11,639 [[adr-com-adaptor-stub].connector.http.mule.default.receiver.02] nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor: nz.co.amsco.interop.exceptions.systemdownexception
    at nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor.getdeviceconfig(comadaptorprocessor.java:233)
    at nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor.process(comadaptorprocessor.java:129)

If I run it again on the same unchanged file, I should get no output, but instead I see:

  

nz.co.amsco.adrcomadaptor.processor.comadaptorprocessor.process(comadaptorprocessor.java:129)

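I cannot work out why this is happening.

1 answer:

Answer 0 (score: 1)

You have not provided any sample input that reproduces your problem, so let's start by cleaning up the script and go from there. Change it to: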

BEGIN{
  expression = "^(INFO|DEBUG|WARNING|ERROR|FATAL)[[:space:]]{2}[[:digit:]]{4}-[[:digit:]]{1,2}-[[:digit:]]{1,2}[[:space:]][[:digit:]]{1,2}:[[:digit:]]{2}:[[:digit:]]{2},[[:digit:]]{3}.*Exception|Warning"
    # Do you really want "(Exception|Warning)" in brackets instead?
    # As written "Warning" on its own will match the whole expression.
}

FNR == 1 {
    tstampFile = "/" FILENAME ".last"
    sub(/.*\//,".",tstampFile)

    if ( (getline prevTstamp < tstampFile) > 0 ) {
        close(tstampFile)
        print prevTstamp
    }
    else {
        prevTstamp = "1970-01-01 00:00:00,000"
    }

    nextTstamp = ""
}

$0 ~ expression {
    currTstamp = $2 " " $3
    if (currTstamp > prevTstamp) {
        print
        nextTstamp = currTstamp
    }
}

END {
    if (nextTstamp != "") {
        print nextTstamp > tstampFile
    }
}
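The comment in the BEGIN block is about operator precedence: in an extended regular expression, `|` binds more loosely than concatenation, so the final alternative stands on its own and is not anchored by `^`. The sketch below illustrates this with simplified stand-in expressions (`^ERROR` replaces the long level-and-timestamp prefix) and a hypothetical stack-trace line. It also shows what `$2 " " $3` evaluates to for such a line, which may be how a non-timestamp value ends up in the hidden file.

```shell
#!/bin/bash
# Simplified stand-ins for the real expression: "^ERROR" replaces the long
# level-and-timestamp prefix so the precedence issue is easy to see.
ungrouped='^ERROR.*Exception|Warning'
grouped='^ERROR.*(Exception|Warning)'

# Hypothetical stack-trace line that happens to contain "Warning".
trace='    at org.example.ParseWarning.raise(ParseWarning.java:10)'

# Print 1 if the line ($2) matches the regex ($1), 0 otherwise.
matches() {
    printf '%s\n' "$2" | awk -v re="$1" '$0 ~ re { n = 1 } END { print n + 0 }'
}

ungrouped_hit=$(matches "$ungrouped" "$trace")
grouped_hit=$(matches "$grouped" "$trace")
echo "ungrouped=$ungrouped_hit grouped=$grouped_hit"

# For such a line, $2 " " $3 is not a timestamp, yet as a string starting
# with a letter it still compares greater than one starting with a digit.
bogus=$(printf '%s\n' "$trace" |
    awk '{ s = $2 " " $3; if (s > "2012-09-04 16:00:11,639") print s }')
echo "bogus timestamp: [$bogus]"
```

The first echo prints `ungrouped=1 grouped=0`: without parentheses the stack-trace line matches, with them it does not. The second shows that the comparison accepts the class-and-method text as if it were a newer timestamp.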

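Now, do you still have a problem? If so, show us how you run the script, i.e. the bash command you are executing, and post some small sample input that reproduces the problem.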