Question

Dear all, I'm writing a Python program that retrieves EDIFACT log messages from a .gz file... Here is an example of 2 log entries:

2009/03/02 12:13:59.642396 siamp102 mux1-30706 Trace name: MSG
Message sent [con=251575 (APEOBEinMux1), len=2106, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASRPFA+1A0APE+090302:1213+0095JQOL2

2009/03/02 12:14:00.029496 siamp102 mux1-30706 Trace name: MSG
Message sent [con=737 (APIV2_1), len=22370, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASIFQLFS+1ARIOFS+090302:1214+0122V11ON9

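I'd like to write a regular expression that can match some fields from the first line, some from the second line, and others from the third...

Is there any way to write a regular expression, to be used with grep, that matches fields across consecutive lines?

Thanks in advance!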

Answer 1

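Check this earlier post, you may find the answer you're looking for there: bash grep newline

See the pcregrep answer in that thread: pcregrep -M allows matches that span multiple lines.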
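
Since the question mentions a Python program, the same multi-line idea can be sketched with Python's re module: apply one pattern to the whole text instead of going line by line. This is only a minimal sketch under assumptions: the pattern, the group names and the parse_entries helper are made up for illustration and are based solely on the two sample entries in the question, so real logs may need a looser pattern.

```python
import re

# One pattern covering the three consecutive lines of a log entry.
# Assumed layout, taken from the two samples in the question.
ENTRY_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<host>\S+) (?P<process>\S+) "
    r"Trace name: (?P<trace>\S+)\n"
    r"Message sent \[con=(?P<con>\d+) \((?P<conn_name>[^)]*)\), "
    r"len=(?P<len>\d+), CorrID=(?P<corr_id>\w+)\]\n"
    r"(?P<unb>UNB\+.*)$",
    re.MULTILINE,  # let ^ and $ match at every line boundary
)


def parse_entries(text):
    """Return one dict of captured fields per matching three-line entry."""
    return [m.groupdict() for m in ENTRY_RE.finditer(text)]


SAMPLE = """\
2009/03/02 12:13:59.642396 siamp102 mux1-30706 Trace name: MSG
Message sent [con=251575 (APEOBEinMux1), len=2106, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASRPFA+1A0APE+090302:1213+0095JQOL2

2009/03/02 12:14:00.029496 siamp102 mux1-30706 Trace name: MSG
Message sent [con=737 (APIV2_1), len=22370, CorrID=000182C42DE0ED]
UNB+IATB:1+1ASIFQLFS+1ARIOFS+090302:1214+0122V11ON9
"""

for entry in parse_entries(SAMPLE):
    print(entry["time"], entry["len"], entry["unb"])
```

For a .gz file, the text could come from gzip.open(path, "rt").read(), at the cost of holding the whole decompressed log in memory.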

Answer 2

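With grep alone, I don't think this is possible. I suggest using awk or perl instead, so that you can keep some context from the previous lines.

In perl, this would look something like: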

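#!/usr/bin/env perl
use strict;
use warnings;

my $isInLogSection = 'NO';
my $logSectionID;
while (<>) {
    if ( /siamp102/ ) {
        # Start of log section: retrieve its ID (first whitespace-separated field)
        $isInLogSection = 'YES';
        ($logSectionID) = split ' ';
    }

    if ($isInLogSection eq 'YES' && /len=(\d+)/) {
        # Retrieve value of len (printing it is just for illustration)
        my $len = $1;
        print "$logSectionID len=$len\n";
    }

    if ( /^$/ ) {
        # End of log section
        $isInLogSection = 'NO';
    }
}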

In awk, this would look something like:

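BEGIN { isInLogSection = "NO" }
/siamp102/ { isInLogSection = "YES"; logSectionID = $1 }
/len=/ && isInLogSection == "YES" {
    # Retrieve the len value (printing it is just for illustration)
    if (match($0, /len=[0-9]+/))
        print logSectionID, substr($0, RSTART + 4, RLENGTH - 4)
}
/^$/ { isInLogSection = "NO" }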

I'm not 100% sure about the syntax. This is mainly a skeleton to illustrate the principle.
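
The same principle (remember that a section has started, pick fields from later lines, reset on a blank line) could be sketched in Python, which is what the question is written in. This is a rough sketch rather than a finished parser: the function names iter_sections and read_gz and the marker parameter are hypothetical, and the "siamp102" marker and the len= field are taken from the sample entries only.

```python
import gzip
import re

LEN_RE = re.compile(r"len=(\d+)")


def iter_sections(lines, marker="siamp102"):
    """Yield (section_id, len_value) for each log section found in lines."""
    in_log_section = False
    section_id = None
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            # Start of log section: keep its first field as the ID
            in_log_section = True
            section_id = line.split()[0]
        if in_log_section:
            match = LEN_RE.search(line)
            if match:
                # Retrieve the value of len
                yield section_id, int(match.group(1))
        if line == "":
            # A blank line ends the section
            in_log_section = False


def read_gz(path):
    """Parse a gzip-compressed log line by line, without loading it whole."""
    with gzip.open(path, "rt") as handle:
        yield from iter_sections(handle)
```

As in the perl and awk versions, the ID is simply the first field of the header line, which in the samples is the date; combining it with the time field would likely give a more useful key.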
