Question

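I have a file with the contents shown below. I am trying to extract a block delimited by matching start and end patterns and, inside it, to exclude any block whose numeric ID (or perhaps pattern) does not match. Here, everything other than [001] must be excluded. The ID 002 may not be known in advance, so I want only the block that matches [001].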

The file contains:

    text [001] start
    line 1
    line 2
    text [002] mid start
    line 3     
    line 4
    text [002] mid end
    line 5
    line 6
    text [001] end

I need a single block, with the block for the non-matching numeric ID [002] excluded:

    text [001] start
    line 1
    line 2
    line 5
    line 6
    text [001] end

I could not find a clear explanation of this anywhere on the internet. Can anyone help with an awk or sed solution?

To get the block between the start and end patterns, I am trying:

    awk '/[001]/ && /start/, /001/ && /end/' File

Answer 1

Using sed or Perl:

    sed '/001.*start/,/001.*end/!d;/002.*start/,/002.*end/d'

    perl -ne 'print if /001.*start/ .. /001.*end/
                    and not /002.*start/ .. /002.*end/'

With a negative look-ahead assertion, the excluded tag can easily be made dynamic:

    perl -ne 'print if /001.*start/ .. /001.*end/
                    and not /text \[(?!001).*start/ .. /text \[(?!001).*end/'

Answer 2

This awk may help. You may need to adjust the triggers to suit your data:

    awk '/\[001\] start/{f=1} /\[002\] .* start/{f=0} f;  /\[001\] end/{f=0}  /\[002\] .* end/{f=1}' file
    text [001] start
    line 1
    line 2
    line 5
    line 6
    text [001] end

More readable:

    awk '
        /\[001\].*start/ {f=1}
        /\[002\].*start/ {f=0}
        f;
        /\[001\].*end/ {f=0}
        /\[002\].*end/ {f=1}
        ' file

Just change the trigger patterns to reflect your real data.
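
One way to avoid editing the patterns by hand is to pass the IDs in as awk variables and build the trigger regexes from them. This is only a sketch: the variable names keep and skip are assumptions for illustration, and like the one-liner above it still needs to be told the ID of the block to skip.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
text [001] start
line 1
line 2
text [002] mid start
line 3
line 4
text [002] mid end
line 5
line 6
text [001] end
EOF

# keep and skip are hypothetical variable names. The regexes are built by
# string concatenation, so each literal bracket needs a doubled backslash.
result=$(awk -v keep=001 -v skip=002 '
    $0 ~ ("\\[" keep "\\].*start") {f=1}
    $0 ~ ("\\[" skip "\\].*start") {f=0}
    f
    $0 ~ ("\\[" keep "\\].*end")   {f=0}
    $0 ~ ("\\[" skip "\\].*end")   {f=1}
' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Changing the IDs then only means changing the two -v assignments on the command line.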

Answer 3

Suppose we use the variable b1 for block 1 and the variable b2 for block 2.

    awk '/001/ && /start/ { b1=1 }
         /002/ && /start/ { b2=1 }
         (b1 && !b2)
         /002/ && /end/   { b2=0 }
         /001/ && /end/   { b1=0 }' file

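Range expressions are convenient, but to quote Ed Morton: don't use range expressions (e.g. /start/,/end/), because they make trivial tasks slightly briefer but then require duplicated conditions or a complete rewrite for the smallest change in requirements.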
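
To illustrate the point about requirement changes, suppose the [001] marker lines themselves should no longer be printed. With the flag-based script above, a sketch of that change only reorders the rules. This variation and the temporary file are assumptions for illustration, not part of the original answer.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
text [001] start
line 1
line 2
text [002] mid start
line 3
line 4
text [002] mid end
line 5
line 6
text [001] end
EOF

# Same flags as before, but b1 is cleared before the print rule and set
# after it, so the [001] start and end lines are no longer printed.
result=$(awk '/001/ && /end/   { b1=0 }
     /002/ && /start/ { b2=1 }
     (b1 && !b2)
     /002/ && /end/   { b2=0 }
     /001/ && /start/ { b1=1 }' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

No condition had to be duplicated to make this change; only the position of the two b1 rules relative to the print rule moved.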

Answer 4

Assuming your blocks can be nested to any depth and never overlap:

    $ cat tst.awk
    BEGIN { tgtId="001" }

    match($0,/\[[0-9]+\]/) {
        id = substr($0,RSTART+1,RLENGTH-2)
        state = $NF
    }

    state == "start"  { isTgtBlock[++depth] = (id == tgtId ? 1 : 0) }

    isTgtBlock[depth] { print }

    state == "end"    { --depth }

    { id = state = "" }

    $ awk -f tst.awk file
    text [001] start
    line 1
    line 2
    line 5
    line 6
    text [001] end
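
The any-depth assumption can be exercised with a slightly deeper input. The sketch below inlines the same program and feeds it a hypothetical sample in which a [003] block sits inside the [002] block; the sample data and the temporary file are assumptions for illustration.

```shell
# Hypothetical input with three nesting levels: [001] > [002] > [003].
sample=$(mktemp)
cat > "$sample" <<'EOF'
text [001] start
line 1
text [002] mid start
line 2
text [003] inner start
line 3
text [003] inner end
line 4
text [002] mid end
line 5
text [001] end
EOF

result=$(awk '
BEGIN { tgtId="001" }
match($0,/\[[0-9]+\]/) {
    id = substr($0,RSTART+1,RLENGTH-2)
    state = $NF
}
# Push a flag for every block that opens: 1 if it is the target, else 0.
state == "start"  { isTgtBlock[++depth] = (id == tgtId ? 1 : 0) }
# Print only while the innermost open block is the target block.
isTgtBlock[depth] { print }
# Pop the flag when a block closes.
state == "end"    { --depth }
{ id = state = "" }
' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Because the flags form a stack, closing [003] falls back to the flag for [002], so line 4 stays excluded and printing resumes only once [002] has closed.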

Answer 5

This might work for you (GNU sed):

    sed -n '/\[001\]/,/\[001\]/{/\[002\]/,/\[002\]/!p}' file

This prints only the lines between the [001] delimiters and excludes the lines between the [002] delimiters.
