Question

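I'm trying to write a somewhat complicated shell script, and an example is the easiest way to describe it. I have a file containing the following text: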

othertextbefore
WORDSFRONT
stuffBEGINstuff
stuffMIDstuff
stuffENDstuff
WORDSBACK
WORDSFRONT
stuffDIFFBEGINstuff
stuffDIFFMIDstuff
stuffDIFFENDstuff
WORDSBACK
(repeating)
othertestafter

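What I need to do is search the file and identify every block enclosed between WORDSFRONT and WORDSBACK. Then I need to take the contents of each block found and do some parsing/text building with it (basically extract BEGIN and so on, and use those pieces to build a new text file).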

I'm mostly stuck on the first part: I just need to know how to identify each block of text and then loop over each block.

Answer 1

This awk script prints only the lines between the markers, leaving out the WORDSFRONT and WORDSBACK lines themselves:

#!/usr/bin/awk -f
/WORDSBACK/ {z=0}
z
/WORDSFRONT/ {z=1}

Answer 2

I just need to know how to identify each block of text and then loop over each block.

Using awk

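From what you have described so far, awk is the natural tool for this. The following shows how to identify the blocks and process each line inside a block, in this case printing the BEGIN lines: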

$ awk '/WORDSFRONT/{f=1} f && /BEGIN/{print "Found new block with begin=",$0;} /WORDSBACK/{f=0}' file
Found new block with begin= stuffBEGINstuff
Found new block with begin= stuffDIFFBEGINstuff

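In the above, the flag f records whether we are currently inside a block: it is set to 1 on a WORDSFRONT line and reset to 0 on a WORDSBACK line.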
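Since the question also asks about rebuilding text from the pieces of each block, the same flag technique can be extended to remember lines while inside a block and emit one result when the block closes. The following is only a sketch under assumptions: the function name summarize_blocks, the variables b, m, e and n, and the output format are all made up for illustration, and it assumes each block has at most one BEGIN, MID and END line.

```shell
# Hypothetical helper: print one summary line per WORDSFRONT..WORDSBACK block.
# $1 = input file laid out like the example in the question.
summarize_blocks() {
    awk '
        # Block opens: set the flag, count the block, forget the previous pieces.
        /WORDSFRONT/ { f = 1; n++; b = m = e = ""; next }
        # Block closes: emit what was collected, then clear the flag.
        /WORDSBACK/  { if (f) print n ": " b ", " m ", " e; f = 0; next }
        # Inside a block: remember the lines of interest.
        f && /BEGIN/ { b = $0 }
        f && /MID/   { m = $0 }
        f && /END/   { e = $0 }
    ' "$1"
}
```

Calling summarize_blocks file on the example input should print one line per block, starting with the block number. Doing the work in the WORDSBACK rule is what gives a per-block "loop body".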

Using shell

while IFS= read -r line
do
    [[ $line =~ WORDSFRONT ]] && f=1
    [[ $f == 1 && $line =~ BEGIN ]] && echo "Found new block with begin=$line"
    [[ $line =~ WORDSBACK ]] && f=0
done <file

When run, the above produces the output:

Found new block with begin=stuffBEGINstuff
Found new block with begin=stuffDIFFBEGINstuff
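If each block has to be handled as a unit rather than line by line, one option is to buffer the lines of a block and pass them to a handler when WORDSBACK is reached. This is a minimal sketch, not part of the original answer: the names handle_block and for_each_block are hypothetical, and it uses case patterns instead of [[ =~ ]] so that it also runs under a plain POSIX sh.

```shell
# Hypothetical handler, called once per block.
# $1 = block number, $2 = the block's inner lines, each newline-terminated.
handle_block() {
    printf '%s' "$2" | while IFS= read -r l; do
        case $l in
            *BEGIN*) printf 'block %s begin=%s\n' "$1" "$l" ;;
        esac
    done
}

# $1 = input file. The ( ) body runs in a subshell, so the variables
# used here do not leak into the calling script.
for_each_block() (
    nl='
'
    inside=0 count=0 block=
    while IFS= read -r line; do
        case $line in
            *WORDSFRONT*)
                # Start of a block: begin with an empty buffer.
                inside=1 block= ;;
            *WORDSBACK*)
                # End of a block: hand the buffered lines to the handler.
                if [ "$inside" -eq 1 ]; then
                    count=$((count + 1))
                    handle_block "$count" "$block"
                fi
                inside=0 ;;
            *)
                # Ordinary line: keep it only while inside a block.
                if [ "$inside" -eq 1 ]; then
                    block=$block$line$nl
                fi ;;
        esac
    done < "$1"
)
```

Usage would be for_each_block file. Replacing the body of handle_block is where the per-block parsing and text building from the question would go.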

Answer 3

Using the Perl range operator

while (<>)
{
  if ( my $num =  /WORDSFRONT/ .. /WORDSBACK/ ) {
    print "$num\t$.\t$_";
  }
}

$num is the line number within the block. It is 1 when a new block begins.
On the last line of a block, "E0" is appended to it.
if ($num == 2), we are at the 2nd line of the current block.
if ($num =~ /E/), we are at the end of the current block.

$. is the line number within the file.
$_ is the actual line

With the given example file, it produces the following output:

1       2       WORDSFRONT
2       3       stuffBEGINstuff
3       4       stuffMIDstuff
4       5       stuffENDstuff
5E0     6       WORDSBACK
1       7       WORDSFRONT
2       8       stuffDIFFBEGINstuff
3       9       stuffDIFFMIDstuff
4       10      stuffDIFFENDstuff
5E0     11      WORDSBACK
