I'm trying to write a somewhat complicated shell script, and I'll describe it with an example. I have a file containing the following text:
othertextbefore
WORDSFRONT
stuffBEGINstuff
stuffMIDstuff
stuffENDstuff
WORDSBACK
WORDSFRONT
stuffDIFFBEGINstuff
stuffDIFFMIDstuff
stuffDIFFENDstuff
WORDSBACK
(repeating)
othertestafter
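What I need to do is search the file and identify every block enclosed between WORDSFRONT and WORDSBACK. Then I need to take the contents of each block found and do some parsing/text building with it (basically extract BEGIN and so on, and use those pieces to build a new text file).
I'm mainly stuck on the first part: I just need to know how to identify each block of text and then iterate over each block.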
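Answer 0 (score: 2)
This awk script prints only the lines between each WORDSFRONT and WORDSBACK pair, excluding the marker lines themselves: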
#!/usr/bin/awk -f
/WORDSBACK/ {z=0}
z
/WORDSFRONT/ {z=1}
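As a rough sketch of how the same flag technique could also tell the blocks apart, the variant below prefixes each printed line with a block counter. The function name `number_blocks` and the variable names `inblock` and `n` are made up for illustration; the input path is passed as the first argument.

```shell
# Sketch only: number_blocks is a hypothetical helper name.
# Prints every line found between the markers, prefixed with its block number.
number_blocks() {
    awk '
        /WORDSBACK/  { inblock = 0 }        # closing marker: stop printing
        inblock      { print n ": " $0 }    # inside a block: emit "<block>: <line>"
        /WORDSFRONT/ { inblock = 1; n++ }   # opening marker: start the next block
    ' "$1"
}
```

Calling `number_blocks file` on the sample should label the first three content lines with 1 and the next three with 2. The rule order matters: because the printing rule sits between the two marker rules, neither marker line is printed.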
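Answer 1 (score: 1)
"I just need to know how to identify each block of text and then iterate over each block."
From what you have described so far, awk is a natural tool for this. The following shows how to identify blocks and process each line within a block, in this case printing out the BEGIN line: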
$ awk '/WORDSFRONT/{f=1} f && /BEGIN/{print "Found new block with begin=",$0;} /WORDSBACK/{f=0}' file
Found new block with begin= stuffBEGINstuff
Found new block with begin= stuffDIFFBEGINstuff
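In the above, the flag f is used to track whether we are currently inside a block. The same logic can be written in plain bash: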
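while IFS= read -r line
do
    # Opening marker: set the flag
    [[ $line =~ WORDSFRONT ]] && f=1
    # Inside a block: report the BEGIN line
    [[ $f == 1 && $line =~ BEGIN ]] && echo "Found new block with begin=$line"
    # Closing marker: clear the flag
    [[ $line =~ WORDSBACK ]] && f=0
done <file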
When run, the above produces the output:
Found new block with begin=stuffBEGINstuff
Found new block with begin=stuffDIFFBEGINstuff
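To go one step further and treat each block as a unit, one possible sketch collects the BEGIN, MID and END lines while inside a block and emits a single rebuilt line when the closing marker is reached. The function name `summarize_blocks`, the variables `b`, `m` and `e`, and the `begin=... mid=... end=...` output format are all assumptions made for illustration, not something specified in the question.

```shell
# Sketch only: summarize_blocks is a hypothetical helper name and the
# output layout is an assumed example of "rebuilding" text from a block.
summarize_blocks() {
    awk '
        /WORDSFRONT/ { inblock = 1; b = ""; m = ""; e = ""; next }   # new block: reset fields
        /WORDSBACK/  {
            if (inblock) print "begin=" b, "mid=" m, "end=" e        # block complete: emit one line
            inblock = 0
            next
        }
        inblock && /BEGIN/ { b = $0 }   # remember the BEGIN line of this block
        inblock && /MID/   { m = $0 }   # remember the MID line of this block
        inblock && /END/   { e = $0 }   # remember the END line of this block
    ' "$1"
}
```

With `summarize_blocks file > newfile`, each block in the input would become one line in the new file. Doing the per-block work at the closing marker keeps everything in a single pass over the input.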
Answer 2 (score: 0)
Use the Perl range operator:
while (<>)
{
if ( my $num = /WORDSFRONT/ .. /WORDSBACK/ ) {
print "$num\t$.\t$_";
}
}
$num is the line number within the current block. It is 1 when a new block begins.
On the last line of a block, "E0" is appended to the number (for example, 5E0).
if ($num == 2), we are at the 2nd line of the current block.
if ($num =~ /E/), we are at the end of the current block.
$. is the line number within the file.
$_ is the line itself.
With the given sample file, it produces the following output:
1 2 WORDSFRONT
2 3 stuffBEGINstuff
3 4 stuffMIDstuff
4 5 stuffENDstuff
5E0 6 WORDSBACK
1 7 WORDSFRONT
2 8 stuffDIFFBEGINstuff
3 9 stuffDIFFMIDstuff
4 10 stuffDIFFENDstuff
5E0 11 WORDSBACK
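For comparison, awk has a range pattern of its own that selects the same lines, markers included. The sketch below is a different technique from the Perl code above: it prints the file line number and the line, but has no counterpart to the per-block counter in $num. The function name `print_ranges` is hypothetical.

```shell
# Sketch only: print_ranges is a hypothetical helper name.
# The awk range pattern /start/,/end/ matches from a start line through
# the next end line, inclusive, and then rearms for the following block.
print_ranges() {
    awk '/WORDSFRONT/,/WORDSBACK/ { print NR "\t" $0 }' "$1"
}
```

Running `print_ranges file` on the sample should list file lines 2 through 11, each preceded by its line number and a tab.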