Question

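I have a file that I want to process in bash; awk, sed, grep, or a similar tool would be fine. A start pattern and an end pattern each occur several times on a single line of the file. I want to extract everything between each pair of occurrences and print each match on its own line.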

I have already tried this:

cat file.txt | grep -o 'pattern1.*pattern2'

But this prints everything from the first pattern1 up to the last matching pattern2.

$ cat file.txt
pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.

I would like to get:

pattern1 this is the first content pattern2
pattern1 this is the second content pattern2

Answer 1

This might work for you (GNU sed):

sed -n '/pattern1.*pattern2/{s/pattern1/\n&/;s/.*\n//;s/pattern2/&\n/;P;D}' file

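Set the -n option so that output is printed only explicitly.

Process only lines that contain pattern1 followed by pattern2.

Prepend a newline to pattern1.

Remove everything up to and including the introduced newline.

Append a newline after pattern2.

Print the first line of the pattern space, delete it, and repeat.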
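
The steps above can be laid out as a commented, multi-line version of the same script. This is only a sketch and assumes GNU sed, since it relies on \n in the replacement text. The variables sample and result are names introduced here for illustration.

```shell
# Sample line from the question, held in a variable so no file is needed.
sample='pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.'

result=$(printf '%s\n' "$sample" | sed -n '
/pattern1.*pattern2/ {
    # Put a newline in front of the first pattern1 ...
    s/pattern1/\n&/
    # ... and drop everything up to and including that newline.
    s/.*\n//
    # Put a newline after the first remaining pattern2.
    s/pattern2/&\n/
    # Print up to the newline, delete that part, and rerun the script
    # on what is left without reading a new input line.
    P
    D
}')

printf '%s\n' "$result"
```

Because D restarts the script on the leftover text, the address /pattern1.*pattern2/ acts as the loop condition: once no complete pair is left, nothing more is printed for that line.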

Answer 2

Try GNU sed:

 sed -E 's/(pattern2).*(pattern1)(.*\1).*/\1\n\2\3/' file.txt
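
To make the behaviour of this one-liner concrete, here is a small sketch, assuming GNU sed for -E and for \n in the replacement. The substitution runs once per line and inserts exactly one newline. It therefore yields the two desired lines for the sample input, but it cannot print more than two lines per input line. The function name extract, the variable names, and the three-pair input are made up for illustration.

```shell
two='pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.'
three='pattern1 a pattern2 x pattern1 b pattern2 y pattern1 c pattern2 z'

# The command from the answer, reading standard input instead of file.txt.
extract() {
    sed -E 's/(pattern2).*(pattern1)(.*\1).*/\1\n\2\3/'
}

two_out=$(printf '%s\n' "$two" | extract)
three_out=$(printf '%s\n' "$three" | extract)

# Two pairs on the line: both are printed, one per line.
printf '%s\n' "$two_out"

# Three pairs on the line: the output still has only two lines.
printf '%s\n' "$three_out" | wc -l
```

If a line can hold more than two pairs, a looping approach such as the P;D script in Answer 1 is the safer choice.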

Answer 3

If you don't have access to tools that support lookarounds, this approach, although lengthy, will work robustly using standard tools on any UNIX box:

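awk '{
    # Escape @, {, and } so that { and } are free to stand in for the patterns.
    gsub(/@/,"@A"); gsub(/{/,"@B"); gsub(/}/,"@C"); gsub(/pattern1/,"{"); gsub(/pattern2/,"}")
    out = ""
    # Collect each {...} span that contains no other brace, one per output line.
    while( match($0,/{[^{}]*}/) ) {
        out = (out=="" ? "" : out ORS) substr($0,RSTART,RLENGTH)
        $0 = substr($0,RSTART+RLENGTH)
    }
    $0 = out
    # Restore the patterns, then undo the escaping in reverse order.
    gsub(/}/,"pattern2"); gsub(/{/,"pattern1"); gsub(/@C/,"}"); gsub(/@B/,"{"); gsub(/@A/,"@")
} 1' file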

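The above works by first changing any { and } characters in the input to the strings @B and @C (after changing @ itself to @A, so those strings cannot collide with text already in the input). That guarantees { and } do not occur in the input, so they can stand in for pattern1 and pattern2 and be used in a negated character class to find the target strings. Afterwards, all changed characters are returned to their original values. Here is the same script with some prints added to make it more obvious what happens at each step: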

awk '{
    print "1): " $0 ORS
    gsub(/@/,"@A"); gsub(/{/,"@B"); gsub(/}/,"@C"); gsub(/pattern1/,"{"); gsub(/pattern2/,"}")
    print "2): " $0 ORS
    out = ""
    while( match($0,/{[^{}]*}/) ) {
        out = (out=="" ? "" : out ORS) substr($0,RSTART,RLENGTH)
        $0 = substr($0,RSTART+RLENGTH)
    }
    $0 = out
    print "3): " $0 ORS
    gsub(/}/,"pattern2"); gsub(/{/,"pattern1"); gsub(/@C/,"}"); gsub(/@B/,"{"); gsub(/@A/,"@")
    print "4): " $0 ORS
} 1' file
1): pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.

2): { this is the first content } this is some other stuff { this is the second content } this is the end of the file.

3): { this is the first content }
{ this is the second content }

4): pattern1 this is the first content pattern2
pattern1 this is the second content pattern2

pattern1 this is the first content pattern2
pattern1 this is the second content pattern2
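
The reason for escaping @, { and } first becomes clearer with input that already contains those characters. The sketch below wraps the same steps in a hypothetical shell function, extract_pairs (a name introduced here, not part of the answer), and restores the escaped characters by mapping @C and @B back to } and {. The test line is invented for illustration.

```shell
# Hypothetical wrapper around the awk steps; reads lines from standard input.
extract_pairs() {
    awk '{
        # Escape @ first, then the braces, then let braces stand for the patterns.
        gsub(/@/,"@A"); gsub(/{/,"@B"); gsub(/}/,"@C")
        gsub(/pattern1/,"{"); gsub(/pattern2/,"}")
        out = ""
        while ( match($0,/{[^{}]*}/) ) {
            out = (out=="" ? "" : out ORS) substr($0,RSTART,RLENGTH)
            $0 = substr($0,RSTART+RLENGTH)
        }
        $0 = out
        # Put the patterns back, then undo the escaping in reverse order.
        gsub(/}/,"pattern2"); gsub(/{/,"pattern1")
        gsub(/@C/,"}"); gsub(/@B/,"{"); gsub(/@A/,"@")
    } 1'
}

# Input that already contains braces, an @, and even a literal "@B".
tricky='x pattern1 a{b}@c pattern2 y pattern1 @B pattern2 z'
result=$(printf '%s\n' "$tricky" | extract_pairs)
printf '%s\n' "$result"
```

The literal @B in the input survives because it is stored as @AB during processing, which the @B restore step does not match.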

How to print multiple patterns on separate lines
