Question

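I have only recently started learning regular expressions. My first foray was through the search-and-replace dialog of Notepad++ on Windows. I now realize that matching patterns across lines does not seem to be as easy with other tools. In Notepad++, I just use \n.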

When processing regular expressions with Perl on the command line, I have a relatively easy time if I use "slurp mode". I can use a line like

perl -0777 -pe 's/pattern/replace-text/' foo.txt

and "pattern" can contain as many \n as I want.

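On the Linux command line, what alternatives can I use for regular expressions that contain (\r)\n? Matching patterns in text that spans lines is especially important to me.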

Answer 1

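If you have Perl Compatible Regular Expressions installed, take a look at pcregrep. (If pcre2 is installed, the command will be pcre2grep.) Either way, if the man pages are installed as well, look up the -M (capital M) option, which lets you match across multiple lines. If you don't have the man pages installed, all of the documentation is available at pcre.org.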

A few examples follow, but first the input file:

$ cat malt
this is foo
bar baz

this is foo'd up
beyond all barz

this is foo
        bar

foo
  bar

blah blah foobar blah

Now, a regular expression that matches a newline:

$ pcregrep -M 'foo\nbar' malt
this is foo
bar baz

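For the following, I'll use the -n option (same as grep's: print line numbers) to make it more obvious how many matches there are and on which line the first part of each match occurs. Here I am trying to match "foo" followed by a newline, zero or more whitespace characters (i.e. optional), and then "bar":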

$ pcregrep -nM 'foo\n\s*bar' malt
1:this is foo
bar baz
7:this is foo
        bar
10:foo
  bar

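This time (with the optional whitespace) we get three matches, starting on lines 1, 7, and 10. Another consideration is whether you want the dot (.) to match newlines. That can be done with the (?s) pattern modifier, for example: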

$ pcregrep -nM '(?s:foo.*bar)' malt
1:this is foo
bar baz

this is foo'd up
beyond all barz

this is foo
        bar

foo
  bar

blah blah foobar blah

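Be sure to read up on "greedy" versus "lazy" matching. Note that the above is a single match, starting at the "foo" on line 1 and running all the way to the last "bar" in the file. Compared with how "lazy" consumption works, the difference is significant. We can get lazy behavior with the ? quantifier modifier: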

$ pcregrep -nM '(?s:foo.*?bar)' malt
1:this is foo
bar baz
4:this is foo'd up
beyond all barz
7:this is foo
        bar
10:foo
  bar
13:blah blah foobar blah

The latter is identical to the former except for the '?' laziness modifier.
