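I have a scalar variable that holds some information from a file. My goal is to strip from that variable (or from the file) any multi-line entry that contains the words "Administratively down".
The format looks something like this: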
Ethernet2/3 is up
... see middle ...
a blank line
VlanXXX is administratively down, line protocol is down
... a bunch of text indented by two spaces on multiple lines ...
a blank line
Ethernet2/5 is up
... same format as previously ...
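I was thinking that if I could match "administratively down" together with a leading newline (for the blank line), I could apply some logic to the variable to remove the lines in between.
I am using Perl at the moment, but if someone can show me another way of doing this, that would work too.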
Answer 0 (score: 4)
Perl has a rarely used syntax for treating blank lines as the record separator: the -00 flag. See Command Switches in perlrun(1) for details.
For example, given the corpus:
Ethernet2/3 is up
... see middle ...

VlanXXX is administratively down, line protocol is down
... a bunch of text indented by two spaces on multiple lines ...

Ethernet2/5 is up
you can use the following one-liner to print every paragraph except the ones you do not want:
$ perl -00ne 'print unless /administratively down/' /tmp/corpus
When tested against your corpus, the one-liner yields:
Ethernet2/3 is up
... see middle ...

Ethernet2/5 is up
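Since the question mentions a scalar variable rather than a file, here is a minimal sketch of the same paragraph-mode idea inside a script. It assumes the text is already in a variable (called $log here, with made-up sample data) and reads it through an in-memory filehandle with $/ set to the empty string, which is what -00 does on the command line.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the asker's scalar.
my $log = join "\n",
    'Ethernet2/3 is up',
    '  ... see middle ...',
    '',
    'VlanXXX is administratively down, line protocol is down',
    '  ... indented detail ...',
    '',
    'Ethernet2/5 is up',
    '  ... same format as previously ...',
    '';

my $kept = do {
    local $/ = '';    # paragraph mode: records are separated by blank lines
    open my $fh, '<', \$log or die "cannot read from scalar: $!";
    # Read every paragraph and keep those that do not mention the phrase.
    join '', grep { !/administratively down/ } <$fh>;
};

print $kept;
```

The local keeps the change to $/ confined to the do block, so the rest of the program still reads line by line.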
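Answer 1 (score: 0)
So you want to delete everything from the start of the line containing "administratively down" up to and including the next blank line (two consecutive newlines)?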
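$log =~ s/[^\n]+administratively down.+?\n\n//sg;
s/ = regular-expression substitution
[^\n]+ = one or more characters other than a newline, followed by
administratively down = the literal text, followed by
.+? = any amount of text, including newlines, matched non-greedily, followed by
\n\n = two newlines
// = replaced with nothing (i.e. deleted)
s = single-line mode, which lets . match newlines (it normally does not)
g = global, so every matching entry is removed rather than only the first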
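One case the substitution above does not cover is a matching entry at the very end of the text with no blank line after it. Here is a slightly more defensive sketch, not the answer's exact pattern: it anchors at the start of a line with ^ under /m, accepts the end of the string (\z) as a terminator, and uses /g so every entry is handled. The variable names and sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: the last entry is down and has no blank line after it.
my $log = "Ethernet2/3 is up\n  detail\n\n"
        . "Vlan10 is administratively down, line protocol is down\n  detail\n\n"
        . "Ethernet2/5 is up\n  detail\n\n"
        . "Vlan20 is administratively down, line protocol is down\n  detail\n";

# /m lets ^ match at the start of each line, /s lets . cross newlines,
# /g repeats the substitution for every matching entry.
(my $clean = $log) =~ s/^[^\n]*administratively down.*?(?:\n\n|\z)//msg;

print $clean;
```

Copying into $clean first leaves the original $log untouched, which is handy if you want to compare the two afterwards.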
Answer 2 (score: 0)
You can use this pattern:
(?<=\n\n|^)(?>[^a\n]++|\n(?!\n)|a(?!dministratively down\b))*+administratively down(?>[^\n]++|\n(?!\n))*+
Details:
(?<=\n\n|^)                      # preceded by a blank line (two newlines) or the beginning of the string
# everything that is not "administratively down" or a blank line, in detail:
(?>                              # open an atomic group
    [^a\n]++                     # anything that is not an "a" or a newline
  |                              # OR
    \n(?!\n)                     # a newline not followed by another newline
  |                              # OR
    a(?!dministratively down\b)  # an "a" not followed by "dministratively down"
)*+                              # repeat the atomic group zero or more times, possessively
administratively down            # "administratively down" itself
# the rest of the paragraph
(?>                              # open an atomic group
    [^\n]++                      # anything that is not a newline
  |                              # OR
    \n(?!\n)                     # a newline not followed by another newline
)*+                              # repeat the atomic group zero or more times, possessively
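To try the pattern from Perl, here is a rough sketch with two changes that are my assumptions rather than part of the pattern above. First, the lookbehind is split into \A or a two-newline lookbehind, because some Perl versions reject or warn about a lookbehind whose alternatives differ in length. Second, a trailing \n* is added so the blank line after the entry is removed along with it. The names $down_entry, $log and $clean are made up for illustration.

```perl
use strict;
use warnings;

# Under /x a literal space must be written as [ ] to stay significant.
my $down_entry = qr{
    (?: \A | (?<=\n\n) )                # start of string, or just after a blank line
    (?> [^a\n]++                        # anything but "a" or a newline
      | \n(?!\n)                        # a newline that does not start a blank line
      | a(?!dministratively[ ]down\b)   # an "a" that does not begin the phrase
    )*+
    administratively[ ]down
    (?> [^\n]++ | \n(?!\n) )*+          # the rest of the paragraph
    \n*                                 # assumed addition: the blank line that follows
}x;

# Hypothetical sample data.
my $log = "Ethernet2/3 is up\n  line\n\n"
        . "VlanXXX is administratively down, line protocol is down\n  text\n\n"
        . "Ethernet2/5 is up\n  more\n";

(my $clean = $log) =~ s/$down_entry//g;

print $clean;
```

The possessive quantifiers stop the engine from backtracking into paragraphs that cannot match, which is the point of the longer pattern compared with a plain .+? version.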