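Regex multi-line replacement doesn't strip anything

Time: 2017-04-02 11:27:20

Tags: regex perl

I have a large text file that contains text like the sample below. There are many "CHARTS" blocks, possibly more than 200 in total.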

CHARTS
  Color=14671839
  Layer=7
  [0] Font=MS SAN SERIF,10,0,F,F,
'''other lines'''
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,


CHARTS
 Color=14671839
  Layer=4
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

I am trying to strip every block that has "Layer=7", which means the text above should end up as:

CHARTS
 Color=14671839
  Layer=4
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

I used the following:

$contents =~ s/CHARTS(?s)Layer\=7(?s).*?\[28\]\=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,//g;

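I don't get any errors, but nothing gets stripped.

I have spent the last hour playing with it and getting nowhere. Can anyone point me in the right direction?

Many thanks

3 answers:

Answer 0: (score: 1)

The idea is to skip over every CHARTS block except those whose Layer property is 7, which are the ones that get removed: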

$contents =~ s/CHARTS\R*(^(?!CHARTS) *+(?(?=Layer=(?:7\d+|[^7]))(*SKIP)(*F)|.*)\R)*//gm;
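To make the pattern above easier to follow, here is a sketch of the same substitution written with the /x modifier so that each part can carry a comment. The sample data is a shortened, made-up version of the blocks in the question, and `[ ]` stands in for the literal space because /x ignores unescaped whitespace.

```perl
use strict;
use warnings;

# Shortened sample data (an assumption, not the real file).
my $contents = <<'END';
CHARTS
  Color=14671839
  Layer=7
  [28]=0,0,
CHARTS
  Color=14671839
  Layer=4
  [28]=0,0,
END

$contents =~ s{
    CHARTS \R*                      # block header and the line break(s) after it
    (?:
        ^ (?!CHARTS)                # a line that does not start the next block
        [ ]*+                       # leading spaces, matched possessively
        (?(?=Layer=(?:7\d+|[^7]))   # is this a Layer line with a value other than 7?
            (*SKIP)(*F)             #   yes: give up on this block and move on
          | .*                      #   no: consume the rest of the line
        )
        \R                          # the line break that ends the line
    )*
}{}gmx;

print $contents;
```

With this sample only the Layer=4 block should remain. For comparison, the pattern in the question cannot match because `(?s)` only sets a flag and consumes no text, so `CHARTS(?s)Layer\=7` demands the literal text `CHARTSLayer=7`.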


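Answer 1: (score: 0)

Assuming the file is fairly regular (e.g. there is never any whitespace around the = in Layer=7, and CHARTS is always on a line by itself), this Perl one-liner will print the filtered output to stdout. I trust you know how to redirect it to a new file?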

perl -0777 -ne '/Layer=7\b/ or print for split /^(?=CHARTS)/m' myfile

Output

CHARTS
 Color=14671839
  Layer=4
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
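The one-liner above can also be written out as a small script, which may be easier to adapt. This is only a sketch: the helper name `strip_layer7` and the shortened sample text are made up for illustration, and a real script would read the whole file into a string first (which is what -0777 does for the one-liner).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text with every CHARTS block
# that contains "Layer=7" dropped.
sub strip_layer7 {
    my ($text) = @_;

    # Split just before each line that starts with CHARTS. The lookahead
    # is zero-width, so the CHARTS line stays attached to its own block.
    my @blocks = split /^(?=CHARTS)/m, $text;

    # Keep only the blocks that do not mention Layer=7. The \b stops
    # values such as Layer=77 from being treated as Layer=7.
    return join '', grep { !/Layer=7\b/ } @blocks;
}

# Shortened sample data (an assumption, not the real file).
my $sample = <<'END';
CHARTS
  Color=14671839
  Layer=7
  [28]=0,0,

CHARTS
 Color=14671839
  Layer=4
  [28]=0,0,
END

print strip_layer7($sample);
```

Because the blocks are joined back with an empty string, whatever blank lines followed a kept block in the input are preserved as they were.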

Answer 2: (score: -1)

One possible solution is to split, filter, and rejoin. E.g.

  • split the string on CHARTS, which gives you a list of blocks
  • filter out the unwanted elements (grep on the block contents)
  • rejoin using the same string

use strict;
use warnings;

# Slurp everything after __DATA__ into one string
my $contents = do { local $/; <DATA> };

# Split on CHARTS, drop the blocks with Layer=7, rejoin with CHARTS
$contents = join 'CHARTS', grep { !/Layer\s*=\s*7\b/ } split /\bCHARTS\b/, $contents;
print $contents;

__DATA__
CHARTS
  Color=14671839
  Layer=7
  [0] Font=MS SAN SERIF,10,0,F,F,
'''other lines'''
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

CHARTS
 Color=14671839
  Layer=77
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

CHARTS
  Color=14671839
  Layer=7
  [0] Font=MS SAN SERIF,10,0,F,F,
'''other lines'''
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

CHARTS
 Color=14671839
  Layer=4
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

Output

CHARTS
 Color=14671839
  Layer=77
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

CHARTS
 Color=14671839
  Layer=4
  [0] Font=MS SAN SERIF,10,0,F,F,
...other lines...
  [28]=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,