Perl regex to change all occurrences of a pattern inside a container

Asked: 2012-05-28 08:01:19

Tags: regex perl

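I have a string (just for testing) and I want to replace all instances of <p> that are under the div <div id="text">. How can I do that?

I tested with the m and s modifiers, but in vain (only the first one gets replaced). My Perl code is given below: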

#!/usr/bin/perl
use strict;
use warnings;

my $string = <<STRING;
<div id="main">
    hellohello
    <div id="text">
        nokay.
        <p>This is p1, SHUD B replaced</p>
        Alright
        <p>This is p2, SHUD B replaced</p>
        Yes 2
        <p>this is P3, SHUD B replaced</p>
        Okay done
        bye
    </div>
    bye
    <p>this is not under the div whose id is text and SHUDN'T b replaced</p>
</div>

STRING

my $str_bak = $string;
print "String is : \n$string\n\n";

$string =~ s/(<div id="text">.*?)<p>(.*)(<\/p>.*?<\/div>)/$1<p style="text-align:left;">$2 $3/sig;

print "String now is : \n$string\n\n";

4 Answers:

Answer 0 (score: 2)

Using XML::XSH2:

open :F html 1.html ;
for //div[@id="text"]/p
    set @style "text-align:left;" ;
save :b ;

Answer 1 (score: 0)

Try this:

(?is)<p>.+?</p>(?=.*?</div>)

Code

$subject =~ s!(?is)<p>.+?</p>(?=.*?</div>)!!g;

Explanation

"
(?is)        # Match the remainder of the regex with the options: case insensitive (i); dot matches newline (s)
<p>          # Match the characters “<p>” literally
.            # Match any single character
   +?           # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
</p>         # Match the characters “</p>” literally
(?=          # Assert that the regex below can be matched, starting at this position (positive lookahead)
   .            # Match any single character
      *?           # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
   </div>       # Match the characters “</div>” literally
)
"

Update

Change your code as follows:

#!/usr/bin/perl
use strict;
use warnings;

my $string = <<STRING;
<div id="main">
    hellohello
    <div id="text">
        nokay.
        <p>This is p1, SHUD B replaced</p>
        Alright
        <p>This is p2, SHUD B replaced</p>
        Yes 2
        <p>this is P3, SHUD B replaced</p>
        Okay done
        bye
    </div>
    bye
    <p>this is not under the div whose id is text and SHUDN'T b replaced</p>
</div>

STRING

my $str_bak = $string;
print "String is : \n$string\n\n";

$string =~ s!(?is)<p>.+?</p>(?=.*?</div>)!!g;

print "String now is : \n$string\n\n";

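That script removes the matched <p> elements and prints everything else. Note that the lookahead only requires some </div> later in the string, so on this sample the last <p> (the one outside <div id="text">) is matched and removed as well.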
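A small sketch for checking which paragraphs the pattern matches, using a shortened copy of the sample input. The variable names `$html` and `@matches` are illustrative and not part of the answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened version of the sample input from the question.
my $html = <<'HTML';
<div id="main">
    <div id="text">
        <p>p1</p>
        <p>p2</p>
    </div>
    <p>outside</p>
</div>
HTML

# In list context with /g and no capture groups, the match operator
# returns every full match of the pattern.
my @matches = $html =~ m!(?is)<p>.+?</p>(?=.*?</div>)!g;

print scalar(@matches), " match(es):\n";
print "  $_\n" for @matches;
```

On this input the list includes `<p>outside</p>`, because the closing tag of the outer div still follows it.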

Answer 2 (score: 0)

First, I should say that I used the trick explained in this post: Passing a regex substitution as a variable in Perl?

#!/usr/bin/perl
use strict;
use warnings;

my $string = <<STRING;
<div id="main">
    hellohello
    <div id="text">
        nokay.
        <p>This is p1, SHUD B replaced</p>
        Alright
        <p>This is p2, SHUD B replaced</p>
        Yes 2
        <p>this is P3, SHUD B replaced</p>
        Okay done
        bye
    </div>
    bye
    <p>this is not under the div whose id is text and SHUDN'T b replaced</p>
</div>

STRING

my $str_bak = $string;
print "String is : \n$string\n\n";

$string =~ s/(<div id="text">.*?)<p>(.*)(<\/p>.*?<\/div>)/$1<p style="text-align:left;">$2 $3/sig;

sub modify
{
  my($text, $code) = @_;
  $code->($text);
  return $text;
}

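my $new_text = modify($string, sub {
    # Isolate the <div id="text">...</div> block first
    my $div = '(<div id="text">.*?</div>)';
    $string =~ m#$div#is;
    my $found = $1;
    print "found : \n$found\n\n";
    # Add the style attribute to every <p> inside the isolated block
    my $repl = modify ($found, sub {
         $_[0] =~ s/<p>/<p style="text-align:left;">/g
    }) ;
    # \Q...\E matches the block literally (it contains metacharacters such as '.')
    $_[0] =~ s/\Q$found\E/$repl/
});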

print "Result : \n$new_text\n\n";

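The trick is to use the modify sub, which allows higher-order processing of the text. We can then isolate <div id="text">...</div> and apply the <p> substitution to it.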
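The same isolate-then-substitute idea can probably be written more compactly with the /e modifier, which evaluates the replacement as code. This is only a sketch: `style_paragraphs` is a hypothetical helper name, the input is a shortened copy of the sample, and it assumes the container holds no nested <div>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened version of the sample input from the question.
my $html = <<'HTML';
<div id="main">
    <div id="text">
        <p>p1</p>
        text
        <p>p2</p>
    </div>
    <p>outside</p>
</div>
HTML

# Hypothetical helper (not from the answer): style every bare <p> inside
# the <div id="text">...</div> block, assuming no nested <div> inside it.
sub style_paragraphs {
    my ($text) = @_;
    $text =~ s{(<div id="text">)(.*?)(</div>)}{
        # Copy the captures first: the inner s/// resets $1, $2, $3.
        my ($open, $body, $close) = ($1, $2, $3);
        $body =~ s/<p>/<p style="text-align:left;">/gi;
        $open . $body . $close;
    }sie;
    return $text;
}

my $result = style_paragraphs($html);
print $result;
```

The outer pattern only captures the container, so the inner /g substitution never sees the paragraph that follows the closing </div>.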

Answer 3 (score: 0)

Thanks, everyone, for your help.

I couldn't find a single regex that does it, so I did it with a "workaround". Here is how:

while( $val =~ s/(<div id="article">.*?)<p>/$1<p style="text-align:left;">/sig )
{  }

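So basically the regex only applies to the first match, which is why we repeat it in an empty loop (the loop exits when there are no more matches to replace).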
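One caveat with the loop as written: under /s the lazy `.*?` can run past the closing </div>, so once the inner paragraphs are used up, a later bare <p> outside the container gets rewritten too. A possible variant is sketched below on a shortened, made-up input. It assumes the container has no nested <div> and replaces `.*?` with a pattern that refuses to cross `</div>`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative input; the id "article" follows the loop above.
my $val = <<'HTML';
<div id="main">
    <div id="article">
        <p>p1</p>
        <p>p2</p>
    </div>
    <p>outside</p>
</div>
HTML

# (?:(?!</div>).)*? consumes characters lazily but cannot step over a
# closing </div>, so a <p> after the container is never reached.
# Each pass styles the first remaining bare <p>; the loop stops when
# the substitution no longer matches.
1 while $val =~ s{(<div id="article">(?:(?!</div>).)*?)<p>}
                 {$1<p style="text-align:left;">}si;

print $val;
```

The /g flag is dropped here because each pass can only match once per container anyway.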