Search and replace content between specific tags

Time: 2014-11-25 08:35:43

Tags: perl html-parser

#!/usr/bin/perl
use strict;
use warnings;
my $html = q|
    <html>
    <head>
    <style>
    .classname{
        color: red;
    }
    </style>
    </head>
    <body>
    classname will have a color property.
    </body>
    </html>
|;
$html=~s/classname/NEW/g;
print $html;

This replaces classname in both places. How can I restrict the replacement to the contents of <body> only? I would like to see it done using HTML::Parser or HTML::TreeBuilder.

1 answer:

Answer 0: (score: 3)

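I believe this does what you want: it uses HTML::TreeBuilder to apply your regexp substitution for classname to every descendant of the body element.

I added another dummy div to the input to make sure it is handled correctly.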

#!/usr/bin/perl
use strict;
use warnings;

use HTML::TreeBuilder;

my $html = q|
    <html>
    <head>
    <style>
    .classname{
        color: red;
    }
    </style>
    </head>
    <body>
    classname will have a color property.
    <div>more text with classname in it</div>
    </body>
    </html>
|;

my $tree = HTML::TreeBuilder->new_from_content($html);

replace_text( $tree->find_by_tag_name("body") );

print $tree->as_HTML."\n";

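# Recursively replace "classname" in every text node below the given element.
sub replace_text {
    my $html_element = shift;

    for my $el ( $html_element->content_refs_list ) {

        # A reference here means a child element, so descend into it.
        if ( ref( $$el ) ) {
            replace_text( $$el );
            next;
        }

        # Otherwise $$el is a plain text node; edit it in place.
        $$el =~ s/classname/NEW/g;
    }

    return $html_element;
}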
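
Since the question also mentions HTML::Parser, here is a minimal event-based sketch for comparison. It is not part of the original answer, and the names $in_body and $out are only illustrative. It assumes the document has a single body element and that copying every other event through unchanged is acceptable. unbroken_text(1) is enabled so that a run of text is not delivered in several chunks, which could otherwise split a match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::Parser;

my $html = q|
    <html>
    <head>
    <style>
    .classname{
        color: red;
    }
    </style>
    </head>
    <body>
    classname will have a color property.
    <div>more text with classname in it</div>
    </body>
    </html>
|;

my $in_body = 0;     # true while the parser is between <body> and </body>
my $out     = '';    # rebuilt document (illustrative accumulator)

my $parser = HTML::Parser->new(
    api_version => 3,

    # Start tags: note when <body> opens, then copy the tag through.
    start_h => [ sub {
        my ( $tagname, $text ) = @_;
        $in_body = 1 if $tagname eq 'body';
        $out .= $text;
    }, 'tagname, text' ],

    # End tags: note when </body> closes, then copy the tag through.
    end_h => [ sub {
        my ( $tagname, $text ) = @_;
        $in_body = 0 if $tagname eq 'body';
        $out .= $text;
    }, 'tagname, text' ],

    # Text: substitute only while inside the body.
    text_h => [ sub {
        my ($text) = @_;
        $text =~ s/classname/NEW/g if $in_body;
        $out .= $text;
    }, 'text' ],

    # Comments, declarations and everything else pass through untouched.
    default_h => [ sub { $out .= shift }, 'text' ],
);

$parser->unbroken_text(1);
$parser->parse($html);
$parser->eof;

print $out;
```

Unlike the TreeBuilder version, this keeps the original markup byte for byte outside the replaced text, but it would also rewrite text inside any script or style element that happens to sit within the body.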