I have HTML content that looks like this:
html code ... <div>content1</div> html code ...
html code ... <div>content2</div> html code ...
I want to extract content1, content2, content3, ... from the HTML, each on its own line (content1, newline, content2, newline, content3). Any ideas? Thanks in advance.
Answer 0 (score: 0)
Here is an example using Mojo::DOM, inspired by this StackOverflow answer:
#!/usr/bin/env perl
use strict ;
use warnings ;
use Mojo::DOM ;
my $html = <<EOHTML;
<!DOCTYPE html>
<html>
<head>
<title>Sample HTML with 2 divs</title>
</head>
<body>
<div>
Four score and seven years ago our fathers brought forth on this
continent a new nation, conceived in liberty, and dedicated to the
proposition that all men are created equal.
</div>
<div>
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
</div>
</body>
</html>
EOHTML
# Parse the document, then print the text of each <div> on its own line.
my $dom = Mojo::DOM->new ;
$dom->parse( $html ) ;
for my $div ( $dom->find( 'div' )->each ) {
    print $div->all_text . "\n" ;
}
The output is:
Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
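The single-line output above relies on `all_text` collapsing the line breaks inside each div. As far as I know, newer Mojolicious releases no longer trim or squash whitespace in `all_text`, so here is a minimal sketch that normalizes the whitespace explicitly and should behave the same either way. The helper name `div_texts` and the sample input are made up for illustration; they are not part of the original answer.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical helper: return the text of every <div> in $html,
# one list element per div, with whitespace normalized by hand.
sub div_texts {
    my ($html) = @_;
    my $dom = Mojo::DOM->new($html);
    my @texts;
    for my $div ( $dom->find('div')->each ) {
        my $text = $div->all_text;
        $text =~ s/\s+/ /g;          # collapse newlines and runs of spaces
        $text =~ s/^\s+|\s+$//g;     # strip leading and trailing whitespace
        push @texts, $text;
    }
    return @texts;
}

# Sample input shaped like the question: divs surrounded by other markup.
my $sample = '<p>x</p><div>content1</div><p>y</p><div> content2 </div>';
print "$_\n" for div_texts($sample);
# prints:
# content1
# content2
```

Note that `find('div')` also matches nested divs, so with nested markup the outer div's text would include the inner div's text as well; a narrower CSS selector would be needed in that case.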