How to extract iframes from text using Perl's Mojo::DOM

Time: 2013-04-30 19:58:37

Tags: perl mojo

I have this text, and when I do this:

print STDERR (Mojo::DOM->new($args->{'body'})->at('iframe')); 

the output is:

<iframe allowfullscreen="" frameborder="0" height="360" scrolling="no" 
src="http://localhost:8000/embed/static/clips/2012/12/17/28210/test-rush" width="480">
</iframe>

It only prints the first iframe in the body. Why doesn't it print the other iframes, and can I put all the iframes in an array?

1 answer:

Answer 0: (score: 4)

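According to the Mojo::DOM documentation, the at function only finds the first element matching the selector, so it returns just one element. I think find is what you're after, since it returns a collection of all matches: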

use strict;
use warnings;

use Mojo::DOM;

my $dom = Mojo::DOM->new();
while (<DATA>) {
    $dom->append_content($_);
}

#print $dom;

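# join() is used because recent Mojo::Collection versions no longer stringify on their own
print $dom->find('iframe')->join("\n"), "\n";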

__DATA__
<p>No one's telling the truth anymore, and that makes the numbers suspect.</p>
<p><iframe width="480" height="360" src="http://localhost:8000/embed/static/clips/2012/12/17/28210/test-rush" allowfullscreen="" frameborder="0" scrolling="no"></iframe></p>
<p>Instead of addressing the fact that some text</p>
<p><iframe width="480" height="360" src="http://localhost:8000/embed//static/video/2012/09/07/fnc-ff-20120907-doocytaxes" allowfullscreen="" frameborder="0" scrolling="\&quot;no\&quot;"></iframe></p>
<p>The very first example AP cites was already corrected.some text ....Reacting to recent <a href="/blog/2013/04/17/major-errors-undermine-key-argument-for-austeri">research</a> that has questions.</p>
<p><iframe width="480" height="360" src="http://localhost:8000/embed/static/clips/2013/04/29/29939/fnc-an-20130429-hemmermooredebtgdp" allowfullscreen="" frameborder="0" scrolling="no"></iframe></p>
<p>&nbsp;Arriving at such a conclusion requires not only obscuring the importance in pushing global austerity <a href="/static/images/item/gdp-components.jpg">strong measures</a> of too little spending.</p>

This prints your iframes:

<iframe allowfullscreen="" frameborder="0" height="360" scrolling="no" src="http://localhost:8000/embed/static/clips/2012/12/17/28210/test-rush" width="480"></iframe>
<iframe allowfullscreen="" frameborder="0" height="360" scrolling="\&quot;no\&quot;" src="http://localhost:8000/embed//static/video/2012/09/07/fnc-ff-20120907-doocytaxes" width="480"></iframe>
<iframe allowfullscreen="" frameborder="0" height="360" scrolling="no" src="http://localhost:8000/embed/static/clips/2013/04/29/29939/fnc-an-20130429-hemmermooredebtgdp" width="480"></iframe>

Edit:

  1. You can iterate over each iframe with the each function of Mojo::Collection:

    my $collection = $dom->find('iframe');

    $collection->each(sub {
        my ($e, $count) = @_;
        print "$count: $e\n"; # Or do something besides print. 
     });
    
  2. You can also dereference the collection as an array with @ and loop over it:

    foreach (@$collection) {
       print "\n Next Elt.:", $_->{src}, ",\n"; # attributes of each iframe are still accessible with ->
    }
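
To answer the "can I put all iframes in an array" part directly, here is a minimal sketch. It assumes that each called without a callback returns the collection's elements as a plain list, and that map returns a new collection built from the callback's results, as described in the Mojo::Collection documentation. The $html sample and its URLs are hypothetical stand-ins for $args->{'body'}.

```perl
use strict;
use warnings;

use Mojo::DOM;

# Hypothetical sample markup standing in for $args->{'body'}
my $html = <<'HTML';
<p>First paragraph</p>
<p><iframe src="http://localhost:8000/embed/a" width="480"></iframe></p>
<p><iframe src="http://localhost:8000/embed/b" width="480"></iframe></p>
HTML

my $dom = Mojo::DOM->new($html);

# each() with no callback returns the matched elements as a list
my @iframes = $dom->find('iframe')->each;

# map() builds a new collection; here it holds each iframe's src attribute
my @sources = $dom->find('iframe')->map(sub { $_->attr('src') })->each;

print scalar(@iframes), " iframes found\n";
print "$_\n" for @sources;
```

@iframes holds Mojo::DOM objects, so each entry can still be queried or stringified, while @sources holds plain strings.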