Finding a value in an HTML file using HTML::TreeBuilder

Time: 2012-09-27 07:22:24

Tags: perl

Below is the data in my HTML file. I want to find a value in it using HTML::TreeBuilder.

<table id="stats" cellpadding="0" cellspacing="0">
<tbody>
    <tr class="row-even">
        <td class="stats_left">Main Domain</td>
        <td class="stats_right"><b>myabcab.com</b></td>
    </tr>
    <tr class="row-odd">
        <td class="stats_left">Home Directory</td>
        <td class="stats_right">/home/abc</td>
    </tr>
    <tr class="row-even">
        <td class="stats_left">Last login from</td>
        <td class="stats_right">22.32.232.223&nbsp;</td>
    </tr>
    <tr class="row-odd">
        <td class="stats_left">Disk Space Usage</td>
        <td class="stats_right">30.2 / &#8734; MB<br>
        <div class="stats_progress_bar">
        <div class="cpanel_widget_progress_bar" title="0%"
            style="position: relative; width: 100%; height: 100%; padding: 0px; margin: 0px; border: 0px">
        </div>
        <div class="cpanel_widget_progress_bar_percent" style="display: none">0</div>
        </div>
        </td>
    </tr>
    <tr class="row-even">
        <td class="stats_left">Monthly Bandwidth Transfer</td>
        <td class="stats_right">0 / &#8734; MB<br>
        <div class="stats_progress_bar">
        <div class="cpanel_widget_progress_bar" title="0%"
            style="position: relative; width: 100%; height: 100%; padding: 0px; margin: 0px; border: 0px">
        </div>
        <div class="cpanel_widget_progress_bar_percent" style="display: none">0</div>
        </div>
        </td>
    </tr>
</tbody>
  </table>

How can I find the "Disk Space Usage" value using HTML::TreeBuilder? Many of the tds in the markup above share the same class.

1 answer:

Answer 0: (score: 4)

Find the <td> whose content matches, in this case "Disk Space Usage", and then take the next <td>.

Once you have the element tree:

my $usage = $t->look_down(
    _tag => 'td',
    sub {
        $_[0]->as_trimmed_text() =~ /^Disk Space Usage$/
    }
)->right()->as_trimmed_text();

If look_down finds no match it returns undef, and the chained ->right() call then dies, so you may want to wrap this in an eval block.
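As an alternative to eval, the lookup can be split into steps that check each result before using it. The following is a minimal sketch, not part of the original answer: find_value is a hypothetical helper name, and the two-row table is a cut-down stand-in for the real page.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Cut-down sample standing in for the real file (assumption for illustration).
my $html = <<'HTML';
<table id="stats">
<tr><td class="stats_left">Home Directory</td><td class="stats_right">/home/abc</td></tr>
<tr><td class="stats_left">Disk Space Usage</td><td class="stats_right">30.2 / &#8734; MB<br></td></tr>
</table>
HTML

my $t = HTML::TreeBuilder->new_from_content($html);

# Hypothetical helper: returns the trimmed text of the cell to the right
# of the <td> whose text equals $label, or nothing if there is no such cell.
sub find_value {
    my ( $root, $label ) = @_;

    # Scalar context: first matching node, or undef when nothing matches.
    my $heading = $root->look_down(
        _tag => 'td',
        sub { $_[0]->as_trimmed_text() eq $label },
    );
    return unless $heading;

    # Scalar context: the next sibling, or undef when there is none.
    my $value = $heading->right();
    return unless ref $value;    # skip plain-text siblings as well

    return $value->as_trimmed_text();
}

my $home = find_value( $t, 'Home Directory' );
print "Home Directory: $home\n" if defined $home;

my $missing = find_value( $t, 'No Such Row' );
print "No Such Row: not found\n" unless defined $missing;
```

Checking each step keeps a missing row from killing the script and makes it clear which lookup failed.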

The tree navigation methods in HTML::Element are a key part of using HTML::TreeBuilder effectively.


Mohini asked, "Why doesn't this work?"

(formatting added by me)

use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_file( "index.html");
my $disk_value; my $disk_space;

for ( $tree->look_down( _tag => q{tr}, 'class' => 'row-odd' ) ) {

    $disk_space = $tree->look_down(
         _tag => q{td},
         'class' => 'stats_left'
    )->as_trimmed_text;

    if ( $disk_space eq 'Home Directory' ) {
        $disk_value = $tree->look_down( _tag => q{td}, 'class' => 'stats_right' )
                           ->right()
                           ->as_trimmed_text();
    }

}

print STDERR "my home value is $disk_space : $disk_value\n";

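look_down starts at the node you call it on, treats that node as the root, looks down the element tree (these trees grow upside down) and returns either a list of matching nodes or the first matching node, depending on context.

Because all of your look_down calls are made on $tree, every pass through the loop finds the same nodes again.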
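The difference between searching from the root and searching from the current row can be seen in a small experiment. This is a sketch on a three-row sample invented for illustration; the array names @from_root and @from_row are not from the original post.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Three-row sample standing in for the real file (assumption for illustration).
my $html = <<'HTML';
<table>
<tr class="row-even"><td class="stats_left">Main Domain</td><td class="stats_right">myabcab.com</td></tr>
<tr class="row-odd"><td class="stats_left">Home Directory</td><td class="stats_right">/home/abc</td></tr>
<tr class="row-odd"><td class="stats_left">Disk Space Usage</td><td class="stats_right">30.2 MB</td></tr>
</table>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# List context: every matching row.
my @rows = $tree->look_down( _tag => 'tr', class => 'row-odd' );

my ( @from_root, @from_row );
for my $row (@rows) {

    # Searching from $tree ignores $row and finds the first left cell of the whole document.
    my $root_hit = $tree->look_down( _tag => 'td', class => 'stats_left' );
    push @from_root, $root_hit->as_trimmed_text();

    # Searching from $row is limited to that row's subtree.
    my $row_hit = $row->look_down( _tag => 'td', class => 'stats_left' );
    push @from_row, $row_hit->as_trimmed_text();
}

print "from root: @from_root\n";
print "from row:  @from_row\n";
```

With this sample, the root search yields "Main Domain" on both passes, while the per-row search yields "Home Directory" and then "Disk Space Usage".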

Your loop should look something like this:

my %table_stuff;

for my $odd_row ( $tree->look_down( _tag => q{tr}, 'class' => 'row-odd' ) ) {

    my $heading = $odd_row->look_down(
         _tag => q{td},
         'class' => 'stats_left'
    );

    $table_stuff{ $heading->as_trimmed_text() } = $heading->right()->as_trimmed_text();
}

This populates the hash with the heading and value of each odd row in the table.
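The same idea extends to every row, not only the odd ones. The sketch below is one possible variant rather than the answerer's code: it assumes the table can be located by its id of "stats" and that each row holds exactly one label cell followed by one value cell; %stats is a name chosen for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Three-row sample standing in for the real file (assumption for illustration).
my $html = <<'HTML';
<table id="stats">
<tr class="row-even"><td class="stats_left">Main Domain</td><td class="stats_right"><b>myabcab.com</b></td></tr>
<tr class="row-odd"><td class="stats_left">Home Directory</td><td class="stats_right">/home/abc</td></tr>
<tr class="row-even"><td class="stats_left">Last login from</td><td class="stats_right">22.32.232.223</td></tr>
</table>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Narrow the search to the one table we care about.
my $table = $tree->look_down( _tag => 'table', id => 'stats' )
    or die "table#stats not found\n";

my %stats;
for my $row ( $table->look_down( _tag => 'tr' ) ) {

    # List context: all <td> cells in this row, in document order.
    my ( $label, $value ) = $row->look_down( _tag => 'td' );
    next unless $label && $value;

    $stats{ $label->as_trimmed_text() } = $value->as_trimmed_text();
}

print "$_ => $stats{$_}\n" for sort keys %stats;
```

On the real page, a value cell that contains nested elements, such as the progress bar divs, contributes their text to as_trimmed_text() too, so the result may need further cleanup.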

If you only want one value, don't use a loop. look_down already acts as a loop.

my $heading = $tree->look_down(
    _tag => 'td',
    sub {
        $_[0]->as_trimmed_text() =~ /^Home Directory$/
    }
);

my $value = $heading->right();

#  Now $heading and $value have HTML::Element nodes that you can do whatever you want with.

my $disk_value = $value->as_trimmed_text();
my $disk_space = $heading->as_trimmed_text();