Question

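I have some HTML files, and I want to extract two tables from each of them. Is it possible to extract both tables in a single pass?

The column headings differ slightly. This script works, but it seems rather long. Is there a way to use 'Schedule Name|Node Name' as the heading for the last column and get both tables in one go? The tables are at depth/count 2.1 and 2.2.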

<code>
#!/usr/bin/perl
use strict;
use warnings;
#use diagnostics;
use HTML::TableExtract;
use Text::Table;

##my $sched = qr/Schedule Name|Node Name/;
my $html = "c:\\Testin.htm";
my $out  = "c:\\Testout.csv";
open( my $ofh, ">", $out ) or die "Cannot open $out: $!";

# First pass: the table whose last column is 'Schedule Name'
my $headers       = [ 'Status', 'Results', 'Schedule Name' ];
my $table_extract = HTML::TableExtract->new( headers => $headers );
my $table_output  = Text::Table->new();
$table_extract->parse_file($html);
my ($table) = $table_extract->tables or die "no emails to process\n";

foreach my $row ( $table->rows ) {
    $table_output->load($row);
    print "   ", join( ',', grep defined, @$row ), "\n";
    print $ofh "   ", join( ',', grep defined, @$row ), "\n";
}

# Second pass: parse the same file again for the 'Node Name' table
$headers       = [ 'Status', 'Results', 'Node Name' ];
$table_extract = HTML::TableExtract->new( headers => $headers );
$table_output  = Text::Table->new();
$table_extract->parse_file($html);
($table) = $table_extract->tables or die "no 'Node Name' table found\n";

foreach my $row ( $table->rows ) {
    $table_output->load($row);
    print "   ", join( ',', grep defined, @$row ), "\n";
    print $ofh "   ", join( ',', grep defined, @$row ), "\n";
}

<code>

Answer 1

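You don't say what you mean by "name" (an HTML <table> element can't have a name attribute), but if the two tables' headers are as shown in your code, you can simply write

my $table_extract = HTML::TableExtract->new(headers => [qw/ status Results Name /]);

The headers will match if they contain any of the strings in the headers array. The match is also case-insensitive.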
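As a rough sketch of how that single extractor could handle both tables in one parse, the example below loops over every table that matched and writes the rows as comma-separated lines. The inline HTML, the cell values, and the combined 'Schedule Name|Node Name' heading are assumptions made up for illustration; the real files would be read with parse_file as in the question.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical stand-in for the real report file (layout and values are assumed).
my $html = <<'HTML';
<table>
  <tr><th>Status</th><th>Results</th><th>Schedule Name</th></tr>
  <tr><td>Completed</td><td>0</td><td>DAILY_BACKUP</td></tr>
</table>
<table>
  <tr><th>Status</th><th>Results</th><th>Node Name</th></tr>
  <tr><td>Missed</td><td>1</td><td>NODE_A</td></tr>
</table>
HTML

# 'Name' is a substring of both 'Schedule Name' and 'Node Name',
# so one extractor selects both tables.
my $te = HTML::TableExtract->new( headers => [qw/ Status Results Name /] );
$te->parse($html);

# One shared heading line, then the rows of every matching table.
my @lines = ('Status,Results,Schedule Name|Node Name');
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        push @lines, join ',', grep defined, @$row;
    }
}

print "$_\n" for @lines;
```

To write the CSV file instead of printing to the screen, the final print can be pointed at an output filehandle opened as in the question.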

Extract multiple HTML tables by table name using Perl
