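1. Once the else condition (invalid URL) is hit, the loop terminates and the remaining URLs are not processed. 2. Even when the node lookup fails in XPath, nothing is printed to the screen or to the file. I want the node failure printed to both the file and the screen.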
use LWP::Simple;
use File::Compare;
use HTML::TreeBuilder::XPath;
use LWP::UserAgent;
use Win32::Console::ANSI;
use Term::ANSIColor;
sub crawl_content{
    {
        open(FILE, "C:/Users/jeyakuma/Desktop/input/input.txt");
        {
            while(<FILE>){
                chomp;
                $url=$_;
                foreach ($url){
                    ($domain) = $url =~ m|www.([A-Z a-z 0-9]+.{3}).|x;
                }
                do 'C:/Users/jeyakuma/Desktop/perl/mainsub.pl';
                &domain_check();
                my $ua = LWP::UserAgent->new( agent => "Mozilla/5.0" );
                my $req = HTTP::Request->new( GET => "$url" );
                my $res = $ua->request($req);
                if ( $res->is_success ){
                    print "working on $domain\n";
                    binmode ":utf8";
                    my $xp = HTML::TreeBuilder::XPath->new_from_url($url);
                    my @node = $xp->findnodes_as_string("$xpath") or print "couldn't find the node\n" ;
                    open HTML, '>:encoding(cp1252)',"C:/Users/jeyakuma/Desktop/ project/data_$date/$site.html";
                    foreach(<@node>){
                        print HTML @node;
                        close HTML ;
                    }
                }
                else{
                    print color("green"), "$domain Invalid url\n", color("reset") and open FILE,">C:/Users/jeyakuma/Desktop/log.txt"; print FILE " $domain Invalid URL";
                }
            }
        }
    }
}
do 'C:/Users/jeyakuma/Desktop/perl/comparefinal.pl';
compare_result();
}
Answer 0 (score: 2)
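The else branch reopens FILE, this time for writing to a different file. So on the next iteration of the while (<FILE>) loop, Perl tries to read from FILE and fails (because the handle is now open only for writing, not reading), and the loop ends. You need to use a name other than FILE for the handle in the else branch.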
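
To illustrate the point about handle names, here is a minimal sketch that keeps the input and the log on two separate lexical handles. It is not the asker's full crawler: looks_valid is a hypothetical stand-in for the HTTP check, process_urls is a made-up name, and the temporary file paths exist only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the HTTP check in the question (no network access).
sub looks_valid {
    my ($url) = @_;
    return $url =~ m{^https?://}i;
}

# Hypothetical helper: reads URLs from one handle and logs bad ones to another.
sub process_urls {
    my ($input_path, $log_path) = @_;
    my @processed;

    # Two distinct lexical handles: $in is only read from, $log only written to.
    open(my $in,  '<',  $input_path) or die "Cannot read $input_path: $!";
    open(my $log, '>>', $log_path)   or die "Cannot write $log_path: $!";

    while (my $url = <$in>) {
        chomp $url;
        next unless length $url;
        if (looks_valid($url)) {
            push @processed, $url;
        }
        else {
            # Writing to $log leaves $in untouched, so the loop keeps reading.
            print "$url Invalid URL\n";
            print {$log} "$url Invalid URL\n";
        }
    }

    close $log or die "Cannot close $log_path: $!";
    close $in;
    return @processed;
}

# Demonstration with temporary files (assumed paths, not the asker's).
my $dir   = tempdir(CLEANUP => 1);
my $input = "$dir/input.txt";
my $logf  = "$dir/log.txt";

open(my $out, '>', $input) or die "Cannot write $input: $!";
print {$out} "http://www.example.com\nnot-a-url\nhttp://www.example.org\n";
close $out or die "Cannot close $input: $!";

my @ok = process_urls($input, $logf);
print "processed: @ok\n";
```

Opening the log once, in append mode and before the loop, also avoids truncating it every time an invalid URL turns up, which a '>' open inside the else branch would do.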