I want to print some dates from a website whose rows have the following structure:
<tr><td><b><a href="/calendar.*?=\w+">(.*?)</a></b></td>
<td align=".*?"/date/(\d+)-(\d+)/">.*?</a> <a href="/year/\d+/">(\d+)</a></td>
<td>(.*?)*</td></tr>
and so on, with the captures assigned like this:
my $country = $1;
my $month = $2;
my $day = $3;
my $year = $4;
my $event = $5;
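I only need to extract the rows where $country is 'USA', but if I use a while statement, the code loops endlessly on the first match. How can I rewrite the script so that it extracts every USA date it finds?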
sub getSpec {
    my $line = shift;
    my $site = getSite($line);
    while ($site =~ s/.../) {
        my $country = $1;
        my $month = $2;
        my $day = $3;
        my $year = $4;
        my $event = $5;
        if ($country =~ /USA/i) {
            print $month.$day.$year.$country.$event."\n";
        }
    }
}
Answer 0 (score: 1)
Answer 1 (score: 0)
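It looks like you are not changing the string after the first match. Try reading $site line by line (it is the HTML of the whole site, right?), so the loop would look something like this (my Perl is a bit rusty and this is only a rough sketch, sorry):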
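foreach my $row (split /\n/, $site) {    # take one line of $site at a time
    if ($row =~ s/.../) {
        # {variables}: copy the captures into $country, $month, $day, $year, $event
        if ($country =~ /USA/i) {
            # other_stuff: print the date
        }
    }
}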
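Another way to make the loop advance, without splitting into lines, is to add the /g modifier. A plain match without /g in a while condition starts again from the beginning of the string on every pass, so it keeps finding the first row. With /g in scalar context, each pass resumes where the previous match ended. The following is only a minimal sketch under assumptions: the sample HTML, the simplified pattern in $row_re, and the helper name usa_dates are all made up for illustration and are not the asker's real page or regex.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for what getSite() returns.
my $sample = <<'HTML';
<tr><td><b><a href="/calendar?c=usa">USA</a></b></td>
<td align="left"><a href="/date/7-4/">July 4</a> <a href="/year/2011/">2011</a></td>
<td>Independence Day</td></tr>
<tr><td><b><a href="/calendar?c=fra">France</a></b></td>
<td align="left"><a href="/date/7-14/">July 14</a> <a href="/year/2011/">2011</a></td>
<td>Bastille Day</td></tr>
<tr><td><b><a href="/calendar?c=usa">USA</a></b></td>
<td align="left"><a href="/date/11-24/">November 24</a> <a href="/year/2011/">2011</a></td>
<td>Thanksgiving</td></tr>
HTML

# Simplified stand-in for the pattern in the question (an assumption).
my $row_re = qr{
    <tr><td><b><a\s+href="/calendar[^"]*">([^<]*)</a></b></td>\s*     # capture 1: country
    <td\s+align="[^"]*"><a\s+href="/date/(\d+)-(\d+)/">[^<]*</a>\s*   # captures 2, 3: month, day
    <a\s+href="/year/\d+/">(\d+)</a></td>\s*                          # capture 4: year
    <td>([^<]*)</td></tr>                                             # capture 5: event
}x;

# Hypothetical helper: returns one formatted string per USA row.
sub usa_dates {
    my ($site) = @_;
    my @found;
    # With /g, each pass resumes after the previous match,
    # so every row is visited once and the loop ends when none are left.
    while ($site =~ /$row_re/g) {
        # Copy the captures first: the next successful match overwrites them.
        my ($country, $month, $day, $year, $event) = ($1, $2, $3, $4, $5);
        next unless $country =~ /USA/i;
        push @found, "$month/$day/$year $country $event";
    }
    return @found;
}

print "$_\n" for usa_dates($sample);
```

For anything beyond a page with a very regular layout, an HTML parser is likely to be more robust than a regex, but for the structure shown in the question the /g loop should be enough.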