Printing values to both a file and the terminal

Date: 2014-10-27 19:33:04

Tags: perl parsing printing

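Complete beginner here. I am trying to get $cssurl printed to both a file and the terminal, but only one value ends up in the file, while the terminal shows everything. How can I modify the code below to get what I need?

Here is the code: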

use lib '/Users/lialin/perl5/lib/perl5';
use strict;
use warnings;
use feature 'say';
use File::Slurp 'slurp';    # makes it easy to read files.
use Mojo;
use Mojo::UserAgent;
use URI;

my $calls_dir = "Ask/";
opendir( my $search_dir, $calls_dir ) or die "$!\n";
my @html_files = grep /\.html$/i, readdir $search_dir;
closedir $search_dir;
#print "Got ", scalar @files, " files\n";

foreach my $html_files (@html_files) {
    my %seen         = ();
    my $current_file = $calls_dir . $html_files;
    open my $FILE, '<', $current_file or die "$html_files: $!\n";

    my $dom = Mojo::DOM->new( scalar slurp $calls_dir . $html_files );
    print $calls_dir . $html_files;

    for my $csshref ( $dom->find('a[href]')->attr('href')->each ) {
        my $cssurl = URI->new($csshref)->abs( $calls_dir . $html_files );

        open my $fh, '>', "Ask/${html_files}.result.txt" or die $!;
        $fh->print("$html_files\n");
        $fh->print("$cssurl\n");
        #$fh->print("\t"."$_\n");
        print "$cssurl\n";
        #print $file."\t"."$_\n";
    }
}

In the terminal I get this:

http://www.scigene.com/
about 500 of other urls in here that stack overflow won't let me post
http://feedback.ask.com

In the file I get this:

Agilent_Technologies_ask.html
http://feedback.ask.com

So I only get the last line.

1 Answer:

Answer 0 (score: 0)

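Your problem arises because you reopen the same file on every pass through the inner loop, and opening with '>' truncates the file, so each open discards what was written before. Thinking about it logically, you want one output file for each input file you parse, so it is best to create the output file at the point where you read the input file: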

my $dom = Mojo::DOM->new( scalar slurp $calls_dir . $html_files );
open my $fh, '>', "Ask/${html_files}.result.txt" or die $!;

If there is anything that should be printed only once (a file header, etc.), do that before you start looping through the URLs.
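The difference between reopening inside the loop and opening once can be seen in isolation. The following is a minimal sketch, not the asker's script: the URL list is made-up sample data, the scratch file comes from the core File::Temp module, and read_lines is a hypothetical helper introduced only for this demo.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Hypothetical sample data standing in for the extracted URLs.
my @urls = ( 'http://a.example/', 'http://b.example/', 'http://c.example/' );

# Scratch file so the demo does not touch real data.
my ( undef, $path ) = tempfile( UNLINK => 1 );

# Reopening with '>' inside the loop truncates the file on every pass.
for my $url (@urls) {
    open my $fh, '>', $path or die "$path: $!\n";
    print {$fh} "$url\n";
    close $fh or die "$path: $!\n";
}
my @reopened = read_lines($path);    # only the last URL survives

# Opening once before the loop keeps every line.
open my $fh, '>', $path or die "$path: $!\n";
print {$fh} "$_\n" for @urls;
close $fh or die "$path: $!\n";
my @opened_once = read_lines($path);

printf "reopened each time: %d line(s)\n", scalar @reopened;
printf "opened once:        %d line(s)\n", scalar @opened_once;

# Hypothetical helper: read a file back as a list of lines.
sub read_lines {
    my ($file) = @_;
    open my $in, '<', $file or die "$file: $!\n";
    my @lines = <$in>;
    close $in;
    return @lines;
}
```

Opening in append mode ('>>') inside the loop would also keep every line, but it would keep lines from earlier runs of the script as well, which is probably not what is wanted here.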

Your for loop now looks like this:

foreach my $html_files (@html_files) {

    my $dom = Mojo::DOM->new( scalar slurp $calls_dir . $html_files );
    print $calls_dir . $html_files;

    # Open the output file once per input file, before the URL loop,
    # so that '>' truncates it only once.
    open my $fh, '>', "Ask/${html_files}.result.txt" or die $!;
    $fh->print("$html_files\n");    # header line, written once

    for my $csshref ( $dom->find('a[href]')->attr('href')->each ) {
        my $cssurl = URI->new($csshref)->abs( $calls_dir . $html_files );

        $fh->print("$cssurl\n");    # to the file
        print "$cssurl\n";          # to the terminal
    }

    close $fh or die $!;
}
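If the two print statements in the inner loop start to drift apart, one option is to route both through a single helper. This is only a sketch under assumptions: print_both is a hypothetical name that does not appear in the question or in any module, and the demo writes to an in-memory handle instead of the real result file so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: write the same text to every handle passed in.
sub print_both {
    my ( $text, @handles ) = @_;
    print {$_} $text for @handles;
    return;
}

# Demo with an in-memory "file" so the sketch does not touch disk.
my $buffer = '';
open my $fh, '>', \$buffer or die "in-memory handle: $!\n";

# Made-up sample URLs; each one goes to the terminal and to $fh.
print_both( "$_\n", \*STDOUT, $fh ) for qw(http://a.example/ http://b.example/);

close $fh or die "close: $!\n";
```

In the loop above, the call would be print_both( "$cssurl\n", \*STDOUT, $fh ), with $fh still opened once per input file.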