Question

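I'm trying to select only the .log files in a directory, then search those files for the word "unbound" and print each whole matching line to a new output file that has the same name as the log file (number###.log) but with a .txt extension. This is what I have so far: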

#!/usr/bin/perl

  use strict;
  use warnings;

  my $path = $ARGV[0];
  my $outpath = $ARGV[1];
  my @files;
  my $files;

  opendir(DIR,$path) or die "$!";
  @files = grep { /\.log$/} readdir(DIR);


  my @out;
  my $out;
  opendir(OUT,$outpath) or die "$!";

  my $line;
  foreach $files (@files) {
  open (FILE, "$files");
  my @line = <FILE>;
  my $regex = Unbound;
  open (OUT, ">>$out");
  print grep {$line =~ /$regex/ } <>;
   } 
  close OUT;
  close FILE;

  closedir(DIR);
  closedir (OUT);

I'm a beginner, and I really don't know how to create a new text file from the output I get.

Answer 1

A few things I'd suggest to improve this code:

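Declare the loop iterator in the loop itself: foreach my $file ( @files ) {
Use the 3-argument form of open: open ( my $input_fh, "<", $filename );
Use glob rather than opendir followed by grep: foreach my $file ( <$path/*.log> ) {
grep is good for extracting things into arrays. Your grep reads the whole file in order to print it, which isn't necessary. It doesn't matter much if the files are short.
perltidy is great for reformatting code.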
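You're opening 'OUT' on a directory path (I think?), which isn't going to work.
$outpath is a directory, not a file, so you need to do something different to write to separate output files. opendir isn't valid for output.
Because you're using opendir, you get bare file names back, not full paths, so you may be trying to open the files in the wrong place. Prepending the path to each name, or doing a chdir first, are possible solutions. But this is one of the reasons I like glob: it returns the path as well.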

So with that in mind, how about:

#!/usr/bin/perl

use strict;
use warnings;
use File::Basename;

#Extract paths
my $input_path  = $ARGV[0];
my $output_path = $ARGV[1];

#Error if paths are invalid. 
unless (defined $input_path
    and -d $input_path
    and defined $output_path
    and -d $output_path )
{
    die "Usage: $0 <input_path> <output_path>\n";
}

foreach my $filename (<$input_path/*.log>) {

   # extract the 'name' bit of the filename. 
   # be slightly careful with this - it's based 
   # on an assumption which isn't always true. 
   # File::Spec is a more powerful way of accomplishing this.
   # but should grab 'number####' from /path/to/file/number####.log
   my $output_file = basename ( $filename, '.log' );

   #open input and output filehandles. 
   open( my $input_fh, "<", $filename )
       or die "Cannot read '$filename': $!";
   open( my $output_fh, ">", "$output_path/$output_file.txt" )
       or die "Cannot write '$output_path/$output_file.txt': $!";

   print "Processing $filename -> $output_path/$output_file.txt\n";

   #iterate input, extracting into $line
   while ( my $line = <$input_fh> ) {
        #check if $line matches your RE. 
        if ( $line =~ m/Unbound/ ) {
            #write it to output. 
            print {$output_fh} $line;
        }
   }
   #tidy up our filehandles. Although technically, they'll 
   #close automatically because they leave scope
   close($output_fh);
   close($input_fh);
}
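
The point about opendir returning bare names is easy to check for yourself. Here is a minimal sketch, assuming a scratch directory created with File::Temp and a made-up file name (number001.log); neither comes from the question. It also assumes the temporary path contains no whitespace, since glob splits its pattern on spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory holding one hypothetical log file.
my $dir = tempdir( CLEANUP => 1 );
open( my $fh, '>', "$dir/number001.log" ) or die "Cannot create file: $!";
close($fh) or die "Cannot close file: $!";

# readdir returns bare names, with no directory part.
opendir( my $dh, $dir ) or die "Cannot open directory '$dir': $!";
my @bare = grep { /\.log\z/ } readdir($dh);
closedir($dh);

# glob returns each name with the directory prepended.
my @full = glob("$dir/*.log");

print "readdir: $bare[0]\n";
print "glob:    $full[0]\n";
```

Passing the readdir result straight to open only works if the current directory happens to be the log directory, which is why the script above builds everything from the glob result.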

Answer 2

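Here is a script that makes use of Path::Tiny. At this stage of your learning process, you are probably better off just understanding @Sobrique's solution, but modules such as Path::Tiny or Path::Class make it easier to write scripts like this quickly and correctly.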

Also, I haven't really tested this script, so watch out for bugs.

#!/usr/bin/env perl

use strict;
use warnings;

use Path::Tiny;

run(\@ARGV);

sub run {
    my $argv = shift;
    unless (@$argv == 2) {
        die "Need source and destination paths\n";
    }
    my $it = path($argv->[0])->realpath->iterator({
        recurse => 0,
        follow_symlinks => 0,
    });
    my $outdir = path($argv->[1])->realpath;

    while (my $path = $it->()) {
        next unless -f $path;
        next unless $path =~ /[.]log\z/;

        my $logfh = $path->openr;
        my $outfile = $outdir->child($path->basename('.log') . '.txt');
        my $outfh;

        while (my $line = <$logfh>) {
            next unless $line =~ /Unbound/;
            unless ($outfh) {
                $outfh = $outfile->openw;
            }
            print $outfh $line;
        }
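        # $outfh is only set once a line has matched, so skip the
        # close when this log produced no output file.
        next unless $outfh;
        close $outfh
            or die "Cannot close output '$outfile': $!";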
    }
}

Notes

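realpath will croak on error.
The same goes for openr and openw.
I am reading the input files line by line to keep the program's memory footprint independent of the size of the input files.
I don't open the output file until I know there is a match to print to it.
When matching a file extension with a regular expression pattern, keep in mind that \n is a valid character in Unix file names, and the $ anchor will match just before a trailing one. That is why the script anchors with \z.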
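
The last note can be demonstrated in a few lines. This is a small sketch with a made-up file name that ends in a newline; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical file name whose last character is a newline.
my $name = "number001.log\n";

# $ matches at the end of the string or just before a final newline.
my $dollar = $name =~ /[.]log$/ ? 1 : 0;

# \z matches only at the very end of the string.
my $z = $name =~ /[.]log\z/ ? 1 : 0;

print "\$ anchor matched: $dollar\n";    # prints: $ anchor matched: 1
print "\\z anchor matched: $z\n";        # prints: \z anchor matched: 0
```

With $ the odd file name would be picked up as a log file; with \z it is skipped.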

In Perl, how do I filter all the log files in a directory and extract the interesting lines?
