Question

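I am using the following Perl script to search multiple files and print the entire line of text whenever a specific number matches on that line: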

#!/perl/bin/perl

use strict;
use warnings;

my @files = <c:/perl64/myfiles/*>;

foreach my $file (@files) {
    open my $file_h, '<', $file
        or die "Can't open $file: $!";

    while (<$file_h>) {
        print "$file $_" if /\b1203\b/;
        print "$file $_" if /\b1204\b/;
        print "$file $_" if /\b1207\b/;
    }
}

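The script matches and prints correctly every time one of the numbers is present on a line in one or more files. My problem is that I want to be able to identify when a number is not matched at all in any of the files.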

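We are parsing multiple files with thousands of lines each, so finding the delta (i.e. a number with NO MATCH in any file) by hand is very time-consuming.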

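To clarify, I still need to match and print every time a number is matched in each file, not just once. Printing the line on which it matched is also essential.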

Ultimately, this is just to show whether a number was not matched anywhere in any of the files.

Answer 1

"I want to be able to identify when a number is not matched at all in any of the files"

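Since you are going through multiple files, you need to remember whether you have ever seen a particular number. A counting hash is very useful here and is the usual way to solve this kind of problem.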

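It also makes sense to move the numbers (or patterns) into an array. That way you only have to list them once, and the code as a whole becomes less cluttered.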

my @numbers = (1203, 1204, 1207);
my %seen;
foreach my $file (@files) {
    # ...
    while (<$file_h>) {
        foreach my $number (@numbers) {
            if (/\b$number\b/) {
                print "$file $_";
                $seen{$number} = 1; # we don't care how many, just that we saw it
            }
        }
    }
}

# At this point, %seen contains a key for every number that was seen at least once.
# If a number was not seen, it will not have a key.

# output numbers that were not seen
foreach my $number (@numbers) {
    print "no match: $number\n" unless exists $seen{$number};
}
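
For completeness, here is a self-contained sketch of the same idea that you can run as one script. It is a variation rather than the exact code above: the sub name scan_files is made up for this example, the numbers are joined into a single alternation that is compiled once, and the hash holds real counts (starting at zero) instead of a flag. One behavioural difference to be aware of: a line that contains several of the numbers is printed once here, whereas the loops above print it once per matching number. The directory is the one from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the original post): scans the
# given files, prints every matching line prefixed with its file name, and
# returns a hash mapping each number to the number of lines it appeared on.
sub scan_files {
    my ($numbers, $files) = @_;

    # Start every count at zero so that unmatched numbers still have a key.
    my %count = map { $_ => 0 } @$numbers;

    # Build one alternation for all numbers and compile it once.
    my $alternation = join '|', map { quotemeta($_) } @$numbers;
    my $pattern     = qr/\b($alternation)\b/;

    foreach my $file (@$files) {
        open my $file_h, '<', $file or die "Can't open $file: $!";
        while (my $line = <$file_h>) {
            # Distinct numbers found as whole words on this line.
            my %on_line = map { $_ => 1 } $line =~ /$pattern/g;
            next unless %on_line;

            print "$file $line";
            $count{$_}++ for keys %on_line;
        }
        close $file_h;
    }
    return %count;
}

my @numbers = (1203, 1204, 1207);
my @files   = grep { -f $_ } glob 'c:/perl64/myfiles/*';   # plain files only
my %count   = scan_files(\@numbers, \@files);

# Report every number that never matched in any file.
foreach my $number (@numbers) {
    print "no match: $number\n" if $count{$number} == 0;
}
```

Because the counts are kept, you could also print how many lines each number appeared on if that turns out to be useful later.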
