Question

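I have a very simple Perl question about pattern matching. I am reading a file that contains a list of names (fileA). I want to check whether any of these names exist in another file (fileB).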

if ($name -e $fileB){
    do something
}else{
    do something else
}

where the condition would check whether the pattern exists in the file. I tried:

open(IN, $controls) or die "Can't open the control file\n";
    while(my $line = <IN>){
            if ($name =~ $line ){
                    print "$name\tfound\n";
            }else{
                    print "$name\tnotFound\n";
            }
    }

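This repeats itself, because it checks and prints once for every line of the file instead of reporting once whether the name exists.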

Answer 1

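To check whether a pattern exists in a file, you have to open the file and read its contents. The fastest way to search when two lists are involved is to store the contents of one of them in a hash: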

#!/usr/bin/perl
use strict;
use warnings;

open my $LST, '<', 'fileA' or die "fileA: $!\n";
open my $FB,  '<', 'fileB' or die "fileB: $!\n";

my %hash;
while (<$FB>) {
    chomp;
    undef $hash{$_};
}

while (<$LST>) {
    chomp;
    if (exists $hash{$_}) {
        print "$_ exists in fileB.\n";
    }
}
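If you also want the found / notFound line from the question, printed exactly once per name, the same hash lookup can be wrapped in a small helper. This is only a minimal sketch, not part of the answer above: the sub name report_names and the sample names are made up for illustration, and in-memory lists stand in for the lines of fileA and fileB.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the names to look up and the lines of the
# file being searched, and returns one "name<TAB>status" string per name.
sub report_names {
    my ($names, $lines) = @_;

    # Build the lookup table once; only the keys matter.
    my %seen;
    $seen{$_} = 1 for @$lines;

    return map { exists $seen{$_} ? "$_\tfound" : "$_\tnotFound" } @$names;
}

# Made-up sample data standing in for fileA and fileB.
my @file_a = ('alice', 'bob', 'carol');
my @file_b = ('carol', 'dave', 'alice');

print "$_\n" for report_names(\@file_a, \@file_b);
```

Because the hash is built before any name is checked, each name produces a single line of output no matter how many lines the searched file has.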

Answer 2

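When you are comparing one list against another, you want a hash. A hash is a keyed array, and the list itself has no order. A hash can contain only a single instance of a particular key (but different keys can hold the same data).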
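A quick way to see the "single instance of a key" point is to load a list with a repeated entry into a hash and look at the keys afterwards. A small sketch with made-up data:

```perl
use strict;
use warnings;

# Made-up list with a repeated entry.
my @names = ('alice', 'bob', 'alice', 'carol');

# Assigning to the same key twice just overwrites the earlier entry.
my %seen;
$seen{$_} = 1 for @names;

# Hash keys come back in no particular order, so sort before printing.
my @unique = sort keys %seen;
print scalar(@unique), " unique names: @unique\n";
```

Four entries go in, but only three keys come out, which is why a hash works well as a membership test.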

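What you can do is go through the first file and create a hash keyed by each line. Then go through the second file and check whether any of its lines matches a key in your hash: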

#! /usr/bin/env perl

use strict;
use warnings;
use feature qw(say);
use autodie;  #You don't have to check if "open" fails.

use constant {
    FIRST_FILE   => 'file1.txt',
    SECOND_FILE  => 'file2.txt',
};
open my $first_fh, "<", FIRST_FILE;

# Get each line as a hash key
my %line_hash;
while ( my $line = <$first_fh> ) {
    chomp $line;
    $line_hash{$line} = 1;
}
close $first_fh;

Now each line is a key in the hash %line_hash. The data stored under it really doesn't matter. What matters is the key itself.

Now that I have a hash of the lines in the first file, I can read the second file and see whether each of its lines exists in my hash:

open my $second_fh, "<", SECOND_FILE;
while ( my $line = <$second_fh> ) {
    chomp $line;
    if ( exists $line_hash{$line} ) {
        say qq(I found "$line" in both files);
    }
}
close $second_fh;

You can also use the map function:

#! /usr/bin/env perl

use strict;
use warnings;
use feature qw(say);
use autodie;  #You don't have to check if "open" fails.

use constant {
    FIRST_FILE   => 'file1.txt',
    SECOND_FILE  => 'file2.txt',
};
open my $first_fh, "<", FIRST_FILE;
chomp ( my @lines = <$first_fh> );

# Get each line as a hash key
my %line_hash = map { $_ => 1 } @lines;
close $first_fh;

open my $second_fh, "<", SECOND_FILE;
while ( my $line = <$second_fh> ) {
    chomp $line;
    if ( exists $line_hash{$line} ) {
        say qq(I found "$line" in both files);
    }
}
close $second_fh;

I'm not a big fan of map, because I don't find it that much more efficient and it is harder to understand what is going on.
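For comparison, here are three ways of building the same lookup hash side by side. This is a sketch with made-up data; the hash slice in the third variant is an additional Perl idiom not used in the answer above, and it stores undef values, so it has to be tested with exists rather than by truth value.

```perl
use strict;
use warnings;

my @lines = ('alice', 'bob', 'carol');   # made-up data

# 1. Explicit loop, as in the first version above.
my %by_loop;
for my $line (@lines) {
    $by_loop{$line} = 1;
}

# 2. map, as in the second version above.
my %by_map = map { $_ => 1 } @lines;

# 3. Hash slice: creates every key at once, each with an undef value.
my %by_slice;
@by_slice{@lines} = ();

# All three hashes end up with the same set of keys.
for my $hash (\%by_loop, \%by_map, \%by_slice) {
    print join(',', sort keys %$hash), "\n";
}
```

Which one to use is mostly a matter of readability; all three support the same exists check.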

Answer 3

I have just sketched out the algorithm as untested code, but I think it will work for you.

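my @a;
my $matched;
my $line;
open(A, "<", "fileA") or die "fileA: $!\n";
open(B, "<", "fileB") or die "fileB: $!\n";
# Read every name from fileA into @a
while(<A>)
{
    chomp;
    push @a,$_;
}
# Check each line of fileB against the names
while(<B>)
{
    chomp;
    $line=$_;
    $matched=0;
    # \Q...\E makes the name match as literal text rather than as a regex;
    # set the flag before leaving the loop
    for(@a){if($line=~/\Q$_\E/){$matched=1;last}}
    if($matched)
    {
        # do something
    }
    else
    {
        # do something else
    }
}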
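One thing to keep in mind with this approach: the regex test in the loop is a substring match, so a name such as ann also matches a line containing joanna, whereas the hash lookups in the other answers only match whole lines. A short sketch of the difference, using made-up data and hypothetical helper names:

```perl
use strict;
use warnings;

# Hypothetical helper: true if any name occurs anywhere inside $line.
sub matches_substring {
    my ($line, @names) = @_;
    for my $name (@names) {
        return 1 if $line =~ /\Q$name\E/;   # \Q...\E: treat the name literally
    }
    return 0;
}

# Hypothetical helper: true only if $line is exactly one of the names.
sub matches_exactly {
    my ($line, @names) = @_;
    my %is_name = map { $_ => 1 } @names;
    return exists $is_name{$line} ? 1 : 0;
}

my @names = ('ann', 'bob');   # made-up data

for my $line ('ann', 'joanna', 'carol') {
    printf "%-8s substring=%d exact=%d\n",
        $line, matches_substring($line, @names), matches_exactly($line, @names);
}
```

Which behaviour is right depends on whether fileB holds one bare name per line or names embedded in longer lines.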

Check whether a pattern exists in a file
