Perl: finding mismatched fields across two files and within a single file

Asked: 2016-03-04 12:00:57

Tags: regex perl

I have two txt files:

fileA.txt (tab-delimited)

field1  field2
A       value1
A       value2
A       value3
B       value112
C       value33
D       value11
E       value3
B       value23
E       value5

fileB.txt (tab-delimited)

field1  field2
A       value1
A       value3
M       value9
B       value5
C       value33

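I want the script to report:

  1. Within fileB.txt, any field1 that appears with different field2 values. Here it should report: A value1 and A value3
  2. Any line in fileB.txt whose field2 differs from the field2 values that fileA.txt has for the same field1. Here it should report: B value5
  3. So the script's output should be: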

    A       value1
    A       value3
    B       value5
    

My script for accomplishing #2:

    #!/usr/bin/perl -w
    
    my $fileA="/tmp/fileA.txt";
    my $fileB="/tmp/fileB.txt";
    my @orifields;
    my @tmp;
    my @unmatched;
    
    open(FHD, "$fileB") || die "Unable to open $fileB: $!\n";
    @orifields = <FHD>;
    close(FHD);
    chomp(@orifields);
    
    open(FHD, "$fileA") || die "Unable to open $fileA: $!\n";
    @tmp = <FHD>;
    close(FHD);
    chomp(@tmp);
    foreach my $line (@tmp){
       print("Each line in fileA: $line\n");
    }
    
    foreach my $line (@orifields) {
       my ($field1, $field2) = split(/\t/, $line);
       print("Field1 is: $field1, Field2 is: $field2\n");
    
       if (! grep(/$line/, @tmp)) {
          if (grep(/$field1/,@tmp)) {
             push(@unmatched,"$line");
          }
       }
    }
    
    print("Unmatched: @unmatched\n");
    

Is there a good way to implement both checks in one script without duplicating variables? Thanks in advance.

1 Answer:

Answer 0 (score: 2)

Use hashes to remember the contents of the files:

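    #! /usr/bin/perl
    use warnings;
    use strict;

    # Remember every (field1, field2) pair seen in fileA.txt.
    my %hash_a;
    open my $FA, '<', 'fileA.txt' or die $!;
    while (<$FA>) {
        chomp;
        my ($f1, $f2) = split /\t/;
        undef $hash_a{$f1}{$f2};    # only the key's existence matters
    }


    # Collect the field2 values of fileB.txt per field1, checking #2 on the fly.
    my %hash_b;
    open my $FB, '<', 'fileB.txt' or die $!;
    while (<$FB>) {
        chomp;
        my ($f1, $f2) = split /\t/;
        push @{ $hash_b{$f1} }, $f2;

        # #2: field1 occurs in fileA.txt, but never with this field2.
        if (exists $hash_a{$f1} && ! exists $hash_a{$f1}{$f2}) {
            print "#2: $f1 $f2\n";
        }
    }

    # #1: any field1 that occurs more than once in fileB.txt.
    for my $key (grep @{ $hash_b{$_} } > 1, keys %hash_b) {
        print join(' ', "#1: $key", @{ $hash_b{$key} }), "\n";
    }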
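
The answer prints the two checks with `#1:` and `#2:` prefixes, while the question asks for a single list of `field1 field2` lines. The following is a minimal sketch of one way to get that combined list, not part of the original answer. The helper name `find_mismatches` is hypothetical, and the sketch assumes that rule 1 means "more than one distinct field2 for the same field1" and that each line should be reported at most once, in fileB order. The sample data from the question is inlined so the block runs without any files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): takes two array refs of
# tab-separated "field1\tfield2" lines and returns the mismatching
# fileB lines, each at most once, in fileB order.
sub find_mismatches {
    my ($lines_a, $lines_b) = @_;

    # field1 => { field2 => 1 } for every pair seen in fileA.
    my %seen_a;
    for my $line (@$lines_a) {
        my ($f1, $f2) = split /\t/, $line;
        next unless defined $f2;    # skip blank or malformed lines
        $seen_a{$f1}{$f2} = 1;
    }

    # Parse fileB once, keeping row order and the distinct field2
    # values per field1.
    my (@rows_b, %values_b);
    for my $line (@$lines_b) {
        my ($f1, $f2) = split /\t/, $line;
        next unless defined $f2;
        push @rows_b, [$f1, $f2];
        $values_b{$f1}{$f2} = 1;
    }

    my (@report, %reported);
    for my $row (@rows_b) {
        my ($f1, $f2) = @$row;
        # Rule 1 (assumed meaning): field1 has several distinct field2 values in fileB.
        my $rule1 = scalar(keys %{ $values_b{$f1} }) > 1;
        # Rule 2: field1 is known to fileA, but not with this field2.
        my $rule2 = exists $seen_a{$f1} && !exists $seen_a{$f1}{$f2};
        next unless $rule1 || $rule2;
        push @report, "$f1\t$f2" unless $reported{"$f1\t$f2"}++;
    }
    return @report;
}

# Sample data from the question, header lines included.
my @file_a = ("field1\tfield2", "A\tvalue1", "A\tvalue2", "A\tvalue3",
              "B\tvalue112", "C\tvalue33", "D\tvalue11", "E\tvalue3",
              "B\tvalue23", "E\tvalue5");
my @file_b = ("field1\tfield2", "A\tvalue1", "A\tvalue3", "M\tvalue9",
              "B\tvalue5", "C\tvalue33");

# Prints "A\tvalue1", "A\tvalue3" and "B\tvalue5", one per line.
print "$_\n" for find_mismatches(\@file_a, \@file_b);
```

With real files, the two arrays would instead be filled by reading each file and calling `chomp` on the lines. The lookups are exact hash-key comparisons, so unlike `grep(/$line/, @tmp)` in the question, a value such as `value1` is not treated as a regex and cannot accidentally match inside `value11`.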