Question

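I have been trying to compare the lines of two files and find the lines that are the same.

For some reason, the code below only ever goes through the first line of 'text1.txt' and prints the message in the 'if' statement, regardless of whether the two variables match.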

Thanks

use strict;
open( <FILE1>, "<text1.txt" );
open( <FILE2>, "<text2.txt" );
foreach my $first_file (<FILE1>) {
    foreach my $second_file (<FILE2>) {
        if ( $second_file == $first_file ) {
            print "Got a match - $second_file + $first_file";
        }
    }
}
close(FILE1);
close(FILE2);

Answer 1

If you are comparing strings, use the eq operator. "==" compares its arguments numerically.
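To show why the `if` in the question fires for every pair of lines, here is a minimal sketch. The helper names `numeric_equal` and `string_equal` and the sample strings are made up for illustration. Under `==`, text that does not look like a number is converted to 0, so two unrelated lines compare as equal.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two values numerically, as '==' does.
sub numeric_equal {
    my ($left, $right) = @_;
    # Non-numeric strings become 0 here. Perl normally warns about this
    # ("isn't numeric"), so the warning is silenced only for the demo.
    no warnings 'numeric';
    return $left == $right ? 1 : 0;
}

# Hypothetical helper: compare two values as strings, as 'eq' does.
sub string_equal {
    my ($left, $right) = @_;
    return $left eq $right ? 1 : 0;
}

# Sample lines chosen only for illustration.
my $first  = "apple\n";
my $second = "banana\n";

print "numeric comparison says equal\n" if numeric_equal($first, $second);
print "string comparison says different\n" unless string_equal($first, $second);
```

With `use warnings` enabled and without the `no warnings 'numeric'` line, Perl reports the non-numeric arguments, which is a quick way to spot this mistake.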

Answer 2

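A better and faster (but less memory-efficient) approach is to read one file into a hash and then look up the lines of the other file in that hash. This way, you only go through each file once.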

# This will find matching lines in two files and
# print each matching line and its line number in each file.

use strict;
use warnings;

open (my $fh1, '<', 'text1.txt') or die "can't open file text1.txt: $!\n";
my %file_1_hash;
my $line;
my $line_counter = 0;

# read the 1st file into a hash (line text => line number)
while ($line = <$fh1>){
  chomp ($line); # strip the trailing newline
  $line_counter++;
  if ($line !~ m/^\s*$/){ # skip blank lines
    $file_1_hash{$line} = $line_counter;
  }
}
close ($fh1);

# read and compare the second file
open (my $fh2, '<', 'text2.txt') or die "can't open file text2.txt: $!\n";
$line_counter = 0;
while ($line = <$fh2>){
  $line_counter++;
  chomp ($line);
  if (exists $file_1_hash{$line}){
    print "Got a match: \"$line\" in line #$line_counter in text2.txt and line #$file_1_hash{$line} in text1.txt\n";
  }
}
close ($fh2);

Answer 3

If your files are not too large, here is one way to get the job done.

#!/usr/bin/perl
use Modern::Perl;
use File::Slurp qw(slurp);
use Array::Utils qw(:all);
use Data::Dumper;

# read entire files into arrays
my @file1 = slurp('file1');
my @file2 = slurp('file2');

# get the common lines from the 2 files
my @intersect = intersect(@file1, @file2);

say Dumper \@intersect;

Answer 4

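You have to re-open file 2 or reset its file pointer. Move the open and close commands inside the loop.

Depending on the file and line sizes, a more efficient way is to loop through each file only once: save every line that occurs in file 1 in a hash, then check, for each line in file 2, whether it is in that hash.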
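As a rough sketch of the "reset the pointer" option, the nested loops can rewind the second handle with `seek` before each inner pass. In the question's code, `foreach (<FILE2>)` reads the whole second file during the first outer iteration, so later iterations find nothing left to read. The sub name `matches_by_rewinding` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line of $path1 that also appears in
# $path2, using nested loops and rewinding the second file each time.
sub matches_by_rewinding {
    my ($path1, $path2) = @_;
    open(my $fh1, '<', $path1) or die "Cannot open $path1: $!";
    open(my $fh2, '<', $path2) or die "Cannot open $path2: $!";

    my @matches;
    while (defined(my $line1 = <$fh1>)) {
        chomp $line1;
        # Move the read position of file 2 back to the start.
        seek($fh2, 0, 0) or die "Cannot rewind $path2: $!";
        while (defined(my $line2 = <$fh2>)) {
            chomp $line2;
            if ($line1 eq $line2) {
                push @matches, $line1;
                last;    # one match per line of file 1 is enough
            }
        }
    }
    close($fh1);
    close($fh2);
    return @matches;
}

# Usage: perl script.pl text1.txt text2.txt
if (@ARGV == 2) {
    print "Got a match - $_\n" for matches_by_rewinding(@ARGV);
}
```

This still rereads file 2 once per line of file 1, so it only fixes the bug and does nothing for performance.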

Answer 5

If you want the number of matching lines,

my $count=`grep -f [FILE1PATH] -c [FILE2PATH]`;

If you want the matching lines,

my @lines=`grep -f [FILE1PATH]  [FILE2PATH]`;

If you want the lines that do not match,

my @lines = `grep -f [FILE1PATH] -v [FILE2PATH]`;
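One caveat with the commands above: `grep -f` treats every line of the first file as a regular expression and also matches substrings. The sketch below assumes a POSIX `grep` on the PATH and adds `-F` (fixed strings) and `-x` (whole-line match) so that only identical lines count. The sub name `grep_exact_matches` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: lines of $file2 that exactly equal a line of $file1.
sub grep_exact_matches {
    my ($file1, $file2) = @_;
    # The list form of a pipe open runs grep directly, without a shell,
    # so file names with spaces or metacharacters need no quoting.
    open(my $pipe, '-|', 'grep', '-F', '-x', '-f', $file1, '--', $file2)
        or die "Cannot run grep: $!";
    my @lines = <$pipe>;
    close($pipe);    # grep exits with status 1 when nothing matched
    chomp @lines;
    return @lines;
}

# Usage: perl script.pl text1.txt text2.txt
if (@ARGV == 2) {
    print "$_\n" for grep_exact_matches(@ARGV);
}
```

Without `-F -x`, a line such as `a.c` in the first file would also match `abc` or `xa.cx` in the second.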

Answer 6

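This is a script I wrote that checks whether two files are identical, although it could easily be modified by playing with the code and switching the comparison to eq. As Tim suggested, using a hash would probably be more efficient, although you could not ensure the files were compared in insertion order without using a CPAN module (and, as you can see, this method should really use two loops, but it was sufficient for my purposes). It is not exactly the greatest script ever, but it may give you somewhere to start.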


use strict;
use warnings;

# read each file into an array, one element per line
open (my $fh, '<', 'orig.txt') or die "Unable to open first file.\n";
my @data1 = <$fh>;
close($fh);

open ($fh, '<', '2.txt') or die "Unable to open second file.\n";
my @data2 = <$fh>;
close($fh);

# compare line by line, ignoring trailing whitespace
# (only the lines present in the first file are checked)
for (my $i = 0; $i < @data1; $i++){
    $data1[$i] =~ s/\s+$//;
    $data2[$i] =~ s/\s+$//;
    if ($data1[$i] ne $data2[$i]){
        print "Failure to match at line ". ($i + 1) . "\n";
        print "$data1[$i]\n";
        print "Doesn't match:\n";
        print $data2[$i];
        print "\nProgram Aborted!\n";
        exit;
    }
}

print "\nThe files are identical. \n";

Answer 7

Taking the code you posted and turning it into actual Perl code, this is what I came up with.

use strict;
use warnings;
use autodie;

open my $fh1, '<', 'text1.txt';
open my $fh2, '<', 'text2.txt';

while(
  defined( my $line1 = <$fh1> )
  and
  defined( my $line2 = <$fh2> )
){
  chomp $line1;
  chomp $line2;

  if( $line1 eq $line2 ){
    print "Got a match - $line1\n";
  }else{
    print "Lines don't match: $line1 / $line2\n";
  }
}

close $fh1;
close $fh2;

Now, what you probably really want is a diff of the two files, which is best left to Text::Diff.

use strict;
use warnings;

use Text::Diff;

print diff 'text1.txt', 'text2.txt';

Comparing lines in a file with Perl
